MSU ECE 480 - C++ Text Searching Algorithms


C++ Text Searching Algorithms
Developing powerful and efficient text searching techniques in C++

Trieu Nguyen
ECE 480 Design Team 1
4/2/2010

Table of Contents
ABSTRACT
KEYWORDS
OBJECTIVE
INTRODUCTION
TEXT PARSING
TEXT SEARCHING
A. SIMPLE SEARCHING
B. PARTIAL SEARCHING
C. COMPLEX SEARCHING
CONCLUSION
REFERENCES

Abstract
Text parsing and searching are problems which programmers are always passively solving. Most user-level applications will at some point require at least some basic form of text manipulation. Even if this is only a small part of the overall solution, it can often be very time consuming and frustrating. It is common practice to develop an ad hoc solution for each application in order to save time, but creating a generic solution will help out more in the future. The C++ standard library has several classes which can help with simple functionality; however, not all of this functionality is intuitive, and it can cause ambiguity for simple problems.

Keywords
C++, String, Search, Text, Algorithm, Tokenize, Programming, Parse

Objective
This application note will cover several simple algorithms which deal with text searching and parsing. This document will take several incremental steps to achieving successful text searching algorithms. Not only are these algorithms simple and powerful, but most importantly, they were created to solve a more general problem and can be reused with minor modifications.

Introduction
First we will consider a simple C++ character array: “This is an array.” From a coding perspective, the example is rather simple and intuitive. It’s easy to see that we can just use array indexing if we want to modify any of the letters in our array. What if you wanted to remove all the whitespaces? This may require a little more work, but is still pretty simple for this particular example.
We can accomplish this by constructing a new array in which every time we run into a whitespace, we shift all the contents to the left. Programming a procedure to do this wouldn’t be too difficult.

How about matching two arrays or searching for a word within an array? Now the problem becomes much more complex. You can match two arrays by iterating through both arrays and using a boolean operator to compare every index. However, this is not a very versatile technique, because it may produce undesirable results if the second array is slightly different.
Consider two arrays that are similar except for one minor difference: the first element in the second array is a lower case ‘t’ and the corresponding index in the first array is an upper case ‘T’. Using a boolean operator on this index would return false because the decimal representation of ‘t’ is 116 and for ‘T’ is 84.

The problem becomes even more complex if we want to search for a particular word within an array. If you wanted to search for the word “is”, you could find all the cases where an index matches ‘i’ and the succeeding index matches ‘s’. We can already see a problem with this technique. The first result is not a word by itself. It is a part of another word (in “This is an array.”, the “is” inside “This”). The algorithm would have to be modified to also verify that the preceding and succeeding indexes are whitespaces. This won’t work if the word is at the end or the beginning of the array, though. All these edge cases are what turn simple problems into something much more complex.
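One way to handle the edge cases described above is sketched below. It is an illustration under stated assumptions, not the algorithm the note goes on to develop: the names `same_letter` and `find_whole_word` are hypothetical, `std::string` is used in place of a raw character array for brevity, and the character codes in the comments assume an ASCII-compatible encoding.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: compares two characters while ignoring letter case.
// With ==, 't' (116) and 'T' (84) are different; after lowering both they match.
bool same_letter(char a, char b) {
    return std::tolower(static_cast<unsigned char>(a)) ==
           std::tolower(static_cast<unsigned char>(b));
}

// Hypothetical helper: returns the start index of every whole-word occurrence
// of `word` in `text`. A match counts only if the character before it and the
// character after it are whitespace, or if the match touches the beginning or
// the end of the text.
std::vector<std::size_t> find_whole_word(const std::string& text,
                                         const std::string& word) {
    std::vector<std::size_t> hits;
    if (word.empty()) {
        return hits;
    }
    std::size_t pos = text.find(word);
    while (pos != std::string::npos) {
        const std::size_t end = pos + word.size();
        const bool left_ok =
            pos == 0 ||
            std::isspace(static_cast<unsigned char>(text[pos - 1]));
        const bool right_ok =
            end == text.size() ||
            std::isspace(static_cast<unsigned char>(text[end]));
        if (left_ok && right_ok) {
            hits.push_back(pos);
        }
        pos = text.find(word, pos + 1);  // keep scanning after this candidate
    }
    return hits;
}
```

For “This is an array.” and the word “is”, the candidate inside “This” is rejected because the character before it is a letter, and only the standalone word is reported. Because the boundary test follows the whitespace rule from the text, a word followed directly by punctuation, such as “array” in “array.”, is not reported; treating punctuation as a boundary would be a further extension.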

