Searching StringsThe BasicsSimple MethodSimple Method ExampleSimple Method ExampleSlide 7Eventually…Worst Case ScenarioBoyer-Moore AlgorithmBoyer-Moore ExampleSlide 12Slide 13Slide 14Worst CaseImprovementsCalculating the First TableTable 1 ExampleCalculating the Second TableTable 2 ExampleMore On Table 2Pseudo CodeHow About That Worst Case?Summary of Boyer MooreTable 1 vs. Table 2Applications to Computer ScienceReferencesHomework QuestionsBrandon OchsPattern: Smaller string that we are looking forString: Larger than the pattern; the part being searchedTraverse the string from left to rightCompare pattern to left most part of the stringIf a match is found, compare with the pattenIf no match is found shift the pattern right one unitIf we run out of space, there is no matchWe want to find the word base in this sentenceALL YOUR BASE ARE BELONG TO USBASELeft Most Character = “B”ALL YOUR BASE ARE BELONG TO US BASELeft Most Character =“L”ALL YOUR BASE ARE BELONG TO US BASELeft Most Character = “L”ALL YOUR BASE ARE BELONG TO US BASEO(m*n) where m is the length of the pattern and n is the length of the stringString: AAAAAAAAAAAAABPattern: AAABTotal of 44 comparisonsCompare to pattern from right to leftIf the string’s character being matched is not in the pattern, shift by the pattern lengthALL YOUR BASE ARE BELONG TO USBASERight Most Character = “ “Shift 4ALL YOUR BASE ARE BELONG TO US BASERight Most Character = “R”Shift 4ALL YOUR BASE ARE BELONG TO US BASERight Most Character = “S”Shift 1ALL YOUR BASE ARE BELONG TO US BASERight Most Character = “E”Match FoundString: AAAAAAAAAAAAABPattern: AAABTotal of 24 comparisonsCreate a table which contains the distance from each unique character in the pattern, starting from the right.Create a second table, which will indicate the rightmost recurrence of each possible terminal portion of the pattern.Start at the last character of the pattern and move towards the first character. If the current character is not in the table already, add it.The Shift value is its distance from the rightmost character. All other characters receive a count equal to the length of the pattern.Pattern= “Banana”Character::: Shift::: ::: A::: ::: 0::: ::: N :: ::: 1::: ::: B::: ::: 5:Other::: ::: 6Traverse the pattern and create sub-strings for each value of i less than the total length.Calculate the sub-string consisting of the last i characters preceded by a mis-match for the character before it (meaning anything but the current character).Pattern= “Banana”i Character Shift0 ~A 11 ~NA 42 ~ANA 63 ~NANA 24 ~ANANA 65 ~BANANA 6Remember, we are moving the Pattern, not the substring.Example:~ABA --ABABABABAMatch is found after 2 shiftsi<-widthwhile string remains{j <- widthif j == 0 then print “Match at” i + 1{if string(i) == pattern(j){then j <- j – 1 i <- j – 1}//ifi <- i + max (table1(string(i)), table2(j))}//if}//whileString: AAAAAAAAAAAAABPattern: AAABTable1 Table2B 0 0 ~B 1A 1 1 ~AB 4Other: 4 2 ~AAB 43 ~AAAB 4Total of 14 comparisonsPreprocesses the patternTime Complexity is “Sub-lilnear” because it doesn’t need to check every character of the string to be searched. Requires fewer than (i + width) instructionsThe algorithm improves in efficiency as the pattern becomes longerTable 1 is more useful with large alphabets and small patternsTable 2 is more useful for small alphabets and large patternsReally only useful for one thing…Searching text!!Text based searching is very important in many applications and web pagesDewdney, A.K. The (New) Turing Omnibus. New York: Henry Holt and Company, 1993.R.S. Boyer, J.S. Moore: A Fast String Searching Algorithm. Communications of the ACM, 20, 10, 762-772 (1977)1) Construct one of the two types of shift tables for the pattern “PICNIC”2) List the number of comparisons required for the simple algorithm to find the pattern “AAB” in the string
View Full Document