UB CSE 555 - String Matching - D2457488

Home> Schools> University at Buffalo, The State University of New York> Computer Science & Engineering (CSE) > CSE 555> String Matching

UB CSE 555 - String Matching

School name University at Buffalo, The State University of New York

Course Cse 555- Introduction to Pattern Recognition

Pages 16

Download Save

Unformatted text preview:

String MatchingWhether a candidate x is a factor of textDefinitions• Length[x] = |x|• |text| >> |x|• Alphabet A–Binary, A ={0,1}– Decimal, A ={0,1,2..,9}– English Letters, A ={a,b,c..,z}– DNA bases, A ={A,G,C,T}Naïve String MatchingNaïve String Matching• O(n-m+1) in the worst case: matches at very end• Random x and test means algorithm is efficient• Information from one candidate shift is not exploited in subsequent candidate shiftBoyer Moore Algorithm• Resembles Naïve String Matching• Except matching is done right-to-left• Shifts are done in larger increments• Good suffix heuristic• Bad character heuristicBoyer-MooreBad CharacterGood SuffixBoyer-Moore principle• Good Suffix heuristic and Bad Character heuristic operate independently and in parallel• After mismatch is detected, each proposes an amount by which s can be safely increased without missing a valid shift• Larger of these shifts is selectedBoyer-Moore Implementation• Good Suffix Function–G(x) is a table that for eeach suffix gives the second rightmost occurrence in x– For “estimates” s has value 2, es has value 1• Last occurrence Function– Table containing every letter of alphabet and position of its rightmost occurrence in x– For “estimates”, a=6, b=c=d=0, e=8, etc.Edit Distance• Not clear whether “abbccc” is closer to “aabbcc” or to “abbcccb”• Edit distance is the number of fundamental operations required to transform one string to the otherCost MatrixC[i, j] = cost of transforming x to yEdit Distance (excused, exhausted) = C[i = 7, j = 9] = 3String Matching with ErrorsString Matching with Don’t

View Full Document

UB CSE 555 - String Matching

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

UB CSE 555 - String Matching

Sign up for free to view:

Please select your school