Knuth-Morris-PrattSlide 2Slide 3Slide 4Slide 5Slide 6Slide 7Slide 8Slide 9Slide 10Slide 11Slide 12Slide 13Slide 14FSA RepresentationKMP AlgorithmFSA Construction for KMPSlide 18Slide 19Slide 20Slide 21Slide 22Slide 23Slide 24Slide 25Slide 26Slide 27Slide 28Slide 29Slide 30Slide 31Princeton University • COS 226 • Algorithms and Data Structures • Spring 2003 • http://www.Princeton.EDU/~cs226Knuth-Morris-PrattReference: Chapter 19, Algorithms in C by R. Sedgewick.Addison Wesley, 1990.2Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.a a b a a aSearch Pattern3 4a a5 6a0 1a a2baccept state3Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.a a b a a aSearch Pattern3 4a a5 6a0 1a a2bbbbbbaaccept state4Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state5Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state6Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state7Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state8Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state9Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state10Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state11Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state12Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state13Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a a b a aSearch Textb a a a baccept state14Knuth-Morris-PrattKMP algorithm.Use knowledge of how search pattern repeats itself.Build FSA from pattern.Run FSA on text.3 4a a5 6a0 1a a2bbbbbbaa a b a a aSearch Patterna a b a a aSearch Textb a a a baccept state15FSA RepresentationFSA used in KMP has special property.Upon character match, go forward one state.Only need to keep track of where to go upon character mismatch.–go to state next[j] if character mismatches in state jb 0a 10 1 2 3 4 50232040536a a b a a aSearch Patternnext 0 0 2 0 0 33 4a a5 6a0 1a a2bbbbbbaaccept state16KMP AlgorithmTwo key differences from brute force.Text pointer i never backs up.Need to precompute next[] table.int kmpsearch(char p[], char t[], int next[]) { int i, j = 0; int M = strlen(p); // pattern length int N = strlen(t); // text length for (i = 0; i < N; i++) { if (t[i] == p[j]) j++; // char match else j = next[j]; // char mismatch if (j == M) return i – M + 1; // found } return N; // not found} KMP String Search17FSA Construction for KMPFSA construction for KMP.FSA builds itself!Example. Building FSA for aabaaabb.State 6. p[0..5] = aabaaa–assume you know state for p[1..5] = abaaa X = 2 –if next char is b (match): go forward 6 + 1 = 7–if next char is a (mismatch): go to state for abaaaa X + 'a' = 2–update X to state for p[1..6] = abaaab X + 'b' = 33 4a a5 6a0 1a a2bbbbbba18FSA Construction for KMPFSA construction for KMP.FSA builds itself!Example. Building FSA for aabaaabb.3 4a a5 6a0 1a a2bbbbbba7ab19FSA Construction for KMPFSA construction for KMP.FSA builds itself!Example. Building FSA for aabaaabb.State 7. p[0..6] = aabaaab–assume you know state for p[1..6] = abaaab X = 3 –if next char is b (match): go forward 7 + 1 = 8–next char is a (mismatch): go to state for abaaaba X + 'a' = 4–update X to state for p[1..7] = abaaabb X + 'b' = 03 4a a5 6a0 1a a2bbbbbba7ab20FSA Construction for KMPFSA construction for KMP.FSA builds itself!Example. Building FSA for aabaaabb.3 4a a5 6a0 1a a2bb78bbbabbaba21FSA Construction for KMPFSA construction for KMP.FSA builds itself!Crucial insight.To compute transitions for state n of FSA, suffices to have:–FSA for states 0 to n-1–state X that FSA ends up in with input p[1..n-1]To compute state X' that FSA ends up in with input p[1..n], it suffices to have:–FSA for states 0 to n-1–state X that FSA ends up in with input p[1..n-1]22FSA Construction for KMP0 1abbaXa a b a a a b bSearch Pattern pattern[1..j]j23FSA Construction for KMP0 1abb 0a 10a a b a a a b bSearch Pattern X00pattern[1..j]j next024FSA Construction for KMP0 1a a2bbb 0a 10 102a a b a a a b bSearch Pattern X0a 110pattern[1..j]j next0025FSA Construction for KMP30 1a a2bbbab 0a 10 1 20232a a b a a a b bSearch Pattern X0a b 0a 1210pattern[1..j]j next02026FSA Construction for KMP3 4a0 1a a2bbbbab 0a 10 1 2 3023204a a b a a a b bSearch Pattern X0a b 0a b a 1a 12130pattern[1..j]j next020027FSA Construction for KMP3 4a a50 1a a2bbbbbab 0a 10 1 2 3 402320405a a b a a a b bSearch Patterna b a aX20a b 0a b a 1a 121430pattern[1..j]j next0020028FSA Construction for KMP3 4a a5 6a0 1a a2bbbbbbab 0a 10 1 2 3 4 50232040536a a b a a a b bSearch Patterna b a aX2a b a a a 20a b 0a b a
View Full Document