Purdue CS 63600 - Regular Expression Searches

Unformatted text preview:

CS 636 Internetworking CS#636#Internetworking#ROUTER#ALGORITHMICS#Lecture#20#Regular#Expression#Searches#1Regular#expression#matching#• DeterminisFc#and#non‐determinisFc#finite#automata#(DFAs#and#NFAs)#can#match#regular#expressions#– NFAs#more#compact#but#require#backtracking#or#keeping#track#of#sets#of#states#during#matching#– Both#representaFons#used#in#hardware#and#soRware#soluFons,#but#only#DFA#based#soluFons#can#guarantee#throughput#in#soRware#• DFAs#have#a#state#space#explosion#problem#– From#DFAs#recognizing#individual#signatures#we#can#build#a#DFA#that#recognizes#enFre#signature#set#in#a#single#pass#– Size#of#combined#DFA#much#larger#than#sum#of#sizes#for#DFAs#recognizing#individual#signatures#– MulFple#combined#DFAs#are#used#to#match#signature#set##2 CS 636 InternetworkingRepresenFng#DFAs# How to represent DFAs more compactly? » Can’t reduce number of states » How about reducing number of transitions? – 256 transitions per state – 50+ distinct transitions per state (real world datasets) – Need at least 50+ words per state Three rules a+, b+c, c*d+ 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 4 transitions per state Look at state pairs: there are many common transitions. How to remove them?IntroducFon#to#Our#Approach# How to represent DFAs more compactly? » Can’t reduce number of states » How about reducing number of transitions? – 256 transitions per state – 50+ distinct transitions per state (real world datasets) – Need at least 50+ words per state Three rules a+, b+c, c*d+ 1 3 a a a b b 2 5 4 c b b c d d d c 4 transitions per state Alternative Representation d c a b d c a 1 3 a a a b b 2 5 4 c b b c d d d c d c a b d c a Fewer transitions, less memoryD2FA#OperaFon#1 3 a a a b b 2 5 4 c b b c d d d c d c a b d c a 1 3 a 2 5 4 c c b d Input stream: a b d DFA and D2FA visits the same accepting state after consuming a character Heavy edges are called default transitions Take default transitions, whenever, a labeled transition is missing DFA D2FAD2FA#OperaFon#1 3 a a a b b 2 5 4 c b b c d d d c d c a b d c a 1 3 a 2 5 4 c c b d Any set of default transitions will suffice if there are no cycles of default transitions Thus, we need to construct trees of default transitions So, how to construct space efficient D2FAs? while keeping default paths bounded 2 1 3 4 d c b 2 1 3 4 c b d a 5 5 a c c Above two set of default transitions trees are also correct However, we may traverse 2 default transitions to consume a character Thus, we need to do more work => lower performanceD2FA#ConstrucFon# Present systematic approach to construct D2FA  Begin with a state minimized DFA  Construct space reduction graph » Undirected graph, vertices are states of DFA » Edges exist between vertices with common transitions » Weight of an edge = # of common transitions - 1 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 2 1 3 4 5 3 3 3 2 3 2 2 2 3 3D2FA#ConstrucFon# Convert certain edges into default transitions » A default transition reduces w transitions (w = wt. of edge) » If we pick high weight edges => more space reduction » Find maximum weight spanning forest » Tree edges becomes the default transitions  Problem: spanning tree may have very large diameter » Longer default paths => lower performance 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 2 1 3 4 5 3 3 3 2 3 2 2 2 3 3 # of transitions removed = 2+3+3+3=11 rootD2FA#ConstrucFon# We need to construct bounded diameter trees » NP-hard » Small diameter bound leads to low trees weight – Less space efficient D2FA » Time-space trade-off  We propose heuristic algorithm based upon Kruskal’s algorithm to create compact bounded diameter D2FAs 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 2 1 3 4 5 3 3 3 2 3 2 2 2 3


View Full Document

Purdue CS 63600 - Regular Expression Searches

Download Regular Expression Searches
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Regular Expression Searches and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Regular Expression Searches 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?