CS 636 Internetworking CS#636#Internetworking#ROUTER#ALGORITHMICS#Lecture#20#Regular#Expression#Searches#1Regular#expression#matching#• DeterminisFc#and#non‐determinisFc#finite#automata#(DFAs#and#NFAs)#can#match#regular#expressions#– NFAs#more#compact#but#require#backtracking#or#keeping#track#of#sets#of#states#during#matching#– Both#representaFons#used#in#hardware#and#soRware#soluFons,#but#only#DFA#based#soluFons#can#guarantee#throughput#in#soRware#• DFAs#have#a#state#space#explosion#problem#– From#DFAs#recognizing#individual#signatures#we#can#build#a#DFA#that#recognizes#enFre#signature#set#in#a#single#pass#– Size#of#combined#DFA#much#larger#than#sum#of#sizes#for#DFAs#recognizing#individual#signatures#– MulFple#combined#DFAs#are#used#to#match#signature#set##2 CS 636 InternetworkingRepresenFng#DFAs# How to represent DFAs more compactly? » Can’t reduce number of states » How about reducing number of transitions? – 256 transitions per state – 50+ distinct transitions per state (real world datasets) – Need at least 50+ words per state Three rules a+, b+c, c*d+ 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 4 transitions per state Look at state pairs: there are many common transitions. How to remove them?IntroducFon#to#Our#Approach# How to represent DFAs more compactly? » Can’t reduce number of states » How about reducing number of transitions? – 256 transitions per state – 50+ distinct transitions per state (real world datasets) – Need at least 50+ words per state Three rules a+, b+c, c*d+ 1 3 a a a b b 2 5 4 c b b c d d d c 4 transitions per state Alternative Representation d c a b d c a 1 3 a a a b b 2 5 4 c b b c d d d c d c a b d c a Fewer transitions, less memoryD2FA#OperaFon#1 3 a a a b b 2 5 4 c b b c d d d c d c a b d c a 1 3 a 2 5 4 c c b d Input stream: a b d DFA and D2FA visits the same accepting state after consuming a character Heavy edges are called default transitions Take default transitions, whenever, a labeled transition is missing DFA D2FAD2FA#OperaFon#1 3 a a a b b 2 5 4 c b b c d d d c d c a b d c a 1 3 a 2 5 4 c c b d Any set of default transitions will suffice if there are no cycles of default transitions Thus, we need to construct trees of default transitions So, how to construct space efficient D2FAs? while keeping default paths bounded 2 1 3 4 d c b 2 1 3 4 c b d a 5 5 a c c Above two set of default transitions trees are also correct However, we may traverse 2 default transitions to consume a character Thus, we need to do more work => lower performanceD2FA#ConstrucFon# Present systematic approach to construct D2FA Begin with a state minimized DFA Construct space reduction graph » Undirected graph, vertices are states of DFA » Edges exist between vertices with common transitions » Weight of an edge = # of common transitions - 1 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 2 1 3 4 5 3 3 3 2 3 2 2 2 3 3D2FA#ConstrucFon# Convert certain edges into default transitions » A default transition reduces w transitions (w = wt. of edge) » If we pick high weight edges => more space reduction » Find maximum weight spanning forest » Tree edges becomes the default transitions Problem: spanning tree may have very large diameter » Longer default paths => lower performance 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 2 1 3 4 5 3 3 3 2 3 2 2 2 3 3 # of transitions removed = 2+3+3+3=11 rootD2FA#ConstrucFon# We need to construct bounded diameter trees » NP-hard » Small diameter bound leads to low trees weight – Less space efficient D2FA » Time-space trade-off We propose heuristic algorithm based upon Kruskal’s algorithm to create compact bounded diameter D2FAs 2 1 3 b 4 5 a d a c a b d a c b c b b a c d d d c 2 1 3 4 5 3 3 3 2 3 2 2 2 3
View Full Document