CS 172: Computability and ComplexityMinimization of DFAsSanjit A. SeshiaEECS, UC BerkeleyAcknowledgments: L.von Ahn, L. Blum, M. Blum, A. SinclairS. A. Seshia 2What is Minimization?Minimized DFA for language L=DFA with fewest states that recognizes LAlso called minimal DFAS. A. Seshia 3Why is Minimization Important?DFAs are how computers manipulate regular languages (expressions)DFA size determines space/time efficiencyS. A. Seshia 4IS THIS MINIMAL????11110000NOS. A. Seshia 5HOW ABOUT THIS?0101YESS. A. Seshia 6EquivalentDFAs010111110000S. A. Seshia 7Main Result of this LectureFor every regular language L, there exists a unique, minimal DFAthat recognizes L• uniqueness up to re-labeling of statesS. A. Seshia 8Words States• Let DFA M = (Q, Σ, δ, q0, F)• Each word w in Σ*corresponds to a unique state in Q– The ending state of M on w • Given x, y ∈∈∈∈ Σ*– x ∼∼∼∼My iffM ends in the same state on both x and y– ∼∼∼∼Mis an equivalence relation (why?)– How many equivalence classes are there?S. A. Seshia 9Example: Is 1 ∼∼∼∼M11? 10 ∼∼∼∼M00?11110000S. A. Seshia 10Indistinguishable Words/Strings• Let DFA M = (Q, Σ, δ, q0, F) recognize L• Given x, y ∈∈∈∈ Σ*– x ∼∼∼∼Ly (x and y are indistinguishable) iff∀∀∀∀ z ∈∈∈∈ ΣΣΣΣ*, xz ∈∈∈∈ L iff yz ∈∈∈∈ LCompare with– x ∼∼∼∼My iffM ends in the same state on both x and yS. A. Seshia 11Example: What are indistinguishable words?11110000S. A. Seshia 12∼Land ∼M• Let DFA M = (Q, Σ, δ, q0, F) recognize L• Given x, y ∈∈∈∈ Σ*– x ∼∼∼∼Ly (x and y are indistinguishable) iff∀∀∀∀ z ∈∈∈∈ ΣΣΣΣ*, xz ∈∈∈∈ L iff yz ∈∈∈∈ L– x ∼∼∼∼My iffM ends in the same state on both x and y– True or False:• If x ∼∼∼∼My then x ∼∼∼∼Ly • If x ∼∼∼∼Ly then x ∼∼∼∼MyTRUEFALSES. A. Seshia 13Indistinguishable Words• Let DFA M = (Q, Σ, δ, q0, F) recognize L• Given x, y ∈∈∈∈ Σ*– x ∼∼∼∼Ly (x and y are indistinguishable) iff∀∀∀∀ z ∈∈∈∈ ΣΣΣΣ*, xz ∈∈∈∈ L iff yz ∈∈∈∈ L– x ∼∼∼∼My iffM ends in the same state on both x and yWhich has more equivalence classes --∼∼∼∼Mor ∼∼∼∼L?S. A. Seshia 14Myhill-NerodeTheorem (a version)The relation ∼∼∼∼Ldefines a DFA MLfor language L where the states of MLcorrespond to the equivalence classes of ∼∼∼∼LMLis the unique, minimal DFA for L(up to isomorphism)S. A. Seshia 15Proof of Myhill-Nerode Thm.S. A. Seshia 16Next: Algorithm for DFA MinimizationS. A. Seshia 17Indistinguishable States• Idea: Merge “indistinguishable states”• Recall:– States of DFA M map 1-1 to equivalence classes of ∼∼∼∼M– Each equivalence class of ∼∼∼∼Mis in some equivalence class of ∼∼∼∼L• States p and q are indistinguishable ifftheir corresponding ∼∼∼∼Mequivalence classes are in the same class of ∼∼∼∼L– We write p ∼∼∼∼ q – p ~ q “p and q are distinguishable”S. A. Seshia 18Input: DFA MOutput: DFA ML such that:M ≡≡≡≡ MLML has no unreachable statesML is irreduciblestates of MLare pairwise distinguishable||Theorem: MLis the unique minimumThe Algorithm We WantS. A. Seshia 19DFA Minimization Algo.: Idea• States of MLare equivalence classes of ∼∼∼∼L• Equivalence classes of ∼∼∼∼Lcan be obtained by merging states of M• Our algorithm works in reverse:– Start by assuming all states as being merged together– Identify pairs of distinguishable states • Repeat until no new distinguishable state-pairs identifiedS. A. Seshia 20TABLE-FILLING ALGORITHMInput: DFA M = (Q, Σ, δδδδ, q0, F) Output:States of ML= { [q] | q ∈∈∈∈ Q }Table: { (p,q) | p,q ∈∈∈∈ Q and p ~ q }/q0q1qiqnq0q1qiqnBase Case: p accepts and q rejects ⇒⇒⇒⇒ p ~ q/d d d ddddRecursion:pp′′′′qq′′′′~/σσσσσσσσ⇒⇒⇒⇒p ~ q/S. A. Seshia 21101010,10q0q0q1q1q2q2q3q3q0q1q2q3d d ddddExampleS. A. Seshia 2211110000q0q0q1q1q2q2q3q3d dddq0q1q2q3S. A. Seshia 23101010,10q0q1q3q5q401q20,1q0q0q1q1q3q3q4q4q5q5Do Try this at Home!S. A. Seshia 24Correctness of Algorithm1.If algorithm marks (p, q) as “d”, then p ~ q2.If p ~ q, then algorithm marks (p, q) as “d”Proving (1) is easy. Use induction on the step at which (p, q) was marked “d”.S. A. Seshia 25Part (2): If p ~ q, then the algorithm marks (p, q) as “d”Of all such “bad pairs” (p, q), let p, q be a pair with the smallest |w| Proof (by contradiction):Suppose p ~ q, but the algorithm does not mark (p, q) as “d”δδδδ(p, w) ∈∈∈∈ F and δδδδ(q, w) ∉∉∉∉ F ^ ^Since p ~ q there exists w such that:S. A. Seshia 26If p ~ q, then the algorithm marks (p, q) as “d”Proof (by contradiction):Suppose p ~ q, but the algorithm does not mark (p, q) as “d”δδδδ(p, w) ∈∈∈∈ F and δδδδ(q, w) ∉∉∉∉ F ^ ^Of all such “bad pairs” (p, q), let p, q be a pair with the smallest |w| w = σσσσw′′′′, where σσσσ ∈∈∈∈ Σ (w is not εεεε, why?)Let p′′′′ = δδδδ(p,σσσσ) and q′′′′ = δδδδ(q,σσσσ)Then (p′′′′, q′′′′) is also a bad pairContradiction! (why?)S. A. Seshia 27Complexity of Algorithm• For DFA M, let– Number of states of M be n– Size of the input alphabet Σ be k• Initialization of table: O(n2)• Rest of the algorithm: O(k n2)S. A. Seshia 28Minimal NFA is NOT Unique0000S. A. Seshia 29Next Steps• Read Sipser 2.1 in preparation for next
View Full Document