DOC PREVIEW
Berkeley COMPSCI 172 - Minimizing Finite Automata

This preview shows page 1 out of 3 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS–172 Computability & Complexity, Fall 2004Minimizing Finite AutomataGiven a DFA M , can we find an equivalent DFA (i.e., one that recognizes the same language as M ) with theminimum possible number of states? This is a very natural question, and has important applications to theefficiency of procedures that use finite automata. In this note, we will see an efficient algorithm that solvesthis problem.To begin, we need a couple of definitions. Let M be any DFA with alphabet Σ. Then M naturally definesan equivalence relation ∼Mover Σ∗, given byx ∼My iff M ends in the same state on inputs x and y.Note that the number of equivalence classes is finite (being equal to the number of states of M).Now let L = L(M) be the language recognized by M . This language also defines a natural equivalencerelation ∼L, as follows. Call two strings x, y ∈ Σ∗indistinguishable by L if, for all z ∈ Σ∗, xz ∈ L ⇔yz ∈ L. Otherwise we say that x and y are distinguishable. Then the relation ∼Lis defined byx ∼Ly iff x and y are indistinguishable.Our first observation is that ∼Mis a refinement of ∼L; in other words, each equivalence class of ∼Miscontained inside an equivalence class of ∼L(i.e., in other words, each equivalence class of ∼Lconsists ofthe union of some equivalence classes of ∼M):Proposition 1 ∼Mis a refinement of ∼L.Proof: Suppose that x ∼My. Then, on inputs x and y, machine M ends up in the same state. But thismeans that, for any z ∈ Σ∗, on inputs xz and yz, M must also end up in the same state, and hence xz andyz are either both in L or both not in L. Thus x and y are indistinguishable, so x ∼Ly.Note that Proposition 1 implies that the number of equivalence classes of ∼Lis at most the number of statesof M, which is finite.Now we are in a position to prove our first main result, which is a version of the so-called Myhill-NerodeTheorem:Theorem 2 The relation ∼Ldefines a DFA M′for L whose states correspond to the equivalence classesof ∼L. Moreover, this is the unique minimum DFA for L (up to isomorphism).Proof: We define the DFA M′= (Q′, Σ, q′0, F′, δ′) as follows. The set of states is Q′= {[x] : x ∈ Σ∗},where [x] denotes the equivalence class of x under the relation ∼L. The initial state is q′0= [ǫ] (theequivalence class of the empty string), and the accepting states are F′= {[x] : x ∈ L} (the equivalenceclasses of all strings in L). Finally, the transition function is defined byδ′([x], a) = [xa].1We have to be careful here! We must check that δ′is well-defined, because if y is in the same equivalenceclass as x then [x] = [y], and our definition would imply that δ′([x], a) = δ′([y], a) = [ya]. So we need tocheck that [xa] = [ya], so that our definition is consistent.But this follows since if x, y are in the same equivalence class then xaz ∈ L ⇔ yaz ∈ L for any a ∈ Σ andany z ∈ Σ∗, and hence xa and ya are also in the same equivalence class, i.e., [xa] = [ya]. So our definitionis OK.Now it is obvious that the machine M′accepts L, since by our definition of δ′, on input x it will end up instate [x], and this state is accepting if and only if x ∈ L.Finally, to see that M′has the minimum possible number of states, let M be any other machine that ac-cepts L. By Proposition 1, ∼Mis a refinement of ∼L, so ∼Mhas at least as many equivalence classesas ∼L. Hence M has at least as many states as M′. And if M has the same number of states, then theequivalence relations ∼Mand ∼Lare in fact identical.The Myhill-Nerode Theorem suggests an approach to finding the minimal DFA for a given language L.Suppose we are given a DFA M for L. By the Myhill-Nerode Theorem, we can think of each state of theminimal automaton, M′, as an equivalence class of ∼L, which in turn, by Proposition 1, is a collection ofequivalence classes of ∼M, i.e., a collection of states of M. Thus, to construct M′from M , we need tomerge together states of M until no further merging is possible.When should two states of M be merged? Well, p and q should be merged iff their equivalence classesunder ∼Mbelong to the same equivalence class under ∼L. Let x, y be in the equivalence classes p, qrespectively of ∼M; i.e., on inputs x, y, M ends up in states p, q respectively. But x ∼Ly iff, for all z, thestrings xz, yz are either both in L or both not in L. Hence we should merge p and q iff, for all strings z,the computations of M on z starting in states p and q either both lead to an accepting state or both to anon-accepting state. We will call states p and q indistinguishable in this case. Otherwise we say that p and qare distinguishable; note that this means there exists a string z which takes M from state p to an acceptingstate and from state q to a non-accepting state (or vice versa).We are now ready to present our algorithm for minimizing finite automata. The input is a DFA M =(Q, Σ, q0, F, δ); the output is a minimal DFA M′that accepts the same language as M . Rather than mergingindistinguishable states of M′, the algorithm instead proceeds by first assuming that all states of M′areindistinguishable, and successively identifying pairs of states that are distinguishable.The algorithm maintains a table of all unordered pairs of states of M . Each entry is binary: it is either“marked” or “unmarked.” The algorithm also maintains, for each pair, a list of other pairs; the role of theselists will become clear in a moment. Initially all table entries are unmarked, and all lists are empty. Thealgorithm begins by marking all pairs {p, q} such that p ∈ F and q ∈ Q − F ; clearly all these pairs aredistinguishable (by the empty string). It then cycles once through all other pairs of states (in any order).For each such pair {p, q}, it considers the transitions on each letter a ∈ Σ. If the pair {δ(p, a), δ(q, a)} ismarked, then p and q are distinguishable (by the string az, where z is a string that distinguishes δ(p, a) andδ(q, a)), so the algorithm marks {p, q}; if not, then the algorithm places the pair {p, q} on a temporary listassociated with the pair {δ(p, a), δ(q, a)} — this means that, if this latter pair ever gets marked, then {p, q}will be marked also. This is ensured because, whenever the algorithm marks a pair, it proceeds to mark allpairs on the associated list (and, recursively, all pairs on their lists, etc.)2Here is the full algorithm:for {p, q} with p ∈ F


View Full Document

Berkeley COMPSCI 172 - Minimizing Finite Automata

Download Minimizing Finite Automata
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Minimizing Finite Automata and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Minimizing Finite Automata 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?