Lecture Notes 16 Grammars and Normal Forms 1Grammars and Normal Forms Read K & S 3.7. Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no, AND if yes, describe the structure a + b * c Now it's time to worry about extracting structure (and doing so efficiently). Optimizing Context-Free Languages For regular languages: Computation = operation of FSMs. So, Optimization = Operations on FSMs: Conversion to deterministic FSMs Minimization of FSMs For context-free languages: Computation = operation of parsers. So, Optimization = Operations on languages Operations on grammars Parser design Before We Start: Operations on Grammars There are lots of ways to transform grammars so that they are more useful for a particular purpose. the basic idea: 1. Apply transformation 1 to G to get of undesirable property 1. Show that the language generated by G is unchanged. 2. Apply transformation 2 to G to get rid of undesirable property 2. Show that the language generated by G is unchanged AND that undesirable property 1 has not been reintroduced. 3. Continue until the grammar is in the desired form. Examples: • Getting rid of ε rules (nullable rules) • Getting rid of sets of rules with a common initial terminal, e.g., • A → aB, A → aC become A → aD, D → B | C • Conversion to normal formsLecture Notes 16 Grammars and Normal Forms 2Normal Forms If you want to design algorithms, it is often useful to have a limited number of input forms that you have to deal with. Normal forms are designed to do just that. Various ones have been developed for various purposes. Examples: • Clause form for logical expressions to be used in resolution theorem proving • Disjunctive normal form for database queries so that they can be entered in a query by example grid. • Various normal forms for grammars to support specific parsing techniques. Clause Form for Logical Expressions ∀x : [Roman(x) ∧ know(x, Marcus)] → [hate(x, Caesar) ∨ (∀y : ∃z : hate(y, z) → thinkcrazy(x, y))] becomes ¬Roman(x) ∨ ¬know(x, Marcus) ∨ hate(x, Caesar) ∨ ¬hate(y, z) ∨ thinkcrazy(x, z) Disjunctive Normal Form for Queries (category = fruit or category = vegetable) and (supplier = A or supplier = B) becomes (category = fruit and supplier = A) or (category = fruit and supplier = B) or (category = vegetable and supplier = A) or (category = vegetable and supplier = B) Category Supplier Price fruit A fruit B vegetable A vegetable B Normal Forms for Grammars Two of the most common are: • Chomsky Normal Form, in which all rules are of one of the following two forms: • X → a, where a ∈ Σ, or • X → BC, where B and C are nonterminals in G • Greibach Normal Form, in which all rules are of the following form: • X → a β, where a ∈ Σ and β is a (possibly empty) string of nonterminals If L is a context-free language that does not contain ε, then if G is a grammar for L, G can be rewritten into both of these normal forms.Lecture Notes 16 Grammars and Normal Forms 3What Are Normal Forms Good For? Examples: • Chomsky Normal Form: • X → a, where a ∈ Σ, or • X → BC, where B and C are nonterminals in G ♦ The branching factor is precisely 2. Tree building algorithms can take advantage of that. • Greibach Normal Form • X → a β, where a ∈ Σ and β is a (possibly empty) string of nonterminals ♦Precisely one nonterminal is generated for each rule application. This means that we can put a bound on the number of rule applications in any successful derivation. Conversion to Chomsky Normal Form Let G be a grammar for the context-free language L where ε ∉ L. We construct G', an equivalent grammar in Chomsky Normal Form by: 0. Initially, let G' = G. 1. Remove from G' all ε productions: 1.1. If there is a rule A → αBβ and B is nullable, add the rule A → αβ and delete the rule B → ε. Example: S → aA A → B | CD B → ε B → a C → BD D → b D → ε Conversion to Chomsky Normal Form 2. Remove from G' all unit productions (rules of the form A → B, where B is a nonterminal): 2.1. Remove from G' all unit productions of the form A → A. 2.2. For all nonterminals A, find all nonterminals B such that A Þ* B, A ≠ B. 2.3. Create G'' and add to it all rules in G' that are not unit productions. 2.4. For all A and B satisfying 3.2, add to G'' A → y1 | y2 | … where B → y1 | y2 | is in G". 2.5. Set G' to G''. Example: A → a A → B A → EF B → A B → CD B → C C → ab At this point, all rules whose right hand sides have length 1 are in Chomsky Normal Form.Lecture Notes 16 Grammars and Normal Forms 4Conversion to Chomsky Normal Form 3. Remove from G' all productions P whose right hand sides have length greater than 1 and include a terminal (e.g., A → aB or A → BaC): 3.1. Create a new nonterminal Ta for each terminal a in Σ. 3.2. Modify each production P by substituting Ta for each terminal a. 3.3. Add to G', for each Ta, the rule Ta → a Example: A → aB A → BaC A → BbC Ta → a Tb → b Conversion to Chomsky Normal Form 4. Remove from G' all productions P whose right hand sides have length greater than 2 (e.g., A → BCDE) 4.1. For each P of the form A → N1N2N3N4…Nn, n > 2, create new nonterminals M2, M3, … Mn-1. 4.2. Replace P with the rule A → N1M2. 4.3. Add the rules M2 → N2M3, M3 → N3M4, … Mn-1 → Nn-1Nn Example: A → BCDE (n = 4) A → BM2 M2 → C M3 M3 → DELecture Notes 17 Top Down Parsing 1 Top Down Parsing Read K & S 3.8. Read Supplementary Materials: Context-Free Languages and Pushdown Automata: Parsing, Sections 1 and 2. Do Homework 15. Parsing Two basic approaches: Top Down E Þ E Þ E E + T . + T . . id Bottom Up E E T T F F F id + id Þ id + id Þ id + id A Simple Parsing Example A simple top-down parser for arithmetic expressions, given the grammar [1] E → E + T [2] E
View Full Document