Unformatted text preview:

Outline Regular languages revisited Parser overview Context free Grammars Context free grammars CFGs Adapted from material by Derivations Prof Alex Aiken and Prof George Necula UCB Ambiguity CS780 Prasad L6CFG 1 Regularity CS780 Prasad L6CFG 2 Parsing Example Languages requiring counting modulo a fixed number are regular Finite automaton cannot count without limit or remember number of times it has visited a particular state Many languages are not regular E g Language of balanced parentheses is not regular Cool if x y then 1 else 2 fi Parser input IF ID ID THEN INT ELSE INT FI Parser output IF THEN ELSE INT i 0 INT i i CS780 Prasad L6CFG ID 3 CS780 Prasad ID L6CFG 4 1 Context Free Grammars Input to and output of a parser Programming language constructs have recursive structure An EXPR is Input 123 3 23 6 minus op left paren op num 123 3 plus op num 23 6 right paren op if EXPR then EXPR else EXPR fi while EXPR loop EXPR pool Context free grammars are a natural notation for this recursive structure Iteration Regular Expression Tail Recursion Regular Grammar General Recursion Context free Grammar Parse Tree Syntax Analyzer Parser Token Stream CS780 Prasad 123 3 23 6 L6CFG 5 CS780 Prasad L6CFG 6 CFG N T P S N Finite set of variables non terminals T Alphabet Finite set of terminals P Finite set of rules productions S Start symbol S N N T Rule A A N N T CS780 Prasad L6CFG a represents a context free language because we can write a CFG for it S a S S aS S ContextContext freeness An A rule can be applied whenever A occurs in a string irrespective of the context that is non terminals and terminals around A Cf context sensitive grammar declare use 7 CS780 Prasad L6CFG 8 2 Examples of CFGs A fragment of Cool Balanced Parenthesis Grammar EXPR if EXPR then EXPR else EXPR fi while EXPR loop EXPR pool S S S Simple arithmetic expressions id E E E Non terminals are written in upper case Terminals are in lower case E E The start symbol is the left hand side of the first production E id CS780 Prasad L6CFG 9 CS780 Prasad L6CFG 10 From CFG to Language One step Derivation Example Hierarchical Scope Procedure foo integer m integer n integer j for i 1 to n do if i j j j 1 m i j for k i to n m m k uAv u v A w is derivable from v in CFG if there is a finite sequence of rule applications such that v w1 wn w Balanced Parentheses Problem Example CS780 Prasad L6CFG 11 CS780 Prasad L6CFG 12 3 Let G N P S be a CFG w N T is a sentential form if S G w w T is a sentence if S G w The language of G L G w T S G w Cool Example Some terminal strings in the Cool language id if id then id else id fi while id loop id pool G S a b S S aSb S if while id loop id pool then id else id if if id then id else id fi then id else id fi L G a n bn n 0 CS780 Prasad L6CFG 13 CS780 Prasad L6CFG Notes on Parser Derivations and Parse Trees Parser checks the membership in a language constructs a parse tree for the input Parser must handle errors gracefully Parser generators implement CFG s e g bison A derivation is a sequence of production applications A derivation can be drawn as a tree Form of the grammar is important Many grammars generate the same language Tools are sensitive to the grammar CS780 Prasad L6CFG 15 14 Start symbol is the tree s root X Y1 LYn For a production Y1 LYn add children to parent node X CS780 Prasad L6CFG 16 4 Derivation Example Derivation E Grammar E E E E E E id String id E E id id id CS780 Prasad L6CFG E E E E E E 17 Derivation in Detail 1 id id E E id id id id CS780 Prasad E E id id L6CFG 18 Derivation in Detail 2 E E E E E CS780 Prasad E E E E L6CFG 19 CS780 Prasad L6CFG 20 5 Derivation in Detail 3 Derivation in Detail 4 E E E E E E CS780 Prasad E E E E E E E E E E id E E L6CFG 21 Derivation in Detail 5 id E E id id E CS780 Prasad E E id L6CFG E id L6CFG 22 E E E E E E E E Derivation in Detail 6 E E E CS780 Prasad E E E E E E E E E id E E E id id E id id id id 23 CS780 Prasad E E id L6CFG E E id id 24 6 Notes on Derivations Left most and Right most Derivations A parse tree has The previous example is a left most derivation Terminals at the leaves Non terminals at the interior nodes L6CFG E E E id At each step replace the left most nonterminal An in order traversal of the leaves is the original input The parse tree shows the association of operations the input string does not CS780 Prasad E E E id E id id There is an equivalent notion of a right most derivation 25 Right most Derivation in Detail 1 CS780 Prasad id id id L6CFG 26 Right most Derivation in Detail 2 E E E CS780 Prasad E E E E E L6CFG 27 CS780 Prasad L6CFG 28 7 Right most Derivation in Detail 3 Right most Derivation in Detail 4 E E E E E E E E id L6CFG 29 E E E E E E id id E E E E E id id E E id E id id id L6CFG CS780 Prasad id L6CFG 30 E E E CS780 Prasad Right most Derivation in Detail 6 E E E id E E E E id Right most Derivation in Detail 5 E id E E E id E id CS780 Prasad E id id id 31 CS780 Prasad E E id L6CFG E E id id 32 8 Summary of Derivations Ambiguity Note that right most and left most derivations have the same parse tree the difference is the order in which the branches are added We are not just interested in whether s L G This string has two parse trees E E E E E E id id E E id id id id We need a parse tree for s A derivation defines a parse tree but one parse tree may have many associated derivations Left most and right most derivations are important in parser implementation CS780 Prasad L6CFG 33 E CS780 Prasad L6CFG E 34 A grammar is ambiguous if it has more than one parse tree for a string Ambiguity is BAD because it leaves meaning of some programs ill defined Ambiguity can be avoided by rewriting the …


View Full Document

Wright CS 780 - Context-free Grammars

Loading Unlocking...
Login

Join to view Context-free Grammars and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Context-free Grammars and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?