Unformatted text preview:

Lecture Outline Implementation of parsers Two approaches Top Down Parsing Top down Bottom up Top Down Adapted from Lecture by Profs Alex Aiken George Necula UCB CS780 Prasad Easier to understand and program manually Bottom Up More powerful and used by most parser generators L101TDP 1 Intro to Top Down Parsing CS780 Prasad L101TDP 2 Consider the grammar E T E T T E int int T 1 t2 Terminals are seen in order of appearance in the token stream t2 t5 t6 t8 t9 L101TDP Recursive Descent Parsing The parse tree is constructed From the top From left to right CS780 Prasad 3 4 t5 t9 7 t6 Token stream is int5 int2 Start with top level non terminal E Try the rules for E in order t8 3 CS780 Prasad L101TDP 4 1 Recursive Descent Parsing Example Cont Recursive Descent Parsing Example Cont Try E0 T1 E2 Then try a rule for T1 E3 Try E0 T1 Follow same steps as before for T1 But does not match input token int5 And succeed with T1 int T2 and T2 int With the following parse tree E0 Try T1 int Token matches But after T1 does not match input token Try T1 int T2 T1 This will match but after T1 will be unmatched int5 Has exhausted the choices for T1 Backtrack to choice for E0 CS780 Prasad L101TDP T2 int2 5 CS780 Prasad L101TDP 6 A Recursive Descent Parser Preliminaries A Recursive Descent Parser 2 Let TOKEN be the type of tokens Define boolean functions that check the token string for a match of Special tokens INT OPEN CLOSE PLUS TIMES Let the global next point to the next token A given token terminal bool term TOKEN tok return next tok A given production of S the nth bool Sn Any production of S bool S These functions advance next CS780 Prasad L101TDP 7 CS780 Prasad L101TDP 8 2 A Recursive Descent Parser 3 A Recursive Descent Parser 4 For production E T E Functions for non terminal T bool E1 return bool T1 return term OPEN E term CLOSE bool T2 return term INT term TIMES T bool T3 return term INT T term PLUS E For production E T bool E2 return T For all productions of E with backtracking bool T TOKEN save next return next save T1 next save T2 next save T3 bool E TOKEN save next return next save E1 next save E2 CS780 Prasad L101TDP 9 CS780 Prasad L101TDP Recursive Descent Parsing Notes When Recursive Descent Does Not Work To start the parser Consider a production S S a Initialize next to point to first token Invoke E bool S1 return bool S return Notice how this simulates our previous example L101TDP S term a S1 S will get into an infinite loop A left recursive grammar has a non terminal S Easy to implement by hand But does not always work CS780 Prasad 10 S S for some Recursive descent does not work in such cases 11 CS780 Prasad L101TDP 12 3 Elimination of Left Recursion More Elimination of Left Recursion Consider the left recursive grammar In general S S S generates all strings starting with a and followed by a number of Can rewrite using right recursion S S 1 S n 1 m All strings derived from S start with one of 1 m and continue with several instances of 1 n Rewrite as S 1 S m S S 1 S n S S S S S CS780 Prasad L101TDP 13 CS780 Prasad 14 L101TDP A Bb a General Left Recursion B Aa b The grammar Cf Gaussian Elimination S A A S is also left recursive because A Bb a B Bb a a b S S A Bb a B aa b Z aa b A Bb a This left recursion can also be eliminated More examples on the following slides CS780 Prasad L101TDP B Bba aa b 15 CS780 Prasad Z baZ ba L101TDP 16 4 Example Related to conversion to Griebach Normal Formal urs io n A BC B CA b C AB a lef t tin g min a El i A BC B CA b C BC B a r ec Af B f C C CAC B bC B a C bCB a R bCB a R ACBR ACB Simple and general parsing strategy Introducing terminals as first element on RHS Unpopular because of backtracking Left recursion must be eliminated first but that can be done automatically Thought to be too inefficient Cf Prolog execution strategy C bCBR aR bCB a B bcBRA aRA bCBA aA b A bcBRAC aRAC bCBAC aAC bC In practice backtracking is eliminated by restricting the grammar To enable look before you leap strategy R bCBRAC bC CBR CB CS780 Prasad Summary of Recursive Descent L101TDP 17 CS780 Prasad Predictive Parsers Like recursive descent but parser can predict which production to use L means left to right scan of input R means rightmost derivation k means predict based on k tokens of lookahead Predictive parsers accept LL k grammars L means left to right scan of input L means leftmost derivation k means predict based on k tokens of lookahead In practice LL 1 is used L101TDP 18 LL k grammars LR k grammars By looking at the next few tokens No backtracking CS780 Prasad L101TDP 19 RL 1 grammars R means right to left scan of input LR 0 LR 1 grammars SLR 1 grammars LALR 1 grammars CS780 Prasad L101TDP 20 5 LL 1 Languages Predictive Parsing and Left Factoring In recursive descent for each non terminal and input token there may be a choice of production LL 1 means that for each non terminal and token there is only one production Can be specified via 2D tables Recall the grammar One dimension for current non terminal to expand One dimension for next token A table entry contains one production CS780 Prasad L101TDP 21 E T E T T int int T E Hard to predict because For T two productions start with int For E it is not clear how to predict A grammar must be left factored before use for predictive parsing CS780 Prasad 22 L101TDP Left Factoring Example LL 1 Parsing Table Example Recall the grammar Left factored grammar E T E T T int int T E E TX T E int Y Factor out common prefixes of productions possibly introducing productions E TX X E T E int Y Y T CS780 Prasad The LL 1 parsing table int E T E CS780 Prasad TX int Y Y 23 TX X L101TDP X E Y T E T L101TDP 24 6 LL 1 Parsing Table Example Cont LL 1 Parsing Tables Errors Consider the E int entry Blank entries indicate error situations When current non terminal is E and next input is int use production E T X This production can generate an int in the first place Consider the E entry There is no way to derive a string starting with from non terminal E Consider the Y entry When current non terminal is Y and current …


View Full Document

Wright CS 780 - Top-Down Parsing

Loading Unlocking...
Login

Join to view Top-Down Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Top-Down Parsing and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?