Unformatted text preview:

1CS780(Prasad) L101TDP 1Top-Down ParsingAdapted from Lecture byProfs. Alex Aiken & George Necula(UCB)CS780(Prasad) L101TDP2Lecture Outline• Implementation of parsers• Two approaches– Top-down– Bottom-up• Top-Down– Easier to understand and program manually• Bottom-Up– More powerful and used by most parser generatorsCS780(Prasad) L101TDP3Intro to Top-Down Parsing• The parse tree is constructed– From the top– From left to right• Terminals are seen in order of appearance in the token stream: t2t5t6t8t91t234t57t6t9t8CS780(Prasad) L101TDP4Recursive Descent Parsing• Consider the grammarE → T + E | TT → ( E ) | int | int * T • Token stream is: int5* int2• Start with top-level non-terminal E• Try the rules for E in order2CS780(Prasad) L101TDP5Recursive Descent Parsing. Example (Cont.)•Try E0→ T1+ E2• Then try a rule for T1 → ( E3 )–But( does not match input token int5•TryT1 → int . Token matches. –But + after T1does not match input token *•Try T1 → int * T2– This will match but + after T1will be unmatched•Has exhausted the choices for T1– Backtrack to choice for E0CS780(Prasad) L101TDP6Recursive Descent Parsing. Example (Cont.)•Try E0→ T1• Follow same steps as before for T1– And succeed with T1 → int * T2and T2 → int– With the following parse treeE0T1int5 *T2int2CS780(Prasad) L101TDP7A Recursive Descent Parser. Preliminaries• Let TOKEN be the type of tokens– Special tokens INT, OPEN, CLOSE, PLUS, TIMES• Let the global next point to the next tokenCS780(Prasad) L101TDP8A Recursive Descent Parser (2)• Define boolean functions that check the token string for a match of– A given token terminalbool term(TOKEN tok) { return *next++ == tok; }– A given production of S (the nth)bool Sn() { … }– Any production of S: bool S() { … }• These functions advance next3CS780(Prasad) L101TDP9A Recursive Descent Parser (3)• For production E → T + Ebool E1() { return T() && term(PLUS) && E(); }• For production E → Tbool E2() { return T(); }For all productions of E (with backtracking)bool E() {TOKEN *save = next;return (next = save, E1()) || (next = save, E2()); }CS780(Prasad) L101TDP10A Recursive Descent Parser (4)• Functions for non-terminal Tbool T1() { return term(OPEN) && E() && term(CLOSE); }bool T2() { return term(INT) && term(TIMES) && T(); }bool T3() { return term(INT); }bool T() {TOKEN *save = next;return (next = save, T1()) || (next = save, T2()) || (next = save, T3()); }CS780(Prasad) L101TDP11Recursive Descent Parsing. Notes.• To start the parser –Initialize next to point to first token–Invoke E()• Notice how this simulates our previous example.• Easy to implement by hand• But does not always work …CS780(Prasad) L101TDP12When Recursive Descent Does Not Work• Consider a production S → S abool S1() { return S() && term(a); } bool S() { return S1(); }•S()will get into an infinite loop• A left-recursive grammar has a non-terminal SS →+Sα for some α• Recursive descent does not work in such cases.4CS780(Prasad) L101TDP13Elimination of Left Recursion• Consider the left-recursive grammarS → S α | β• S generates all strings starting with a β and followed by a number of αβα∗• Can rewrite using right-recursionS →βS’S’ →αS’ | εCS780(Prasad) L101TDP14More Elimination of Left-Recursion• In generalS → S α1| … | S αn| β1| … | βm• All strings derived from S start with one of β1,…,βmand continue with several instances ofα1,…,αn•Rewrite asS →β1S’ | … | βmS’S’ →α1S’ | … | αn S’ | εCS780(Prasad) L101TDP15General Left Recursion•The grammar S → A α | δA → S βis also left-recursive becauseS →+S βα• This left-recursion can also be eliminated. • More examples on the following slides.CS780(Prasad) L101TDP16bAaBaBbA||→→baaBbBaBbA|)|(|→→(Cf.Gaussian Elimination)babaZZbaaZbaaBaBbA|)|(|)|(|→→→baaBbaBaBbA|||→→5CS780(Prasad) L101TDP17Example: Related to conversion to Griebach Normal FormalaABCbCABBCA||→→→CBA ffaBBCCbCABBCA||→→→aBCbBCCAC ||→ACBRACBRa|bCB | RabCBC| )|(→→Eliminating left recursionIntroducing terminals as first element on RHS)|)(|...|(||| |||| ||||CBCBRbCbCBRACRbCaACbCBACaRACbcBRACAbaAbCBAaRAbcBRABabCBaRbCBRC→→→→CS780(Prasad) L101TDP18Summary of Recursive Descent• Simple and general parsing strategy– Left-recursion must be eliminated first– … but that can be done automatically• Unpopular because of backtracking– Thought to be too inefficient– Cf. Prolog execution strategy• In practice, backtracking is eliminated by restricting the grammar– To enable “look-before-you-leap” strategyCS780(Prasad) L101TDP19Predictive Parsers• Like recursive-descent but parser can “predict” which production to use.– By looking at the next few tokens.–No backtracking. • Predictive parsers accept LL(k) grammars.–Lmeans “left-to-right” scan of input.–Lmeans “leftmost derivation”.–kmeans “predict based on k tokens of lookahead”.• In practice, LL(1) is used.CS780(Prasad) L101TDP20•LL(k) grammars•LR(k) grammars–Lmeans “left-to-right” scan of input–Rmeans “rightmost derivation”–kmeans “predict based on k tokens of lookahead”•RL(1) grammars–Rmeans “right-to-left” scan of input•LR(0) , LR(1) grammars•SLR(1) grammars, LALR(1) grammars()6CS780(Prasad) L101TDP21LL(1) Languages• In recursive-descent, for each non-terminal and input token there may be a choice of production.• LL(1) means that for each non-terminal and token there is only one production.• Can be specified via 2D tables.– One dimension for current non-terminal to expand.– One dimension for next token.– A table entry contains one production.CS780(Prasad) L101TDP22Predictive Parsing and Left Factoring• Recall the grammarE → T + E | TT → int | int * T | ( E )•Hard to predict because–For T, two productions start with int.–For E, it is not clear how to predict.•A grammar must be left-factoredbefore use for predictive parsing.CS780(Prasad) L101TDP23Left-Factoring Example• Recall the grammarE → T + E | TT → int | int * T | ( E )• Factor out common prefixes of productions, possibly introducing ε-productionsE → T XX → + E | εT → ( E ) | int YY → * T | εCS780(Prasad) L101TDP24LL(1) Parsing Table Example• Left-factored grammarE → T X


View Full Document

Wright CS 780 - Top-Down Parsing

Download Top-Down Parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Top-Down Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Top-Down Parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?