DOC PREVIEW
UA CSC 453 - Top-Down Parsing

This preview shows page 1-2-3-27-28-29 out of 29 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSc 453Compilers and Systems Software7 : Top-Down Parsing IIDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergTop-Down ParsingThe parse tree is constructedFrom the topFrom left to right.The terminals are seen in order of appearance in the tokenstream (t2, t5, t6, t8, t9):1743t2t5t8t6t9Building The LL(1) Parser1Remove left recursion.2Left factor the grammar.3Construct transition diagrams for each production:ELSEIFExpr()THENStat() Stat()1Compute FIRST(A) for each grammar symbol A.2Compute FOLLOW(A) for each nonterminal A.4Simplify the transition diagrams.5Construct the recursive procedures.FIRST SetsFIRST(α) is the set of terminals that begin strings derivedfrom α.FIRST-sets help us solve problem 3.prog → decl | statstat→ if · · · | id() | while · · ·decl → int id | real idFIRST(stat) = {if, id, while}FIRST(decl) = {int, real}FIRST Sets. . .PROCEDURE prog ();IF currtok ∈ {if, id, while} THEN stat();ELSIF currtok ∈ {int, real} THEN decl()ELSE syntax error ENDIF;END;PROCEDURE stat (); · · · END;PROCEDURE decl (); · · · END;FIRST(stat) = {if, id, while}FIRST(decl) = {int, real}FIRST Sets. . .FIRST(α) is the set of terminals that begin strings derivedfrom α, ie. FIRST(α) = {a| a a terminal, α∗⇒ aβ}.Example GrammarE → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | idFIRST sets:FIRST(E) = {(, id}FIRST(T ) = {(, id}FIRST(F ) = {(, id}FIRST(E′) = {+, ǫ}FIRST(T′) = {*, ǫ}Computing FIRST(A)REPEAT until no more changes:1.IF A is a terminal THEN FIRST(A) = {A}.2.IF A is a nonterminal, and thereis a production A → ǫ THEN ǫ is in FIRST(A).3.IF A is a nonterminal, and thereis a production A → Y1· · · YkTHENFOR i := 1 to k − 1 DOIF ǫ ∈ FIRST(Y1) ∧ · · · ∧ ǫ ∈ FIRST(Yi)and a (6= ǫ) ∈ FIRST(Yi +1) THEN a is in FIRST(A);IF ǫ ∈ FIRST(Y1) ∧ · · · ∧ ǫ ∈ FIRST(Yk) THENǫ is in FIRST(A);E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id1FIRST(+) = {+}, FIRST(*) = {*}, etc., because of point 1.2ǫ ∈ FIRST(E′), ǫ ∈ FIRST(T′), because of point 2.3(, id ∈ FIRST(F ), because of point 3.4(, id ∈ FIRST(T ), because (, id ∈ FIRST(F ), and T → F T′.5(, id ∈ FIRST(E ), because (, id ∈ FIRST(T ), and E → T E′.6+ ∈ FIRST(E′), * ∈ FIRST(T′), because E′→ + T E′andT′→ * F T′.FOLLOW SetsWe let $ symbolize end-of-input.FOLLOW(A) is the set of terminals that can follow right afterthe nonterminal A in some sentential form.FOLLOW(A) = {a| a a terminal, S∗⇒ αAaβ}.$ ∈ FOLLOW(A) if A is the rightmost symbol in a sententialform, i.e. S∗⇒ αA.FOLLOW Sets. . .E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | idFOLLOW(E) = {), $}FOLLOW(T ) = {+, ), $}FOLLOW(F ) = {+, *, ), $}FOLLOW(F ) = {+, *, ), $}FOLLOW(E′) = {), $}}FOLLOW(T′) = {+, ), $}}FOLLOW Sets. . .E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id) is in FOLLOW(E), becauseE ⇒ TE′⇒ FT′E′⇒ (E ) T′E′.+ is in FOLLOW(T ), becauseE ⇒ TE′⇒ T + TE′.FOLLOW Sets. . .E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id* is in FOLLOW(F ), becauseE ⇒ TE′⇒ FT′E′⇒ F * FT′E′.) is in FOLLOW(F ), becauseE ⇒ TE′⇒ FT′E′⇒ (E )T′E′⇒(TE′)T′E′⇒ (FT′E′)T′E′⇒(FE′)T′E′⇒ (F ) T′E′.Computing FOLLOW SetsLet S be the start symbol and $ the end-of-file marker.REPEAT until no more changes:1. Add $to FOLLOW(S).2. IF there is a production A → αBβ THENAdd everything in FIRST(β) (except ǫ) to FOLLOW(B).3. IF there is a production A → αB ORa production A → αBβ where ǫ ∈ FIRST(β) THENAdd everything in FOLLOW(A) to FOLLOW(B).Computing FOLLOW Sets. . .BBFIRST(β) ⊆ FOLLOW(B)FOLLOW(A) ⊆ FOLLOW(B)ββǫSAALL(1) GrammarsA grammar is LL(1) if we can construct a recursive descentparser that handles it (without using backtracking).LL(1) stands forThe input is scanned from L eft-to-right.The parse produces a L eftmost derivation.We have 1 -token lookahead.LL(1) Grammars. . .Formal Definition:A grammar is LL(1) iff for any two productionsA → α | βthe following conditions hold1FIRST(α) ∩ FIRST(β) = ∅2If β∗⇒ ǫ then FIRST(α) ∩ FOLLOW(A) = ∅Recursive Descent ParsersFOR each non-terminal A DOcreate initial and final states;FOR each production A → X1· · · XnDOcreate a path A’s initial to A’s final nodewith edges labeled X1· · · Xn:· · ·XnX2X1;Simplify the diagrams;FOR each transition diagram P DOCreate a procedure P that ‘‘traverses’’ the diagramguided by the input.Recursive Descent Parsers. . .non−terminalatokens2Brs1PROCEDURE P();IF currtok = a THEN curr tok := next token()ELSE syntaxerror; ENDIFIF currtok ∈ FIRST(s1) THEN code for parsing s1ELSIF currtok ∈ FIRST(s2) THEN code for parsing s2ELSE syntaxerror; ENDIFB();WHILE curr tok ∈ FIRST(r) DO code for parsing r ENDDOEND P;E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id*( )id+T E′F T′EE :E′:T :T′:F :ǫT E′F T′ǫ+( )id*E′:F T′T E′E :TǫǫEF :T′:ǫǫFT :( )id+*FF T′T E′E :EF :ǫTT :E′:T′:ǫ( )id+*FEF :ǫTǫFT :E :T( )id+*ǫEF :T :E :TFǫ+ǫE :TPROCEDURE E();LOOPT();IF curtok = + THENcur tok := next token()ELSE EXIT ENDIFENDLOOP;*ǫT :FPROCEDURE T();LOOPF();IF curtok = * THENcurtok := next token()ELSE EXIT ENDIFENDLOOP;( )idF :EPROCEDURE F();IF curtok = ( THENcurtok := next token();E();IF curtok = ) THENcurtok := next token()ELSE syntax error ENDIFELSIF cur tok = id THENcur tok := next token();ELSE syntax error ENDIFParsing ExpressionsHere’s the parser coded in Java.class ExprH {static String[] input = {"i","+","i","*","i","$"};static int current = 0;static boolean lookahead(String S) {return input[current].equals(S);}static void match(String S) throws Exception {if (input[current].equals(S)) current++;else throw new Exception();}static void E() throws Exception {T(); while(lookahead("+")) {match("+"); T();}}static void T() throws Exception {F(); while(lookahead("*")) {match("*"); F();}}static void F() throws Exception {if (lookahead("(")) {match("("); E(); match(")");} else match("i");}public static void main(String[] args) {try {E(); System.out.println("true");} catch (Exception e){System.out.println("false");}}Readings and ReferencesRead Louden, pp. 143–196.Or, the Dragon Book:Top-Down Parsing 181–190Error Recovery


View Full Document

UA CSC 453 - Top-Down Parsing

Download Top-Down Parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Top-Down Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Top-Down Parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?