UA CSC 453 - Top-Down Parsing - D2562239

Home> Schools> University of Arizona> Computer Science Main (CSC) > CSC 453> Top-Down Parsing

DOC PREVIEW

UA CSC 453 - Top-Down Parsing

School name University of Arizona

Course Csc 453- Compilers and Systems Software

Pages 29

This preview shows page 1-2-3-27-28-29 out of 29 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 29 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CSc 453Compilers and Systems Software7 : Top-Down Parsing IIDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergTop-Down ParsingThe parse tree is constructedFrom the topFrom left to right.The terminals are seen in order of appearance in the tokenstream (t2, t5, t6, t8, t9):1743t2t5t8t6t9Building The LL(1) Parser1Remove left recursion.2Left factor the grammar.3Construct transition diagrams for each production:ELSEIFExpr()THENStat() Stat()1Compute FIRST(A) for each grammar symbol A.2Compute FOLLOW(A) for each nonterminal A.4Simplify the transition diagrams.5Construct the recursive procedures.FIRST SetsFIRST(α) is the set of terminals that begin strings derivedfrom α.FIRST-sets help us solve problem 3.prog → decl | statstat→ if · · · | id() | while · · ·decl → int id | real idFIRST(stat) = {if, id, while}FIRST(decl) = {int, real}FIRST Sets. . .PROCEDURE prog ();IF currtok ∈ {if, id, while} THEN stat();ELSIF currtok ∈ {int, real} THEN decl()ELSE syntax error ENDIF;END;PROCEDURE stat (); · · · END;PROCEDURE decl (); · · · END;FIRST(stat) = {if, id, while}FIRST(decl) = {int, real}FIRST Sets. . .FIRST(α) is the set of terminals that begin strings derivedfrom α, ie. FIRST(α) = {a| a a terminal, α∗⇒ aβ}.Example GrammarE → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | idFIRST sets:FIRST(E) = {(, id}FIRST(T ) = {(, id}FIRST(F ) = {(, id}FIRST(E′) = {+, ǫ}FIRST(T′) = {*, ǫ}Computing FIRST(A)REPEAT until no more changes:1.IF A is a terminal THEN FIRST(A) = {A}.2.IF A is a nonterminal, and thereis a production A → ǫ THEN ǫ is in FIRST(A).3.IF A is a nonterminal, and thereis a production A → Y1· · · YkTHENFOR i := 1 to k − 1 DOIF ǫ ∈ FIRST(Y1) ∧ · · · ∧ ǫ ∈ FIRST(Yi)and a (6= ǫ) ∈ FIRST(Yi +1) THEN a is in FIRST(A);IF ǫ ∈ FIRST(Y1) ∧ · · · ∧ ǫ ∈ FIRST(Yk) THENǫ is in FIRST(A);E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id1FIRST(+) = {+}, FIRST(*) = {*}, etc., because of point 1.2ǫ ∈ FIRST(E′), ǫ ∈ FIRST(T′), because of point 2.3(, id ∈ FIRST(F ), because of point 3.4(, id ∈ FIRST(T ), because (, id ∈ FIRST(F ), and T → F T′.5(, id ∈ FIRST(E ), because (, id ∈ FIRST(T ), and E → T E′.6+ ∈ FIRST(E′), * ∈ FIRST(T′), because E′→ + T E′andT′→ * F T′.FOLLOW SetsWe let $ symbolize end-of-input.FOLLOW(A) is the set of terminals that can follow right afterthe nonterminal A in some sentential form.FOLLOW(A) = {a| a a terminal, S∗⇒ αAaβ}.$ ∈ FOLLOW(A) if A is the rightmost symbol in a sententialform, i.e. S∗⇒ αA.FOLLOW Sets. . .E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | idFOLLOW(E) = {), $}FOLLOW(T ) = {+, ), $}FOLLOW(F ) = {+, *, ), $}FOLLOW(F ) = {+, *, ), $}FOLLOW(E′) = {), $}}FOLLOW(T′) = {+, ), $}}FOLLOW Sets. . .E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id) is in FOLLOW(E), becauseE ⇒ TE′⇒ FT′E′⇒ (E ) T′E′.+ is in FOLLOW(T ), becauseE ⇒ TE′⇒ T + TE′.FOLLOW Sets. . .E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id* is in FOLLOW(F ), becauseE ⇒ TE′⇒ FT′E′⇒ F * FT′E′.) is in FOLLOW(F ), becauseE ⇒ TE′⇒ FT′E′⇒ (E )T′E′⇒(TE′)T′E′⇒ (FT′E′)T′E′⇒(FE′)T′E′⇒ (F ) T′E′.Computing FOLLOW SetsLet S be the start symbol and $ the end-of-file marker.REPEAT until no more changes:1. Add $to FOLLOW(S).2. IF there is a production A → αBβ THENAdd everything in FIRST(β) (except ǫ) to FOLLOW(B).3. IF there is a production A → αB ORa production A → αBβ where ǫ ∈ FIRST(β) THENAdd everything in FOLLOW(A) to FOLLOW(B).Computing FOLLOW Sets. . .BBFIRST(β) ⊆ FOLLOW(B)FOLLOW(A) ⊆ FOLLOW(B)ββǫSAALL(1) GrammarsA grammar is LL(1) if we can construct a recursive descentparser that handles it (without using backtracking).LL(1) stands forThe input is scanned from L eft-to-right.The parse produces a L eftmost derivation.We have 1 -token lookahead.LL(1) Grammars. . .Formal Definition:A grammar is LL(1) iff for any two productionsA → α | βthe following conditions hold1FIRST(α) ∩ FIRST(β) = ∅2If β∗⇒ ǫ then FIRST(α) ∩ FOLLOW(A) = ∅Recursive Descent ParsersFOR each non-terminal A DOcreate initial and final states;FOR each production A → X1· · · XnDOcreate a path A’s initial to A’s final nodewith edges labeled X1· · · Xn:· · ·XnX2X1;Simplify the diagrams;FOR each transition diagram P DOCreate a procedure P that ‘‘traverses’’ the diagramguided by the input.Recursive Descent Parsers. . .non−terminalatokens2Brs1PROCEDURE P();IF currtok = a THEN curr tok := next token()ELSE syntaxerror; ENDIFIF currtok ∈ FIRST(s1) THEN code for parsing s1ELSIF currtok ∈ FIRST(s2) THEN code for parsing s2ELSE syntaxerror; ENDIFB();WHILE curr tok ∈ FIRST(r) DO code for parsing r ENDDOEND P;E → T E′E′→ +T E′| ǫT → F T′T′→ *F T′| ǫF → ( E ) | id*( )id+T E′F T′EE :E′:T :T′:F :ǫT E′F T′ǫ+( )id*E′:F T′T E′E :TǫǫEF :T′:ǫǫFT :( )id+*FF T′T E′E :EF :ǫTT :E′:T′:ǫ( )id+*FEF :ǫTǫFT :E :T( )id+*ǫEF :T :E :TFǫ+ǫE :TPROCEDURE E();LOOPT();IF curtok = + THENcur tok := next token()ELSE EXIT ENDIFENDLOOP;*ǫT :FPROCEDURE T();LOOPF();IF curtok = * THENcurtok := next token()ELSE EXIT ENDIFENDLOOP;( )idF :EPROCEDURE F();IF curtok = ( THENcurtok := next token();E();IF curtok = ) THENcurtok := next token()ELSE syntax error ENDIFELSIF cur tok = id THENcur tok := next token();ELSE syntax error ENDIFParsing ExpressionsHere’s the parser coded in Java.class ExprH {static String[] input = {"i","+","i","*","i","$"};static int current = 0;static boolean lookahead(String S) {return input[current].equals(S);}static void match(String S) throws Exception {if (input[current].equals(S)) current++;else throw new Exception();}static void E() throws Exception {T(); while(lookahead("+")) {match("+"); T();}}static void T() throws Exception {F(); while(lookahead("*")) {match("*"); F();}}static void F() throws Exception {if (lookahead("(")) {match("("); E(); match(")");} else match("i");}public static void main(String[] args) {try {E(); System.out.println("true");} catch (Exception e){System.out.println("false");}}Readings and ReferencesRead Louden, pp. 143–196.Or, the Dragon Book:Top-Down Parsing 181–190Error Recovery

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-27-28-29 out of 29 pages.

UA CSC 453 - Top-Down Parsing

Sign up for free to view:

Please select your school