Parsing — Part II(Top-down parsing, left-recursion removal)Parsing TechniquesTop-down parsers (LL(1), recursive descent)•Start at the root of the parse tree and grow toward leaves•Pick a production & try to match the input•Bad “pick” ⇒ may need to backtrack•Some grammars are backtrack-free (predictive parsing)Bottom-up parsers (LR(1), operator precedence)•Start at the leaves and grow toward root•As input is consumed, encode possibilities in an internal state•Start in a state valid for legal first tokens•Bottom-up parsers handle a large class of grammarsA top-down parser starts with the root of the parse treeThe root node is labeled with the goal symbol of the grammarTop-down parsing algorithm:Construct the root node of the parse tree Repeat until the fringe of the parse tree matches the input string1At a node labeled A, select a production with A on its lhs and, for each symbol on its rhs, construct the appropriate child2When a terminal symbol is added to the fringe and it doesn’t match the fringe, backtrack3Find the next node to be expanded (label ∈ NT)•The key is picking the right production in step 1→That choice should be guided by the input stringTop-down ParsingRemember the expression grammar?And the input x – 2 * y Version with precedence derived last lecture1 Goal → Expr 2 Expr → Expr + Term 3 | Expr – Term 4 | Term 5 Term → Term * Factor 6 | Term / Factor 7 | Factor 8 Factor → number 9 | idLet’s try x – 2 * y :ExampleGoalExprTerm+ExprTermFact.<id,x>Leftmost derivation, choose productions in an order that exposes problemsRule Sentential Form Input — Goal ↑x – 2 * y 1 Expr ↑x – 2 * y 2 Expr + Term ↑x – 2 * y 4 Term + Term ↑x – 2 * y 7 Factor + Term ↑x – 2 * y 9 <id,x> + Term ↑x – 2 * y 9 <id,x> + Term x ↑– 2 * yLet’s try x – 2 * y :This worked well, except that “–” doesn’t match “+”The parser must backtrack to hereExampleGoalExprTerm+ExprTermFact.<id,x>Rule Sentential Form Input — Goal ↑x – 2 * y 1 Expr ↑x – 2 * y 2 Expr + Term ↑x – 2 * y 4 Term + Term ↑x – 2 * y 7 Factor + Term ↑x – 2 * y 9 <id,x> + Term ↑x – 2 * y 9 <id,x> + Term x ↑– 2 * yExampleContinuing with x – 2 * y :GoalExprTerm–ExprTermFact.<id,x>Rule Sentential Form Input — Goal ↑x – 2 * y 1 Expr ↑x – 2 * y 3 Expr – Term ↑x – 2 * y 4 Term – Term ↑x – 2 * y 7 Factor – Term ↑x – 2 * y 9 <id,x> – Term ↑x – 2 * y 9 <id,x> – Term x ↑– 2 * y — <id,x> – Term x – ↑2 * yExampleContinuing with x – 2 * y :GoalExprTerm–ExprTermFact.<id,x>We can advance past “–” to look at “2”This time, “–” and “–” matched⇒ Now, we need to expand Term - the last NT on the fringeRule Sentential Form Input — Goal ↑x – 2 * y 1 Expr ↑x – 2 * y 3 Expr – Term ↑x – 2 * y 4 Term – Term ↑x – 2 * y 7 Factor – Term ↑x – 2 * y 9 <id,x> – Term ↑x – 2 * y 9 <id,x> – Term x ↑– 2 * y — <id,x> – Term x – ↑2 * yExampleTrying to match the “2” in x – 2 * y :GoalExprTerm–ExprTermFact.<id,x>Fact.<num,2>Rule Sentential Form Input — <id,x> – Term x – ↑2 * y 7 <id,x> – Factor x – ↑2 * y 9 <id,x> – <num,2> x – ↑2 * y — <id,x> – <num,2> x – 2 ↑* yExampleTrying to match the “2” in x – 2 * y :Where are we?•“2” matches “2”•We have more input, but no NTs left to expand•The expansion terminated too soon⇒Need to backtrackGoalExprTerm-ExprTermFact.<id,x>Fact.<num,2>Rule Sentential Form Input — <id,x> – Term x – ↑2 * y 7 <id,x> – Factor x – ↑2 * y 9 <id,x> – <num,2> x – ↑2 * y — <id,x> – <num,2> x – 2 ↑* yExampleTrying again with “2” in x – 2 * y :This time, we matched & consumed all the input⇒Success!GoalExprTerm–ExprTermFact.<id,x>Fact.<id,y>TermFact.<num,2>*Rule Sentential Form Input — <id,x> – Term x – ↑2 * y 5 <id,x> – Term * Factor x – ↑2 * y 7 <id,x> – Factor * Factor x – ↑2 * y 8 <id,x> – <num,2> * Factor x – ↑2 * y — <id,x> – <num,2> * Factor x – 2 ↑* y — <id,x> – <num,2> * Factor x – 2 * ↑y 9 <id,x> – <num,2> * <id,y> x – 2 * ↑y — <id,x> – <num,2> * <id,y> x – 2 * y↑Other choices for expansion are possibleThis doesn’t terminate (obviously)•Wrong choice of expansion leads to non-termination•Non-termination is a bad property for a parser to have•Parser must make the right choiceAnother possible parseconsuming no input !Rule Sentential Form Input — Goal ↑x – 2 * y 1 Expr ↑x – 2 * y 2 Expr + Term ↑x – 2 * y 2 Expr + Term +Term ↑x – 2 * y 2 Expr + Term + Term +Term ↑x – 2 * y 2 Expr +Term + Term + …+Term ↑x – 2 * yLeft RecursionTop-down parsers cannot handle left-recursive grammarsFormally,A grammar is left recursive if ∃ A ∈ NT such that ∃ a derivation A ⇒+ Aα, for some string α ∈ (NT ∪ T )+Our expression grammar is left recursive•This can lead to non-termination in a top-down parser•For a top-down parser, any recursion must be right recursion•We would like to convert the left recursion to right recursionNon-termination is a bad property in any part of a compilerEliminating Left RecursionTo remove left recursion, we can transform the grammarConsider a grammar fragment of the formFee → Fee α | βwhere neither α nor β start with FeeWe can rewrite this as Fee → β FieFie → α Fie | εwhere Fie is a new non-terminalThis accepts the same language, but uses only right recursionEliminating Left RecursionThe expression grammar contains two cases of left recursionApplying the transformation yieldsThese fragments use only right recursion They retain the original left associativityExpr → Expr + Term | Expr – Term | Term Term → Term * Factor | Term / Factor | Factor Expr → Term Expr′ Expr′ | + Term Expr′ | – Term Expr′ | ε Term → Factor Term′ Term′ | * Factor Term′ | / Factor Term′ | εEliminating Left RecursionSubstituting them back into the grammar yields• This grammar is correct, if somewhat non-intuitive.• It is left associative, as was the original• A top-down parser will terminate using it.• A top-down parser may need to backtrack with it.1 Goal → Expr 2 Expr → Term
View Full Document