Unformatted text preview:

CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 1Chapter 4dChapter 4dBottom UpParsingCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 2Right Sentential Forms• Recall the definition of a derivation and a rightmost derivation.• Each of the lines is a (right) sentential form• The parsing problem is finding the correct RHS in a right-sentential form to reduce to get the previous right-sentential form in the derivation1 E -> E+T2 E -> T3 T -> T*F4 E -> F5 F -> (E)6 F -> idEE+TE+T*FE+T*idE+F*idE+id*idT+id*idF+id*idid+id*idgenerationparsingCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 3Bottom up parsing• A bottom up parser looks at a sentential form and selects a contiguous sequence of symbols that matches the RHS of a grammar rule, and replaces it with the LHS• There might be several choices, as in the sentential form E+T*F• Which one should we choose?EE+TE+T*FE+T*idE+F*idE+id*idT+id*idF+id*idid+id*id1 E -> E+T2 E -> T3 T -> T*F4 E -> F5 F -> (E)6 F -> idE + T * F1 32 4parsingCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 4Bottom up parsing• If the wrong one is chosen, it leads to failure.• E.g.: replacing E+T with E in E+T*F yields E+F, which can not be further reduced using the given grammar.•We’ll define the handle of a sentential form as the RHS that should be rewritten to yield the next sentential form in the right most derivation.errorE*FE+T*FE+T*idE+F*idE+id*idT+id*idF+id*idid+id*id1 E -> E+T2 E -> T3 T -> T*F4 E -> F5 F -> (E)6 F -> idparsingCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 5Sentential forms• Think of a sentential form as one of the entries in a derivation that begins with the start symbol and ends with a legal sentence.• So, it’s like a sentence but it may have some “unexpanded” non-terminals.• We can also think of it as a parse tree where some of the leaves are as yet unexpanded non-terminals.1 E -> E+T2 E -> T3 T -> T*F4 E -> F5 F -> (E)6 F -> idEE+TE+T*FE+T*idE+F*idE+id*idT+id*idF+id*idid+id*idgenerationparsingE+T*idFTECMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 6Handles• A handle of a sentential form is a substring α such that :– a matches the RHS of a production A -> α ; and– replacing α by the LHS A represents a step in thereverse of a rightmost derivation of s.• For this grammar, the rightmostderivation for the input abbcde isS => aABe => aAde => aAbcde => abbcde• The string aAbcde can be reduced in two ways:(1) aAbcde => aAde (using rule 2)(2) aAbcde => aAbcBe (using rule 4)• But (2) isn’t a rightmost derivation, so Abc is the only handle.• Note: the string to the right of a handle will only contain terminals (why?)1: S -> aABe2: A -> Abc 3: A -> b4: B -> da A b c d eCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 7Phrases•A phrase is a subsequence of a sentential form that is eventually “reduced” to a single non-terminal.•A simple phrase is a phrase that is reduced in a single step.• The handle is the left-most simple phrase.E+T*idFTEFor this sentential form what are the• phrases• simple phrases• handleCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 8Phrases, simple phrases and handles• Def: β is the handle of the right sentential form γ = αβw if and only if S =>*rm αAw => αβw• Def: β is a phrase of the right sentential form γ if and only if S =>* γ = α1Aα2 =>+ α1βα2• Def: β is a simple phrase of the right sentential form γif and only if S =>* γ = α1Aα2 => α1βα2• The handle of a right sentential form is its leftmost simple phrase• Given a parse tree, it is now easy to find the handle• Parsing can be thought of as handle pruningCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 9Phrases, simple phrases and handlesEE+TE+T*FE+T*idE+F*idE+id*idT+id*idF+id*idid+id*idE -> E+TE -> TT -> T*FE -> FF -> (E)F -> idCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 10On to parsing• Ok, so how do we manage when we don’t have the parse tree in front of us?• We’ll look at a shift-reduce parser, of the kind that yacc uses.• A shift-reduce parser has a queue of input tokens and an initially empty stack and takes one of four possible actions:– Accept: if the input queue is empty and the start symbol is the only thing on the stack.– Reduce: if there is a handle on the top of the stack, pop it off and replace it with the RHS– Shift: push the next input token onto the stack– Fail: if the input is empty and we can’t accept.• In general, we might have a choice of doing a shift or a reduce,or maybe in reducing using one of several rules.• The algorithm we next describe is deterministic.CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 11Shift-Reduce AlgorithmsA shift-reduce parser scans input, at each step, considers whether to:• Shift the next token to the top of the parse stack (along with some state info)• Reduce the stack by POPing several symbols off the stack (& their state info) and PUSHing the corresponding nonterminal (& state info)CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 12Shift-Reduce AlgorithmsThe stack is always of the formS1 X1 S2 X2 …Sn Xnbottomtopstate terminal ornon-terminal• A reduction step is triggered when we see the symbols corresponding to a rule’s RHS on the top of the stackS1 X1 …S5 X5 S6 T S7 * S8 FbottomtopT -> T*FS1 X1 …S5 X5 S6’ TCMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 13LR parser table• LR shift-reduce parsers can be efficiently implemented by precomputing a table to guide the processingMore on this Later . . . CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 14When to shift, when to reduce• The key problem in building a shift-reduce parser is deciding whether to shift or to reduce.– repeat: reduce if you see a handle on the top of the stack, shift otherwise– Succeed if we stop with only S on the stack and no input• A grammar may not be appropriate for a LR parser because there are conflicts which can not be resolved.• A conflict occurs when the parser cannot decide whether to:– shift or reduce the top of stack (a shift/reduce conflict), or – reduce the top of stack using one of two possible productions (a reduce/reduce conflict)• There are several varieties of LR parsers (LR(0), LR(1), SLR and LALR), with differences


View Full Document

UMBC CMSC 331 - Bottom Up Parsing

Documents in this Course
Semantics

Semantics

14 pages

Java

Java

12 pages

Java

Java

31 pages

V

V

46 pages

Semantics

Semantics

11 pages

Load more
Download Bottom Up Parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Bottom Up Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Bottom Up Parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?