1Bottom-Up Parsing Algorithms LR(k) parsing L: scan input Left to right R: produce Rightmost derivation k tokens of lookahead LR(0) zero tokens of look-ahead SLR Simple LR: like LR(0), but uses FOLLOW sets to build more “precise” parsing tables LR(0) is a toy, so we focus on SLR Reading: Section 4.7Problem: when to shift, when to reduce? Recall our favorite grammar:E → T + E | TT → int * T | int | (E) The stepT * int + int → int * int + intis not part of any rightmost derivation Hence, reducing first int to T was a mistakeHow to know when to reduce and when to shift?What we need for LR parsing LR(0) states describe states in which the parser can be Note: LR(0) states are used by both LR(0) and SLR parsers Parsing tables transitions between LR(0) states, actions to take when transiting: shift, reduce, accept, error How to construct LR(0) states? How to construct parsing tables? How to drive the parser?LR(0) state = set of LR(0) items An LR(0) item [X → α . β] says that the parser is looking for an X it has an α on top of the stack expects to find input string derived from β Notes: [X → α . aβ] means that if a is on the input, it can be shifted (resulting in αa. β). That is:ais a correct token to see on the input, and shifting awould not “over-shift” (still a viable prefix). [X →α.] means that we could reduce α to XT → int. * TT → int.5LR(0) statesET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11TNaïve SLR Parsing Algorithm1. Let M be LR(0) state machine for G• each state contains a set I of LR(0) items2. Let |x1…xn$ be initial configuration3. Repeat until configuration is S|$• Let α|ω be current configuration• Run M on current stack α• If M rejects α, report parsing error• If M accepts α, let a be next input Shift if [X →β. a γ ] ∈ Items Reduce if [X →β.] ∈Items and a ∈ Follow(α)... β | a ... Æ ... | Xa ... Report parsing error if neither applies2Notes If there is a conflict in the last step, grammar is not SLR(k) k is the amount of lookahead In practice k = 1T → int. * TT → int.5LR(0) statesET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11TSLR Example1| int * int $ActionDFA Halt StateConfigurationT → int. * TT → int.5Configuration| int*int$ET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11TSLR Example5int | * int $shift1| int * int $ActionDFA Halt StateConfigurationT → int. * TT → int.5Configurationint | *int$ET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11T3T → int. * TT → int.5Configurationint | *int$ET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11TSLR Example8int * | int $shift5 * not in Follow(T)int | * int $shift1| int * int $ActionDFA Halt StateConfigurationT → int. * TT → int.5Configurationint * | int $ET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11TT → int. * TT → int.5Configurationint * | int $ET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T → int * T. 11TT → int. * TT → int.5Configurationint * | int $ET(intint*)EETint((intT+(S’ → . EE → . TE → .T + ET → .(E)T → .int * TT → .int1S’ → E . 2 E → T.E → T. + E3T → (. E)E → .TE → .T + ET → .(E)T → .int * TT → .int4E → T + . EE → .TE → .T + ET → .(E)T → .int * TT → .int67T → (E.)8T → int * .TT → .(E)T → .int * TT → .int9E → T + E.T → (E). 10T …
View Full Document