ParsingThe Job of a ParserProblems with Solutions So FarEasy IssuesDividing the ProcessLexical AnalysisSpecifying id with a GrammarUsing Reg Ex’s to Specify an FSMTop-Down, Depth-First ParsingSlide 10Slide 11Slide 12Slide 13Slide 14Slide 15Slide 16Slide 17Left-Recursive RulesIndirect Left RecursionUsing Lookahead and Left FactoringLL(k) GrammarsRecursive Descent ParsingLR(k) GrammarsSlide 24ParsingChapter 15The Job of a Parser•Examine a string and decide whether or not it is a syntactically well-formed member of L(G), and•If it is, assign to it a parse tree that describes its structure and thus can be used as the basis for further interpretation.Given a context-free grammar G:Problems with Solutions So Far•We want to use a natural grammar that will produce a natural parse tree. But: •decideCFLusingGrammar, requires a grammar that is in Chomsky normal form. •decideCFLusingPDA, requires a grammar that is in Greibach normal form. •We want an efficient parser. But both procedures require search and take time that grows exponentially in the length of the input string. •All either procedure does is to determine membership in L(G). It does not produce parse trees.Easy Issues•Actually building parse trees: Augment the parser with a function that builds a chunk of tree every time a rule is applied.•Using lookahead to reduce nondeterminism: It is often possible to reduce (or even eliminate) nondeterminism by allowing the parser to look ahead at the next one or more input symbols before it makes a decision about what to do.Dividing the Process•Lexical analysis: done in linear time with a DFSM•Parsing: done in, at worst O(n3) time.Lexical Analysislevel = observation - 17.5; Lexical analysis produces a stream of tokens: id = id - idSpecifying id with a Grammarid identifier | integer | float identifier letter alphanum alphanum letter alphnum | digit alphnum | integer - unsignedint | unsignedint unsignedint digit | digit unsignedint digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 ….Using Reg Ex’s to Specify an FSMThere exist simple tools for building lexical analyzers. The first important such tool: LexTop-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $ FailTop-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $ Backup to:Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $Top-Down, Depth-First ParsingS NP VP $NP the N | N | ProperNounN cat | dogs | bear | girl | chocolate | rifle ProperNoun Chris | FluffyVP V | V NPV like | likes | thinks | shot | smellsInput: the cat likes chocolate $ Built, unbuilt, built againLeft-Recursive RulesE E + TE TT T FT FF (E) F id On input: id + id + id :Then:And so forth.Indirect Left RecursionS YaY Sa Y This form too can be eliminated.Using Lookahead and Left Factoring•Change the parsing algorithm so that it exploits the ability to look one symbol ahead in the input before it makes a decision about what to do next, and•Change the grammar to help the parser procrastinate decisions.Goal: Procrastinate branching as long as possible. To do that, we will:LL(k) GrammarsAn LL(k) grammar allows a predictive parser:• that scans its input Left to right • to build a Left-most derivation • if it is allowed k lookahead symbols. Every LL(k) grammar is unambiguous (because every string it generates has a unique left-most derivation). But not every unambiguous grammar is LL(k).Recursive Descent ParsingA BA | aB bB | bA(n: parse tree node labeled A) = case (lookahead = b : /* Use A BA. Invoke B on a new daughter node labeled B.Invoke A on a new daughter node labeled A. lookahead = a : /* Use A a. Create a new daughter node labeled a.LR(k) GrammarsG is LR( k), for any positive integer k, iff it is possible to build a deterministic parser for G that:• scans its input Left to right and, • for any input string in L(G), builds a Rightmost derivation, • looking ahead at most k symbols. A language is LR(k) iff there is an LR(k) grammar for it.LR(k) Grammars•The class of LR(k) languages is exactly the class of deterministic context-free languages.•If a language is LR(k), for some k, then it is also
View Full Document