Unformatted text preview:

MIT 6 035 Top Down Parsing Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Orientation Language specification Lexical structure regular expressions Syntactic structure grammar This Lecture recursive descent parsers Code parser as set of mutually recursive procedures Structure of program matches structure of grammar Starting Point Assume lexical analysis has produced a sequence of tokens Each token has a type and value Types correspond to terminals Values to contents of token read in Examples Int 549 integer token with value 549 read in if if keyword no need for a value AddOp add operator value Basic Approach Start with Start symbol Build a leftmost derivation If leftmost symbol is nonterminal choose a production and apply it If leftmost symbol is terminal match against input If all terminals match have found a parse Key find correct productions for nonterminals Graphical Illustration of Leftmost Derivation Sentential Form NT1 T1 T2 T3 NT2 NT3 Apply Production Here Not Here Grammar for Parsing Example Start Expr Expr Expr Term Expr Expr Term Expr Term Term Term Int Term Term Int Term Int Set of tokens is Int where Int 0 9 0 9 For convenience may represent each Int n token by n Parsing Example Parse Tree Start Remaining Input 2 2 2 Sentential Form Start Current Position in Parse Tree Parsing Example Parse Tree Start Expr Remaining Input 2 2 2 Sentential Form Expr Applied Production Current Position in Parse Tree Start Expr Parsing Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Sentential Form Term Expr Expr Term Expr Expr Term Expr Term Expr Term Applied Production Expr Expr Term Parsing Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Term Sentential Form Term Expr Expr Term Expr Expr Term Expr Term Term Term Applied Production Expr Term Parsing Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Term Int Sentential Form Term Int Term Applied Production Term Int Parsing Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Remaining Input 2 2 2 Sentential Form Term 2 Term Parsing Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Remaining Input 2 2 Sentential Form Term 2 Term Parsing Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Remaining Input 2 2 Sentential Form Term 2 Term Parsing Example Parse Tree Remaining Input Start 2 2 Expr Expr Term Int 2 Term Sentential Form Term 2 Term Int Int Applied Production Term Term Int Parsing Example Parse Tree Remaining Input Start 2 2 Expr Expr Term Int 2 Term Int Sentential Form Term 2 Int Int Int Applied Production Term Int Parsing Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Term Int 2 Remaining Input 2 2 Sentential Form Term 2 2 Int Int Parsing Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Term Int 2 Remaining Input 2 Sentential Form Term 2 2 Int Int Parsing Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Term Int 2 Remaining Input 2 Sentential Form Term 2 2 Int Int Parsing Example Parse Tree Parse Complete Start Expr Expr Term Int 2 Term Int 2 Remaining Input 2 Sentential Form Term 2 2 2 Int 2 Summary Three Actions Mechanisms Apply production to expand current nonterminal in parse tree Match current terminal consuming input Accept the parse as correct Parser generates preorder traversal of parse tree visit parents before children visit siblings from left to right Policy Problem Which production to use for each nonterminal Classical Separation of Policy and Mechanism One Approach Backtracking Treat it as a search problem At each choice point try next alternative If it is clear that current try fails go back to previous choice and try something different General technique for searching Used a lot in classical AI and natural language processing parsing speech recognition Backtracking Example Parse Tree Remaining Input Start 2 2 2 Sentential Form Start Backtracking Example Parse Tree Remaining Input Start Expr 2 2 2 Sentential Form Expr Applied Production Start Expr Backtracking Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Sentential Form Term Expr Term Applied Production Expr Expr Term Backtracking Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Term Sentential Form Term Term Term Applied Production Expr Term Backtracking Example Parse Tree Match Input Token Start Expr Expr Term Int Term Remaining Input 2 2 2 Sentential Form Int Term Applied Production Term Int Backtracking Example Parse Tree Can t Match Input Token Start Expr Expr Term Int 2 Term Remaining Input 2 2 Sentential Form 2 Term Applied Production Term Int Backtracking Example Parse Tree Start Expr So Backtrack Remaining Input 2 2 2 Sentential Form Expr Applied Production Start Expr Backtracking Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Sentential Form Term Expr Term Applied Production Expr Expr Term Backtracking Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Term Sentential Form Term Term Term Applied Production Expr Term Backtracking Example Parse Tree Remaining Input Start 2 2 2 Expr Expr Term Int Sentential Form Term Int Term Applied Production Term Int Backtracking Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Term Remaining Input 2 2 Sentential Form 2 Term Backtracking Example Parse Tree Match Input Token Start Expr Expr Term Int 2 Term Remaining Input 2 2 Sentential Form 2 Term Left Recursion Top Down Parsing Infinite Loop Example Production Term Term Num Potential parsing steps Term Term Term Term Num Term Term Num Num General Search Issues Three components Search space parse trees Search algorithm parsing algorithm Goal to find parse tree for input program Would like to but can t always ensure that Find goal hopefully quickly if it exists Search terminates if it does not Handled in various ways in various contexts Finite search space makes it easy Exploration strategies for infinite search space Sometimes one goal more important model checking For parsing hack grammar to remove left recursion Eliminating Left Recursion Start with productions of form A A A sequences of terminals and nonterminals that do not start with A Repeated application of A A A builds parse tree like this A A Eliminating Left Recursion Replacement productions A A A A R R R R Old Parse Tree A A R is a new nonterminal New Parse Tree A R R R Hacked Grammar Original Grammar Fragment Term Term Int Term Term Int Term Int New Grammar Fragment Term Int Term Term Int Term Term Int Term Term


View Full Document

MIT 6 035 - Top-Down Parsing

Download Top-Down Parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Top-Down Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Top-Down Parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?