DOC PREVIEW
Berkeley COMPSCI 164 - Building a Parser I

This preview shows page 1 out of 4 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1Prof. Bodik CS 164 Lecture 5 1Building a Parser ICS1643:30-5:00 TT10 EvansProf. Bodik CS 164 Lecture 52PA2• in PA2, you’ll work in pairs, no exceptions– except the exception if odd # of students• hate team projects? form a “coalition team”– team members work alone, but• discuss design, clarify the handout, keep a common eye on the newsgroup, etc• share some or all code, at the very least their test cases!– a win-win proposition:• work mainly alone but hedge your grade• each member submits his/her project, graded separately• score: the lower-scoring team member gets a bonus equal to half the difference between his and his partner’s scoreProf. Bodik CS 164 Lecture 53Administrativia• Section room change– 3113 Etcheverry moving next door, to 3111 Etch.– starting 9/22.Prof. Bodik CS 164 Lecture 54Overview• What does a parser do, again?– its two tasks– parse tree vs. AST• A hand-written parser– and why it gets hard to get it rightWhat does a parser do?Prof. Bodik CS 164 Lecture 56Recall: The Structure of a Compilerscannerparsercheckercode genDecaf program (stream of characters)stream of tokensAbstract Syntax Tree (AST) AST with annotations (types, declarations)maybe x862Prof. Bodik CS 164 Lecture 57Recall: Syntactic Analysis• Input: sequence of tokens from scanner• Output: abstract syntax tree• Actually,– parser first builds a parse tree– AST is then built by translating the parse tree– parse tree rarely built explicitly; only determined by, say, how parser pushes stuff to stack– our lectures first focus on constructing the parse tree; later we’ll show the translation to AST.Prof. Bodik CS 164 Lecture 58Example•Decaf4*(2+3)•Parser inputNUM(4) TIMES LPAR NUM(2) PLUS NUM(3) RPAR• Parser output (AST):*NUM(4)+NUM(2) NUM(3)Prof. Bodik CS 164 Lecture 59Parse tree for the exampleleaves are tokensNUM(4) TIMES LPAR NUM(2) PLUS NUM(3) RPAREXPR EXPR EXPR Prof. Bodik CS 164 Lecture 510Another example•Decafif (x == y) { a=1; }•Parser inputIF LPAR ID EQ ID RPAR LBR ID AS INT SEMI RBR• Parser output (AST):IF-THEN==IDID=ID INTProf. Bodik CS 164 Lecture 511Parse tree for the exampleIF LPAR ID == ID RPAR LBR ID = INT SEMI RBREXPR EXPRSTMTBLOCKSTMTleaves are tokensProf. Bodik CS 164 Lecture 512Parse tree vs. abstract syntax tree•Parse tree– contains all tokens, including those that parser needs “only” to discover• intended nesting: parentheses, curly braces• statement termination: semicolons– technically, parse tree shows concrete syntax• Abstract syntax tree (AST)– abstracts away artifacts of parsing, by flattening tree hierarchies, dropping tokens, etc.– technically, AST shows abstract syntax3Prof. Bodik CS 164 Lecture 513Comparison with Lexical AnalysisAST, built from parse treeSequence of tokensParserSequence of tokensSequence of charactersLexerOutputInputPhaseProf. Bodik CS 164 Lecture 514Summary• Parser performs two tasks:–syntax checking• a program with a syntax error may produce an AST that’s different than intended by the programmer– parse tree construction• usually implicit • used to build the ASTHow to build a parser for Decaf?Prof. Bodik CS 164 Lecture 516Writing the parser• Can do it all by hand, of course– ok for small languages, but hard for Decaf• Just like with the scanner, we’ll write ourselves a parser generator– we’ll concisely describe Decaf’s syntactic structure• that is, how expressions, statements, definitions look like– and the generator produces a working parser• Let’s start with a hand-written parser– to see why we want a parser generatorProf. Bodik CS 164 Lecture 517First example: balanced parens• Our problem: check the syntax– are parentheses in input string balanced?• The simple language– parenthesized number literals– Ex.: 3, (4), ((1)), (((2))), etc• Before we look at the parser– why aren’t finite automata sufficient for this task?Prof. Bodik CS 164 Lecture 518Why can’t DFA/NFA’s find syntax errors?• When checking balanced parentheses, FA’s can either– accept all correct (i.e., balanced) programs but also some incorrect ones, or– reject all incorrect programs but also reject some correct ones.• Problem: finite state– can’t count parens seen so far())()())())4Prof. Bodik CS 164 Lecture 519Parser code preliminaries•Let TOKEN be an enumeration type of tokens:– INT, OPEN, CLOSE, PLUS, TIMES, NUM, LPAR, RPAR• Let the global in[] be the input string of tokens• Let the global next be an index in the token stringProf. Bodik CS 164 Lecture 520Parsers use stack to implement infinite stateBalanced parentheses parser:void Parse() {nextToken = in[next++];if (nextToken == NUM) return;if (nextToken != LPAR) print(“syntax error”);Parse();if (in[next++] != RPAR) print(“syntax error”);}Prof. Bodik CS 164 Lecture 521Where’s the parse tree constructed?• In this parser, the parse is given by the call tree:• For the input string (((1))) :Parse()Parse()Parse()Parse()1((()))Prof. Bodik CS 164 Lecture 522Second example: subtraction expressionsThe language of this example: 1, 1-2, 1-2-3, (1-2)-3, (2-(3-4)), etcvoid Parse() {if (in[++next] == NUM) {if (in[++next] == MINUS) { Parse(); }} else if (in[next] == LPAR) {Parse();if (in[++next] != RPAR) print(“syntax error”);} else print(“syntax error”);}Prof. Bodik CS 164 Lecture 523Subtraction expressions continued•Observations:– a more complex language• hence, harder to see how the parser works (and if it works correctly at all)– the parse tree is actually not really what we want• consider input 3-2-1• what’s undesirable about this parse tree’s structure?-3Parse()-12Parse()Parse()Prof. Bodik CS 164 Lecture 524We need a clean syntactic description• Just like with the scanner, writing the parser by hand is painful and error-prone– consider adding +, *, / to the last example!• So, let’s separate the what and the how–what:the syntactic structure, described with a context-free grammar–how:the parser, which reads the grammar, the input and produces the parse


View Full Document

Berkeley COMPSCI 164 - Building a Parser I

Documents in this Course
Lecture 8

Lecture 8

40 pages

Load more
Download Building a Parser I
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Building a Parser I and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Building a Parser I 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?