U of I CS 421 - Bottom-up Parsing - D2260545

Home> Schools> University of Illinois> Computer Science (CS) > CS 421> Bottom-up Parsing

DOC PREVIEW

U of I CS 421 - Bottom-up Parsing

School name University of Illinois

Course Cs 421- Natural Language Processing

Pages 27

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CS 421 Lecture 9: Bottom-up Parsing Announcements OCaml self-help hints Lecture outline Bottom-up parsing ocamlyacc6/18/20091Announcements MP4 has been posted MiniJava lexer Reminder: midterm exam date – Thursday, July 26/18/20092OCaml self-help hints Consult the CS 421 resource guide: http://www.cs.uiuc.edu/class/su09/cs421/ Use “Tips for using OCaml top level” to speed up working with the interactive environment Consult the OCaml manual when you want a definitive answer about something May be technical, not “user-friendly” Ask on the newsgroup If you are having a problem, it’s likely somebody has run into it already, or they will in the future. Ask Google It probably knows…6/18/20093OCaml self-help hints Be careful about Data types, and type inference Operator precedence Common OCaml error messages: syntax error (underlined) unbound value use (underlined) Pattern matching is not exhaustive. Here is a counterexample: … This expression has type <type1> but is here used with <type2> Watch out especially for “unit” <whatever error> in <file>.ml at line <line> characters <chars>6/18/20094Top-down vs. bottom-up parsing Why is top-down called “top-down?” As we consume tokens, we build a parse tree. At any one time, we are filling in the children of a particular non-terminal.As soon as we decide which production to use, we can fill in the tree. In this sense, we are building the tree from the top (root) down(to the leaves). Nature and Computer Science disagree on this point6/18/20095Top-down parsing Example: Input: x + y * z E → id T T → ε | + E | * E6/18/20096Bottom-up parsing Works by creating small parse trees and joining them together into larger ones. Example: Input: x + y * z E → id T T → ε | + E | * EStart constructing trees, put them on stack: Construct tree x: {x} Add tree +: {x, +} Add tree y: {x, +, y} Add tree *: {x, +, y, *} Add tree z: {x, +, y, *, z}6/18/20097Bottom-up parsing (cont) Construct parse tree by merging: {x, +, y, *, z} Apply T → ε {x, +, y, *, z, T → ε} …6/18/20098How bottom-up parsing works Keep a stack of small parse trees. Based on what’s in this stack, and the next input token, take one of these actions: Shift: move lookahead token to stack Reduce A → α: if roots of trees on stack match α, replace those trees on stack by single tree with root A Accept: reduce when non-terminal is the start symbol, look-ahead is EOF Reject Bottom-up parsing is also called shift-reduce parsing6/18/20099Shift-reduce example 1 Example: Input: x ; y ; z L → L ; E | E E → idAction Stack InputS x ; y ; zR E→idx ; y ; z…6/18/200910Shift-reduce example 1 Example: L → L ; E | E E → idAction Stack Input6/18/200911Shift-reduce example 2 Example: Input: x + 10 * y E → E + T | T T → T * P | P P → id | intAction Stack InputS x + 10 * yR P→idx + 10 * y…6/18/200912Shift-reduce example 2 Example: E → E + T | T T → T * P | P P → id | intAction Stack Input6/18/200913Bottom-up parsing This is hard! How can we build a parser that works like this? Shift-reduce parsing is not usually done “by hand” Automated parser generator tools Generate parser code based on grammar specification Similar to ocamllex and regular expressions for lexing Ocaml’s parser generator is called ocamlyacc “yet another compiler-compiler”6/18/200914Using ocamlyacc Create grammar specification in a text file <grammar>.mly Execute ocamlyacc <grammar>.mly Produces code for parser in <grammar>.ml interface (including type declaration for tokens) in <grammar.mli>6/18/200915Parser code <grammar>.ml defines one parsing function per entry point Parsing function takes a lexing function (lexbuf -> token) and a lexbuf as arguments Aside: we’ll see more functions being passed around as arguments soon… Returns semantic attributeof corresponding entry point6/18/200916Example – expression grammar We will take a simple expression grammar and create a parser to parse inputs and produce abstract syntax Grammar: M → Exp eof Exp → Term | Term + Exp | Term - Exp Term → Factor | Factor * Term | Factor / Term Factor → id | ( Exp ) Abstract syntax(* file: expr.ml *)type expr =Plus of expr * expr| Minus of expr * expr| Mult of expr * expr| Div of expr * expr| Id of string6/18/200917Example – lexer(* file: exprlex.mll *)let numeric = [‘0’ – ‘9’]let letter = [‘a’ – ‘z’ ‘A’ – ‘Z’]rule tokenize = parse| “+” {Plus_token}| “-” {Minus_token}| “*” {Times_token}| “/” {Divide_token}| “(” {Left_parenthesis}| “)” {Right_parenthesis}| letter (letter | numeric | “_”)* as id {Id_token id}| [‘ ’ ‘\t’ ‘\n’] {tokenize lexbuf}| eof {EOL}6/18/200918Example – parser(* file: exprparse.mly *)%{ open Expr%}%token <string> Id_token%token Left_parenthesis Right_parenthesis%token Times_token Divide_token%token Plus_token Minus_token%token EOL%start main%type <expr> main%%...6/18/200919Example – parser (exprparse.mly)expr:term {$1}| term Plus_token expr {Plus($1,$3)}| term Minus_token expr {Minus($1,$3)}term:factor {$1}| factor Times_token term {Mult($1,$3)}| factor Divide_token term {Div($1,$3)}factor:Id_token {Id $1}| Left_parenthesis expr Right_parenthesis {$2}main:| expr EOL {$1}6/18/200920Example – using parser# #use “expr.ml”;;…# #use “expparse.ml”;;…# #use “exprlex.ml”;;…# let test s =let lexbuf = Lexing.from_string(s^”\n”) inmain tokenize lexbuf;;…# test “a + b”;;- : expr = Plus(Id “a”, Id “b”) 6/18/200921ocamlyacc input File format:%{ <header>%}<declarations>%%<rules>%%<trailer>6/18/200922ocamlyacc <header> Contains arbitrary OCaml code Typically used to give types and functions needed for the semantic actions of rules and to give specialized error recovery May be omitted <footer> is similar. Possibly used to call parser.6/18/200923ocamlyacc <declarations>%token symbol…symbol Declare given symbols as tokens%token <type>

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-25-26-27 out of 27 pages.

U of I CS 421 - Bottom-up Parsing

Sign up for free to view:

Please select your school