Unformatted text preview:

ASTs Objective CAML and Ocamlyacc Stephen A Edwards Columbia University Fall 2008 Parsing and Syntax Trees Parsing decides if the program is part of the language Not that useful we want more than a yes no answer Like most parsers generated by ocamlyacc can include actions pieces of code that run when a rule is matched Top down parsers actions executed during parsing rules Bottom up parsers actions executed when rule is reduced ocamlyacc Actions Simple languages can be interpreted just with parser actions token PLUS MINUS TIMES DIVIDE token EOF token int LIT open Calc parse rule token parse n token lexbuf PLUS MINUS TIMES DIVIDE 0 9 as s LIT int of string s eof EOF left PLUS MINUS left TIMES DIVIDE start top type int top top expr EOF 1 expr expr expr expr expr LIT PLUS expr MINUS expr TIMES expr DIVIDE expr 1 1 1 1 1 3 3 3 3 let let lexbuf Lexing from channel stdin in print int Calc parse top Calc scan token lexbuf print newline Actions Even in a parser for an interpreter actions usually build a data structure that represents the program Separates parsing from translation Makes modification easier by minimizing interactions Allows parts of the program to be analyzed in different orders Bottom up parsers can only build bottom up data structures Children known first parents later Context of an object only established later What To Build Typically an Abstract Syntax Tree that represents the program Represents the syntax of the program almost exactly but easier for later passes to deal with Punctuation whitespace other irrelevant details omitted Abstract vs Concrete Trees Like scanning and parsing objective is to discard irrelevant details E g comma separated lists are nice syntactically but later stages probably just want lists AST structure almost a direct translation of the grammar Abstract vs Concrete Trees expr mexpr PLUS mexpr mexpr atom TIMES atom atom INT 3 5 4 expr mexpr PLUS mexpr atom atom TIMES atom INT 3 INT 5 INT 4 Plus INT 3 Times INT 5 INT 4 Concrete Parse Tree Abstract Syntax Tree Part I Designing ASTs Designing an AST Structure Sequences of things Removing unnecessary punctuation Additional grouping How to factor One Way to Handle Comma Separated Lists type args Arg of arg Args of args arg type arg args LPAREN arglist RPAREN 2 arglist arglist COMMA arg Args 1 3 arg Arg 1 int gcd int a int b int c Args Args int c Arg int b int a Drawbacks Many unnecessary nodes Branching suggests recursion Harder for later routines to get the data they want A Better Way to Handle Comma Separated Lists Better to choose a simpler structure for the tree use lists type args Args of arg list type arg args LPAREN arglist RPAREN Args List rev 2 arglist arglist COMMA arg 3 1 arg 1 int gcd int a int b int c Args int a int b int c Removing Unnecessary Punctuation Punctuation makes the syntax readable unambiguous Information represented by structure of the AST Things typically omitted from an AST Parentheses Grouping and precedence associativity overrides Separators commas semicolons Mark divisions between phrases Probably want a list of items in the AST Extra keywords while do if then else Just want a While constructor with two children How to factor Two possible ways to represent binary operators type expr Plus of expr expr Minus of expr expr Times of expr expr type binop Plus Minus Times type expr Binop of expr binop expr Each has advantages and disadvantages Main question is how nice the later code looks Is each operator a special case or can you handle them all at once Part II Walking ASTs M C Escher Ascending and Descending 1960 It s easy in O Caml type operator Add Sub Mul Div type expr Binop of expr operator expr Lit of int let rec eval function Lit x x Binop e1 op e2 let v1 eval e1 and v2 eval e2 in match op with Add v1 v2 Sub v1 v2 Mul v1 v2 Div v1 v2 Comments on ASTs Two ways to handle optional clauses type stmt If of expr stmt stmt option let rec eval function If e s1 None If e s1 Some s2 or let rec eval function If e s1 s2 match s2 with None Some s type stmt If of expr stmt IfElse of expr stmt stmt let rec eval function If e s IfElse e s1 s2


View Full Document

Columbia COMS W4115 - ASTs, Objective CAML, and Ocamlyacc

Documents in this Course
YOLT

YOLT

13 pages

Lattakia

Lattakia

15 pages

EasyQL

EasyQL

14 pages

Photogram

Photogram

163 pages

Espresso

Espresso

27 pages

NumLang

NumLang

6 pages

EMPATH

EMPATH

14 pages

La Mesa

La Mesa

9 pages

JTemplate

JTemplate

238 pages

MATVEC

MATVEC

4 pages

TONEDEF

TONEDEF

14 pages

SASSi

SASSi

16 pages

JTemplate

JTemplate

39 pages

BATS

BATS

10 pages

Synapse

Synapse

11 pages

c.def

c.def

116 pages

TweaXML

TweaXML

108 pages

Load more
Loading Unlocking...
Login

Join to view ASTs, Objective CAML, and Ocamlyacc and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view ASTs, Objective CAML, and Ocamlyacc and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?