Designing ASTsWalking ASTsASTs, Objective CAML, and OcamlyaccStephen A. EdwardsColumbia UniversityFall 2008Parsing and Syntax TreesParsing decides if the program is part of the language.Not that useful: we want more than a yes/no a nswer.Like most, parsers generated by ocamlyacc can include actions:pieces of code that run when a r ule is matched.Top-down parsers: actions executed during parsing rules.Bottom-up parsers: actions executed when rule is “reduced.”(ocamlyacc)ActionsSimple languages can be interpreted just with parser actions.%token PLUS MINUS TIMES DIVIDE%token EOF%token <int> LIT%left PLUS MINUS%left TIMES DIVIDE%start top%type <int> top%%top : expr EOF { $1 }expr: expr PLUS expr { $1 + $3 }| expr MINUS expr { $1 - $3 }| expr TIMES expr { $1*$3 }| expr DIVIDE expr { $1 / $3 }| LIT { $1 }{ open Calc_parse }rule token = parse[’ ’ ’\n’] { token lexbuf }| ’+’ { PLUS }| ’-’ { MINUS }| ’*’ { TIMES }| ’/’ { DIVIDE }| [’0’-’9’]+ as s{ LIT(int_of_string s) }| eof { EOF }let _ =let lexbuf =Lexing.from_channel stdin inprint_int (Calc_parse.topCalc_scan.tokenlexbuf);print_newline ()ActionsEven in a parser for an interpreter, actions usua lly build a datastructure th at represents the program.Separates parsing from translation.Makes modification easier by minimizing interactions.Allows parts of the program to be analyzed in different orders.Bottom-up p arsers can only build bottom-up data structures.Children known first, parents later.Context of an object only established later.What To Build?Typically, an Abstract Syntax Tree that represents the program.Represents the syntax of the program almost exactly, but easier forlater passes to deal with.Punctuation, whitespace, other irrelevant details omitted.Abstract vs. Concrete TreesLike scanning and parsing, objective is to discard irrelevant details.E.g., comma-separated lists are nice syntactically, but later stagesprobably just want lists.AST structure almost a direct translation of the grammar.Abstract vs. Concrete Treesexpr : mexpr PLUS mexpr { ... };mexpr : atom TIMES atom { ... };atom : INT { ... };3 + 5*4exprmexpratomINT(3)PLUS mexpratomINT(5)TIMES atomINT(4)PlusINT(3)TimesINT(5) INT(4)Concrete Parse Tree Abstract Syntax TreePart IDesigning ASTsDesigning an AST StructureSequences of thingsRemoving unnecessary pu nctuationAdditional groupingHow t o factorOne Way to Handle Comma-Separated Liststype args = Arg of arg | Args of args*argtype arg = ...args : LPAREN arglist RPAREN { $2 };arglist : arglist COMMA arg { Args($1, $3) }| arg { Arg($1) };int gcd(int a, int b, int c)ArgsArgsArgint aint bint cDrawbacks:Many u nnecessary nodesBranching suggests recursionHarder for later routines to get thedata they wantA Better Way to Handle Comma-Separated ListsBetter to choose a simpler structure for the tree: use liststype args = Args of arg listtype arg = ...args : LPAREN arglist RPAREN { Args(List.rev $2) };arglist : arglist COMMA arg { $3::$1 }| arg { [$1] }int gcd(int a, int b, int c)Argsint aint b int cRemoving Unnecessary PunctuationPunctuation makes the syntax readable, unambiguous.Informat ion represented by stru cture of the ASTThings typically omitted from an ASTÏParenthesesGrouping and precedence/associativity overridesÏSeparators (commas, semicolons)Mark divisions between phrases. Probably want a list of itemsin the AST.ÏExtra keywordswhile-do, if-then-else. Just wan t a “While” constructor withtwo children.How to fa ctorTwo possible ways to represent binary operators:type expr =Plus of expr*expr| Minus of expr*expr| Times of expr*expr| ...type binop = Plus | Minus | Times | ...type expr =Binop of expr*binop*expr| ...Each has advantages and disadvantagesMain question is how nice the later code looks. Is each operator aspecial case, or can you handle them all at once?Part IIWalking ASTsM. C. Escher, Ascending and Descending, 1960It’s easy in O’Camltype operator = Add | Sub | Mul | Divtype expr =Binop of expr*operator*expr| Lit of intlet rec eval = functionLit(x) -> x| Binop(e1, op, e2) ->let v1 = eval e1 and v2 = eval e2 inmatch op withAdd -> v1 + v2| Sub -> v1 - v2| Mul -> v1*v2| Div -> v1 / v2Comments on ASTsTwo ways to handle optional clauses:type stmt =If of expr*stmt*stmt option| ...let rec eval = functionIf(e, s1, None) -> ...| If(e, s1, Some(s2)) -> ...(*or*)let rec eval = functionIf(e, s1, s2) -> ...match s2 withNone -> ...| Some(s) -> ...| ...type stmt =If of expr*stmt| IfElse of expr*stmt*stmt| ...let rec eval = functionIf(e, s) -> ...| IfElse(e, s1, s2) ->
View Full Document