Bison and parsingFrom the area of compilers, we get a host of tools to convert text filesinto programs. After lexical analysis, the second part of that processwhen you are dealing with traditional languages such as C is syntaxanalysis, which also known as parsing.A good tool for creating parsers is bison. It takes a specification fileand creates an syntax analyzer, previously called y.tab.c by yaccand now is generally just FILENAME.tab.c.Unix Tools: Program Development 5Parsing termsProduction rules define a parser. Informally, these can beexpressed in BNF/EBNF form.Production rules are made up a left hand side with anon-terminal, and righthand side made up terminals andnon-terminals.A terminal “represents a class of syntactically equivalent tokens”[Bison manual].Unix Tools: Program Development 5Attributes for terminals and non-terminalsTerminals and non-terminals can have attributes.Constants could have the value of the constant, for instance.Identifiers might have a pointer to a location where information iskept about the identifier.Unix Tools: Program Development 5Some general approaches to syntax analysisUse a compiler-compiler tool, such as bison.Write a one-off recursive descent parser.Write a one-off parser suited to your program.Unix Tools: Program Development 5Bison - our lexical analyzer generatorCan be called as yyparse().It is easy to interface with flex/lex.*y file → bison → y.tab.c (*.tab.c)y.tab.c and → gcc → syntax analyzerother filesinput stream → syntax analyzer → actions takenwhen rules appliedUnix Tools: Program Development 5Calling BisonHere’s an example of calling Bison (which will be very useful whencompiling assign6):Assign6-solution.out: Assign6-solution.y Assign6-solution.lbison -d --debug --verbose Assign6-solution.yflex Assign6-solution.lcc -c lex.yy.ccc -c Assign6-solution.tab.ccc -o Assign6-solution.out Assign6-solution.tab.o lex.yy.oThe -d option specifies to output an explicit y.tab.h/*.tab.hfile for flex. Specifying --debug and --verbose (combined withenabling yydebug) make it much easier to debug your parser!Unix Tools: Program Development 5Bison specificationsBison source:{ definitions }%%{ rules }%%{ user subroutines }Unix Tools: Program Development 5DefinitionsDeclarations of ordinary C variables and constants.bison declarations.Unix Tools: Program Development 5RulesThe general form for production rules is:<non-terminal> : <sequence of terminals and non-terminals> {action} | ... ;The actions are C/C++ code. Actions can appear in the middle of thesequence of terminals and non-termianls.Unix Tools: Program Development 5Bison declarations%token TOKEN create a TOKEN type%union { } create a Union for llvals.%right TOKEN create a TOKEN type that has right associativity%left TOKEN create a TOKEN type that has left associativityUnix Tools: Program Development 5Bison actionsActions are C source fragments.Example rules:variableDeclaration : ID COLON ID SEMICOLON {printf("emitting var %s of type %s\n",$3,$1);} ;The $3 and $1 refer to the values of the items 3 and 1 in therighthand side of the production rule.Unix Tools: Program Development 5An example of Bison: first, its matching flex file%{#include <stdlib.h>#include <string.h>#include "Assign6-solution.tab.h"extern int linecount;%}%%program return PROGRAM;end return END;variables return VARIABLES;var return VAR;functions return FUNCTIONS;define return DEFINE;statements return STATEMENTS;if return IF;then return THEN;else return ELSE;while return WHILE;, return COMMA;Unix Tools: Program Development 5Flex file cont’d"(" return LPARENTHESIS;")" return RPARENTHESIS;"{" return LBRACE;"}" return RBRACE;: return COLON;; return SEMICOLON;[a-zA-Z0-9]+ yylval = (int)strdup(yytext); return ID;[\n] linecount++;[ \t]+Unix Tools: Program Development 5An example Bison program%{#include <stdlib.h>#include <stdio.h>int linecount = 0;void yyerror(char*s){fprintf(stderr,"file is not okay -- problem at line %d\n",linecount);exit(1);}int yywrap(){return 1;}Unix Tools: Program Development 5An example Bison program%}%token ID%token PROGRAM%token END%token VARIABLES%token VAR%token STATEMENTS%token IF%token THEN%token ELSE%token WHILE%token LBRACE%token RBRACE%token COLON%token SEMICOLON%token FUNCTIONS%token COMMA%token DEFINE%token LPARENTHESIS%token RPARENTHESIS%%Unix Tools: Program Development 5An example Bison programprogram : PROGRAM ID variablesSection functionsSection statementsSection END ;variablesSection : VARIABLES LBRACE variableDeclarations RBRACE ;variableDeclarations : | variableDeclarations variableDeclaration ;variableDeclaration : ID COLON ID SEMICOLON {printf("emitting var %s of type %s\n",$3,$1);} ;functionsSection : FUNCTIONS LBRACE functionDeclarations RBRACE ;functionDeclarations : | functionDeclarations functionDeclaration ;functionDeclaration : DEFINE ID COLON ID LPARENTHESIS argsList RPARENTHESIS LBRACE statements RBRACE ;statementsSection : STATEMENTS LBRACE statements RBRACE ;statements : | statements statement ;statement : VAR variableDeclaration | whileLoop | ifStruct | subroutineCall SEMICOLON ;whileLoop : WHILE LPARENTHESIS subroutineCall RPARENTHESIS LBRACE statements RBRACE ;Unix Tools: Program Development 5An example Bison programifStruct : IF LPARENTHESIS subroutineCall RPARENTHESIS LBRACE statements RBRACE ;|IF LPARENTHESIS subroutineCall RPARENTHESIS LBRACE statements RBRACE ELSE LBRACE statements RBRACE ;subroutineCall : ID LPARENTHESIS callArgsList RPARENTHESIS ;argsList : | argPair | argsList COMMA argPair ;argPair : ID ID ;callArgsList : | ID | callArgsList COMMA ID ;%%int main(int argc, char**argv){// yydebug = 1;yyparse();printf("input is okay\n");}Unix Tools: Program Development
View Full Document