DOC PREVIEW
UW-Madison CS 536 - CS 536 Lecture Notes

This preview shows page 1-2 out of 5 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

228CS 536 Spring 2008©ExampleLet’s look at the CUP specificationfor CSX-lite. Recall its CFG isprogram → { stmts }stmts → stmt stmts| λstmt → id = expr ;| if ( expr ) stmtexpr → expr + id| expr - id|id229CS 536 Spring 2008©The corresponding CUPspecification is:/***This Is A Java CUP Specification ForCSX-lite, a Small Subset of The CSXLanguage, Used In Cs536 ***//* Preliminaries to set up and use thescanner. */import java_cup.runtime.*;parser code {: public void syntax_error(Symbol cur_token){ report_error(“CSX syntax error at line “+String.valueOf(((CSXToken)cur_token.value).linenum),null);}:};init with {: :};scan with {:return Scanner.next_token();:};230CS 536 Spring 2008©/* Terminals (tokens returned by thescanner). */terminal CSXIdentifierToken IDENTIFIER;terminal CSXToken SEMI, LPAREN, RPAREN,ASG, LBRACE, RBRACE;terminal CSXToken PLUS, MINUS, rw_IF;/* Non terminals */non terminal csxLiteNode prog;non terminal stmtsNode stmts;non terminal stmtNode stmt;non terminal exprNode exp;non terminal nameNode ident;start with prog;prog::= LBRACE:l stmts:s RBRACE {: RESULT=new csxLiteNode(s,l.linenum,l.colnum); :};stmts::= stmt:s1 stmts:s2 {: RESULT=new stmtsNode(s1,s2,s1.linenum,s1.colnum); :}231CS 536 Spring 2008©| {: RESULT= stmtsNode.NULL; :};stmt::= ident:id ASG exp:e SEMI {: RESULT=new asgNode(id,e,id.linenum,id.colnum); :}| rw_IF:i LPAREN exp:e RPAREN stmt:s {: RESULT=new ifThenNode(e,s, stmtNode.NULL,i.linenum,i.colnum); :};exp::=exp:leftval PLUS:op ident:rightval {: RESULT=new binaryOpNode(leftval,sym.PLUS, rightval,op.linenum,op.colnum); :}| exp:leftval MINUS:op ident:rightval {: RESULT=new binaryOpNode(leftval,sym.MINUS,rightval,op.linenum,op.colnum); :}| ident:i {: RESULT = i; :};232CS 536 Spring 2008©ident::= IDENTIFIER:i {: RESULT = new nameNode(new identNode(i.identifierText, i.linenum,i.colnum),exprNode.NULL,i.linenum,i.colnum); :};233CS 536 Spring 2008©Let’s parse{ a = b ; }First, a is parsed usingident::= IDENTIFIER:i {: RESULT = new nameNode(new identNode(i.identifierText, i.linenum,i.colnum),exprNode.NULL,i.linenum,i.colnum); :}We buildnameNodeidentNodenullExprNodea234CS 536 Spring 2008©Next, b is parsed usingident::= IDENTIFIER:i {: RESULT = new nameNode(new identNode(i.identifierText, i.linenum,i.colnum),exprNode.NULL,i.linenum,i.colnum); :}We buildnameNodeidentNodenullExprNodeb235CS 536 Spring 2008©Then b’s subtree is recognized asan exp:| ident:i {: RESULT = i; :}Now the assignment statement isrecognized:stmt::= ident:id ASG exp:e SEMI {: RESULT=new asgNode(id,e,id.linenum,id.colnum); :}We buildnameNodeidentNodenullExprNodeanameNodeidentNodenullExprNodebasgNode236CS 536 Spring 2008©The stmts →λ production ismatched (indicating that there areno more statements in theprogram).CUP matchesstmts::= {: RESULT= stmtsNode.NULL; :}and we buildNext,stmts → stmt stmtsis matched usingstmts::= stmt:s1 stmts:s2 {: RESULT=new stmtsNode(s1,s2,s1.linenum,s1.colnum); :}nullStmtsNode237CS 536 Spring 2008©This buildsAs the last step of the parse, theparser matchesprogram → { stmts }using the CUP ruleprog::= LBRACE:l stmts:s RBRACE {: RESULT=new csxLiteNode(s,l.linenum,l.colnum); :};nameNodeidentNodenullExprNodeanameNodeidentNodenullExprNodebasgNodestmtsNodenullStmtsNode238CS 536 Spring 2008©The final AST reurned by theparser isnameNodeidentNodenullExprNodeanameNodeidentNodenullExprNodebasgNodestmtsNodenullStmtsNodecsxLiteNode239CS 536 Spring 2008©Errors in Context-FreeGrammarsContext-free grammars cancontain errors, just as programsdo. Some errors are easy to detectand fix; others are more subtle.In context-free grammars we startwith the start symbol, and applyproductions until a terminal stringis produced.Some context-free grammars maycontain useless non-terminals.Non-terminals that areunreachable (from the startsymbol) or that derive no terminalstring are considered useless.Useless non-terminals (andproductions that involve them)can be safely removed from a240CS 536 Spring 2008©grammar without changing thelanguage defined by the grammar.A grammar containing uselessnon-terminals is said to be non-reduced.After useless non-terminals areremoved, the grammar is reduced.ConsiderS → AB| xB → bA → aAC → dWhich non-terminals areunreachable? Which derive noterminal string?241CS 536 Spring 2008©Finding Useless Non-terminalsTo find non-terminals that canderive one or more terminalstrings, we’ll use a markingalgorithm.We iteratively mark terminals thatcan derive a string of terminals,until no more non-terminals canbe marked. Unmarked non-terminals are useless.(1) Mark all terminal symbols(2) RepeatIf all symbols on therighthand side of aproduction are markedThen mark the lefthand sideUntil no more non-terminalscan be marked242CS 536 Spring 2008©We can use a similar markingalgorithm to determine whichnon-terminals can be reachedfrom the start symbol:(1) Mark the Start Symbol(2) RepeatIf the lefthand side of aproduction is markedThen mark all non-terminalsin the righthand sideUntil no more non-terminalscan be marked243CS 536 Spring 2008©λ DerivationsWhen parsing, we’ll sometimesneed to know which non-terminalscan derive λ. (λ is “invisible” andhence tricky to parse).We can use the following markingalgorithm to decide which non-terminals derive λ(1) For each production A →λmark A(2) RepeatIf the entire righthandside of a productionis markedThen mark the lefthand sideUntil no more non-terminalscan be marked244CS 536 Spring 2008©As an example considerS → A B CA → aB → CDD → d| λC → c| λ245CS 536 Spring 2008©Recall that compilers prefer anunambiguous grammar because aunique parse tree structure can beguaranteed for all inputs.Hence a unique translation,guided by the parse treestructure, will be obtained.We would like an algorithm thatchecks if a grammar isambiguous.Unfortunately, it is undecidablewhether a given CFG isambiguous, so such an algorithmis impossible to create.Fortunately for certain grammarclasses, including those for whichwe can generate parsers, we canprove included grammars areunambiguous.246CS 536 Spring 2008©Potentially, the most serious flawthat a grammar might have is thatit generates the “wronglanguage."This is a subtle point as agrammar serves as the definitionof a language.For established languages (like Cor Java) there is usually a suite ofprograms created to test andvalidate new compilers. Anincorrect grammar will almostcertainly lead to


View Full Document

UW-Madison CS 536 - CS 536 Lecture Notes

Download CS 536 Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view CS 536 Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view CS 536 Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?