Wright State University
Department of Computer Science and Engineering
CS 780 Fall 2010 Prasad
Assignment 2 (Due: Oct 18) (10 pts)

1 Overview

The Cool programming assignments II–V will lead you to design and build a compiler for Cool. Each assignment will cover one component of the compiler: lexical analysis, parsing, semantic analysis, and code generation (assignments IV and V will be given in the next offering of CS781).

For this assignment you are to write a lexical analyzer, also called a scanner or tokenizer, using the tool flex. You will describe the set of tokens for Cool in flex input format. An HTML manual for flex is available at URL http://www.gnu.org/software/flex/manual/html_mono/flex.html.

You will be working in pairs on this assignment. Even though it is a good idea to attempt the assignment on your own after discussions with your classmate, you are required to turn in only one copy of the code and the writeup. Clearly identify the members of your group in the README file.

2 Files and Directories

To get started, you should type

    /usr/local/bin/make -f /usr/local/lib/cool/assignments/PA2/Makefile

in a directory where you want to do the assignment. (You can use “gmake” instead of “make” too.) This command will copy a number of files into your directory. Some of the files will be copied as read-only (using symbolic links). You should not edit these files. In fact, if you make and modify private copies of these files, you may find it impossible to complete the assignment. See the instructions in the README file.

The files that you will need to modify are:

• cool.flex
This file contains a token start (no pun intended) at a flex description for Cool. You can actually build a scanner with this description but it does not do much. You should read the man pages for flex to figure out what this description does do.
Any auxiliary C++ routines that you wish to write should be added directly to the cool.flex file after the last %%.

• test.cl
This file contains some sample input to be scanned. It does not exercise all of the lexical specification but it is nevertheless an interesting test. It is not a good test to start with, nor does it provide adequate testing of your scanner. Part of your assignment is to come up with good testing inputs and a testing strategy. You should modify this file with tests that you think adequately exercise your scanner. Our test.cl is actually close to a real Cool program, but your tests need not be. You may keep as much or as little of our test as you like.

• README
This file contains detailed instructions for the assignment.

Although these files are incomplete, the two program files do compile and run. There are a number of useful tips on using flex in the README file.

All of the software supplied with this assignment is supported on the SPARC-Solaris machine, gandalf.cs.wright.edu.

Important: You should make sure to place /usr/local/lib/cool/bin at the beginning of your path variable to make sure the executables used are the ones the assignments are designed for. Similarly, make sure to have /usr/local/bin before /bin and /usr/bin to pick the correct version of flex. To do this, add the line

    setenv PATH /usr/local/lib/cool/bin:/usr/local/bin:${PATH}

at the end of your .login file (or .tcshrc file) if you use tcsh; add the lines

    PATH=/usr/local/lib/cool/bin:/usr/local/bin:$PATH
    export PATH

at the end of your .profile file (or .bashrc file) if you use bash.

3 Scanner Results

You should follow the specification of the lexical structure of Cool given in Section 10 and Figure 1 of the CoolAid Manual. In general, we encourage you to use primitives supported by flex in preference to defining your own macros, and to use regular expressions and start conditions in preference to writing C code. Your scanner should be robust: it should work for any conceivable input.
For example, you must handle errors such as an EOF occurring in the middle of a string or comment, as well as string constants that are too long. These are just some of the errors that can occur; see the manual for the rest.

You must make some provision for graceful termination if a fatal error occurs. Core dumps are unacceptable.

Your scanner should maintain the global variable curr_lineno that indicates which line in the source text is currently being scanned. This feature will aid the parser in printing useful error messages.

Each call on the scanner returns the next token and lexeme from the input. The value returned by the function cool_yylex is an integer code representing the syntactic category: whether it is an integer literal, semicolon, the if keyword, etc. The codes for all tokens are defined in the file cool-parse.h. The second component, the semantic value, is placed in the global union cool_yylval, which is of type YYSTYPE. The type YYSTYPE is also defined in cool-parse.h. The tokens for single character symbols (e.g., “;” and “,”, among others) are represented just by the integer value of the character itself. All of the single character tokens are listed in the grammar for Cool in the CoolAid.

Programs tend to have many occurrences of the same lexemes. For example, an identifier generally is referred to more than once in a program (or else it isn’t very useful!). To save space and time, a common compiler practice is to store lexemes in a string table. We provide you with a string table package, which is discussed in detail in A Tour of the Cool Support Code and documented in the code for /usr/local/lib/cool/include/PA2/stringtab.h and ./stringtab.cc. For the moment, we only need to know that the type of string table entries is Symbol.

For class identifiers, object identifiers, integers and strings, the semantic value should be a Symbol stored in the field cool_yylval.symbol.
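To make the string table idea concrete, here is a minimal C++ sketch of lexeme interning. It is not the supplied package: Entry, StringTable, add_string, and intern_demo are hypothetical names chosen for illustration, and the real Symbol type and its interface are whatever stringtab.h defines. The sketch only shows why repeated lexemes cost one stored copy and why two occurrences can be compared by pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a string table entry. The course's Symbol
// type is defined in stringtab.h and may look quite different.
struct Entry {
    std::string text;  // the lexeme itself
    int index;         // order in which the lexeme was first seen
};

// A minimal interning table: each distinct lexeme is stored once.
class StringTable {
public:
    // Returns the existing entry for s, or creates one on first sight.
    const Entry* add_string(const std::string& s) {
        auto it = entries_.find(s);
        if (it != entries_.end()) {
            return it->second.get();
        }
        std::unique_ptr<Entry> entry(
            new Entry{s, static_cast<int>(entries_.size())});
        const Entry* raw = entry.get();
        entries_.emplace(s, std::move(entry));
        return raw;
    }

    std::size_t size() const { return entries_.size(); }

private:
    // Entries are heap-allocated so pointers stay valid after rehashing.
    std::unordered_map<std::string, std::unique_ptr<Entry>> entries_;
};

// Usage: the same identifier scanned twice maps to the same entry.
void intern_demo() {
    StringTable table;
    const Entry* first = table.add_string("main");
    const Entry* second = table.add_string("main");
    assert(first == second);    // one shared entry
    assert(table.size() == 1);  // stored only once
}
```

Because every occurrence of a lexeme resolves to the same entry, later compiler phases can test two identifiers for equality with a pointer comparison instead of a string comparison.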
For boolean constants, the semantic value is stored in the field cool_yylval.boolean. Except for errors (see below), the lexemes for the other tokens do not carry any interesting information.

All errors will be passed along to the parser, which is better equipped to handle them. The Cool parser knows about a special error token called ERROR. When an invalid character is encountered, that character and any invalid characters that follow should be gathered together into a string until the lexer finds a character that can begin a new token. The routine cool_yylex should return the token ERROR. The semantic value is the string of illegal characters, which is stored in the field cool_yylval.error_msg (note that this field is an ordinary string, not a
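The rule for gathering a run of invalid characters can be sketched in plain C++ as follows. This is only an illustration of the logic, not scanner code for the assignment: can_begin_token and gather_invalid_run are hypothetical helpers, and the character set inside can_begin_token is an assumption made for the example, not the Cool lexical specification (consult the CoolAid for that). In an actual flex description the same effect would more likely come from a pattern than from a hand-written loop.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// ASSUMPTION: an illustrative predicate only. The characters that can
// really begin a Cool token are defined by the CoolAid, not by this list.
bool can_begin_token(char c) {
    unsigned char uc = static_cast<unsigned char>(c);
    if (std::isalnum(uc) || std::isspace(uc)) {
        return true;
    }
    return std::string("\"(){}:;,.@~+-*/<=").find(c) != std::string::npos;
}

// Collects the invalid character at position start and any invalid
// characters that follow it, stopping at the first character that can
// begin a new token (or at the end of the input).
// Precondition: start <= input.size().
std::string gather_invalid_run(const std::string& input, std::size_t start) {
    std::size_t end = start;
    while (end < input.size() && !can_begin_token(input[end])) {
        ++end;
    }
    return input.substr(start, end - start);
}
```

Under the assumed character set, calling gather_invalid_run("x #$% y", 2) collects the three characters "#$%" and stops at the following space; that string is what a scanner would report as the semantic value of a single ERROR token.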

