Wright State University
Department of Computer Science and Engineering
CS 780, Fall 2010, Prasad
Assignment 2
Due: Oct 18 (10 pts)

1 Overview

The Cool programming assignments II-V will lead you to design and build a compiler for Cool. Each assignment will cover one component of the compiler: lexical analysis, parsing, semantic analysis, and code generation (assignments IV and V will be given in the next offering of CS781).

For this assignment, you are to write a lexical analyzer, also called a scanner or tokenizer, using the tool flex. You will describe the set of tokens for Cool in flex input format. An HTML manual for flex is available at URL http://www.gnu.org/software/flex/manual/html_mono/flex.html

You will be working in pairs on this assignment. Even though it is a good idea to attempt the assignment on your own after discussions with your classmate, you are required to turn in only one copy of the code and the writeup. Clearly identify the members of your group in the README file.

2 Files and Directories

To get started, you should type

    /usr/local/bin/make -f /usr/local/lib/cool/assignments/PA2/Makefile

in a directory where you want to do the assignment. (You can use gmake instead of make too.) This command will copy a number of files into your directory. Some of the files will be copied as read-only (using symbolic links). You should not edit these files. In fact, if you make and modify private copies of these files, you may find it impossible to complete the assignment. See the instructions in the README file. The files that you will need to modify are:

cool.flex
    This file contains a token start (no pun intended) at a flex description for Cool. You can actually build a scanner with this description, but it does not do much. You should read the man pages for flex to figure out what this description does do. Any auxiliary C routines that you wish to write should be added directly to the cool.flex file after the last %%.

test.cl
    This file contains some sample input to be scanned. It does not exercise all of the
    lexical specification, but it is nevertheless an interesting test. It is not a good test to start with, nor does it provide adequate testing of your scanner. Part of your assignment is to come up with good testing inputs and a testing strategy. You should modify this file with tests that you think adequately exercise your scanner. Our test.cl is actually close to a real Cool program, but your tests need not be. You may keep as much or as little of our test as you like.

README
    This file contains detailed instructions for the assignment.

Although these files are incomplete, the two program files do compile and run. There are a number of useful tips on using flex in the README file.

All of the software supplied with this assignment is supported on the SPARC Solaris machine gandalf.cs.wright.edu.

Important: You should make sure to place /usr/local/lib/cool/bin at the beginning of your path variable, so that the executables used are the ones the assignments are designed for. Similarly, make sure to have /usr/local/bin before /bin and /usr/bin, to pick the correct version of flex. To do this, add the line

    setenv PATH /usr/local/lib/cool/bin:/usr/local/bin:$PATH

at the end of your .login file (or .tcshrc file, if you use tcsh), or add the line

    PATH=/usr/local/lib/cool/bin:/usr/local/bin:$PATH; export PATH

at the end of your .profile file (or .bashrc file, if you use bash).

3 Scanner Results

You should follow the specification of the lexical structure of Cool given in Section 10 and Figure 1 of the CoolAid manual. In general, we encourage you to use primitives supported by flex in preference to defining your own macros, and to use regular expressions and start conditions in preference to writing C code.

Your scanner should be robust: it should work for any conceivable input. For example, you must handle errors such as an EOF occurring in the middle of a string or comment, as well as string constants that are too long. These are just some of the errors that can occur; see the manual for the rest. You must make some
provision for graceful termination if a fatal error occurs. Core dumps are unacceptable.

Your scanner should maintain the global variable curr_lineno, which indicates which line in the source text is currently being scanned. This feature will aid the parser in printing useful error messages.

Each call on the scanner returns the next token and lexeme from the input. The value returned by the function cool_yylex is an integer code representing the syntactic category: whether it is an integer literal, semicolon, the if keyword, etc. The codes for all tokens are defined in the file cool-parse.h. The second component, the semantic value, is placed in the global union cool_yylval, which is of type YYSTYPE. The type YYSTYPE is also defined in cool-parse.h. The tokens for single-character symbols (e.g., ";" and ",", among others) are represented just by the integer value of the character itself. All of the single-character tokens are listed in the grammar for Cool in the CoolAid.

Programs tend to have many occurrences of the same lexemes. For example, an identifier generally is referred to more than once in a program (or else it isn't very useful!). To save space and time, a common compiler practice is to store lexemes in a string table. We provide you with a string table package, which is discussed in detail in "A Tour of the Cool Support Code" and documented in the code for /usr/local/lib/cool/include/PA2/stringtab.h and stringtab.cc. For the moment, we only need to know that the type of string table entries is Symbol.

For class identifiers, object identifiers, integers, and strings, the semantic value should be a Symbol stored in the field cool_yylval.symbol. For boolean constants, the semantic value is stored in the field cool_yylval.boolean. Except for errors (see below), the lexemes for the other tokens do not carry any interesting information.

All errors will be passed along to the parser, which is better equipped to handle them. The Cool parser knows about a special error token called ERROR. When an invalid character is
encountered, that character and any invalid characters that follow should be gathered together into a string, until the lexer finds a character that can begin a new token. The routine cool_yylex should return the token ERROR. The semantic value is the string of illegal characters, which is stored in the field cool_yylval.error_msg (note that this field is an ordinary string, not a symbol). For errors besides strings of invalid characters (e.g., a string constant that is too long or an end-of-file inside of a comment), it is
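
The gathering rule for invalid characters described above can be sketched in plain C, outside of flex. This is only an illustration under stated assumptions: the helper names is_invalid_start and gather_invalid are hypothetical, and the set of invalid characters is an illustrative guess, not the set defined by the CoolAid manual. In the actual assignment the same effect would normally come from a flex pattern, with the gathered string placed in cool_yylval.error_msg before ERROR is returned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if c cannot begin any token.
   ASSUMPTION: this character set is illustrative only; the real set
   must be derived from the lexical specification in the CoolAid manual. */
int is_invalid_start(char c) {
    /* strchr would also match the terminating NUL, so exclude it. */
    return c != '\0' && strchr("!#$%^&>?`[]|\\", c) != NULL;
}

/* Hypothetical helper: copies the maximal run of invalid characters at
   the start of src into buf, truncating to cap - 1 characters, and
   NUL-terminates buf. Returns the number of input characters in the
   run, i.e. how far the scanner should advance before looking for the
   next token. */
size_t gather_invalid(const char *src, char *buf, size_t cap) {
    size_t n = 0;
    assert(cap > 0);              /* room for the terminator is required */
    while (is_invalid_start(src[n])) {
        if (n + 1 < cap) {
            buf[n] = src[n];      /* keep only what fits in buf */
        }
        n++;
    }
    buf[n < cap ? n : cap - 1] = '\0';
    return n;
}
```

Called on the input "#$%abc" with a 16-byte buffer, gather_invalid stops at the letter a, which can begin an identifier, so the caller would report one ERROR token for the run "#$%" and then resume scanning at "abc". Returning the consumed length separately from the copied string lets the caller skip the whole run even when the buffer is too small to hold it.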