DOC PREVIEW
Berkeley COMPSCI 164 - Assignment

This preview shows page 1-2 out of 6 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

UNIVERSITY OF CALIFORNIADepartment of Electrical Engineeringand Computer SciencesComputer Science DivisionProf. R. FatemanFall, 2005CS 164 Assignment 3 and 4: Parsing for MiniJavaDue: Tuesday, Oct. 13, 2005, 11:59PM (assignment 3), Oct. 20, (assignment 4)This assignment requires you to examine and extend one parser and use a parser-building toolto program another parser. The first parser is a recursive-descent parser written in CommonLisp: a top-down parser without backup. The second parser, also in Common Lisp, will beproduced by an LALR syntax-directed parser generator.As you should realize by now, programming a top-down predictive parser is often astraightforward and intuitive task. While it is considered especially suitable for small lan-guages, this technique can also be used for rather ambitious programming languages. Inparticular, once you have written some stereotyped recursive programs for common styles ofgrammar productions, such a parser can be extended fairly simply. And you can do most ofthe extension by cutting and pasting. For support, it requires nothing much other than a rou-tine to read tokens (presumably done by your lexical analysis module). Almost any modernlanguage is suitable for writing such a parser, requiring only support for recursive functioncalls. Writing such a parser in Lisp is especially appealing since our emacs/lisp system pro-vides an environment for creating and testing such a parser “incrementally.” With such aprogramming approach it is relatively simple to identify and recover from many errors, or toinformally extend the grammar by allowing certain kinds of corrections: adding in missingpunctuation and giving very specific warnings or error messages. It does not require learning“extra tools” such as a parser generator. Not to say such tools should not be used, just thatunder the best of circumstances they tend to produce mysterious programs, and some of theprograms force you to learn and debug “specifications” in yet another language.On the other hand, as software tools go, parser generators are easy to find, (or build!). LL(1)top-down parser generators are not as commonly used as bottom-up LALR, mostly becausetypical programming language grammars may have just a few constructions whose simplestgrammar is not LL(1). Recall that LL parsers may require rearrangement of the productionsin the grammar (factoring and left-recursion removal), making the programmer’s job some-what clumsier: it is necessary to attach meaningful “augments” or “actions” to reductions. Infact our sample MiniJava parser (courtesy of David Bindel) is an LL(1) formulation showingthat MiniJava is susceptible to LL(1) parsing.1CS 164 Assignment 3 and 4: Parsing for MiniJava 2These days, for a programming language design to be considered respectable, the designersmust show that there is grammar for the language that is entirely LALR(1). Occasionallythere may be a good reason for a non-LALR(1) construction, but it would be a cause forapology. You will probably find that a grammar, once it is made non-ambiguous, is mostlyLALR(1). In the few cases where there are non-LALR spots, or even ambiguities, we canoften ignore the warnings: the parser can be set up to automatically do the “correct” thinganyway, in a way analogous to the “maximal munch” lexer rule that does the right thingthere.LALR generators are fairly easy to use, they produce fast and (reasonably) compactparsers, and (for the theoretically minded) can parse a substantially larger abstract class oflanguage constructs than the top-down methods. The use of automated parser generatorsforces you to use formal rather than ad hoc descriptions of your language and provides amore direct route from grammar design to implementation. That is, in the face of continuingexperimental changes to a language, a parser generator may be quite handy since your “turn-around time” to try each new grammar for consistency requires only the time to edit thegrammar and “remake” the parser. This may require only a few seconds. If the “remake”succeeds, not only have you confirmed the formal properties of the grammar, but you havea parser. You can also test fragments of a total grammar in pieces and then put them alltogether.Of these parser-generators, widely used systems include the ancient YACC “yet anothercompiler-compiler”. The equivalent BISON, available as “GNU” software, does not requirelicensing, and has been used in CS164 in the past. The text discusses a free Java tool calledSableCC which appears to resemble YACC with minor differences. (JavaCC, also mentioned,is an LL(1) generator.) There are several LALR parser-generators written in Common Lisp.The one we used in Fall, 2002 in CS164 was written by Mark Johnson of Brown Universityin 1988. It is free, and its source is on the class home page software directory. We’ve used itin several classes and appears to be free of bugs. It provides some diagnostic information onrequest, and it produces as output a reasonable (Lisp language) parser for the grammar yousupply. You could, if you wish, look at the code and see the pattern that makes it rather likea (sparse) table.For this assignment you are going to consider use both techniques: recursive descent andan LALR parser generator.1. In the class software/source directory are several versions of recursive descent parsersfor the grammars in the text, 3.11 and 3.15. Read and understand them in sequence,recdec.cl, recdec315.cl, recdec315d.cl, and finally recdec315s.cl. Your task is tochange recdec315s.cl so that it treats a larger grammar. This grammar should allow forassignment statements of the form id := expression and sequences like { expression ,expression, ...}, and the operators > and <. Note that a+b>c+d means (a+b)>(c+d) nota+(b>c)+d or some other variation.It would be good to make sure you have a clear idea of what is now legal, so you shouldwrite some grammar rules before adding these features to the parser. How do you deal witha:=b:=c or empty sequences {}? Note that these parsers have a trivial lexical analyzer,CS 164 Assignment 3 and 4: Parsing for MiniJava 3namely eat.2. In the software/build subdirectory, there are (compiled) copies of the parser generatorlalr.fasl for various systems. We have also provided a collection of macros, make-lalr.fasl,which provide a more compact interface to the lalr parser generator. The source code anddocumentation for the parser generator is in


View Full Document

Berkeley COMPSCI 164 - Assignment

Documents in this Course
Lecture 8

Lecture 8

40 pages

Load more
Download Assignment
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Assignment and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Assignment 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?