Unformatted text preview:

CS541 class notesRaphael FinkelApril 17, 20141 IntroLecture 1, 1/16/2014• Handout 1 — My names• Mr. / Dr. / Professor / —• Raphael / Rafi / Refoyl• Finkel / Goldstein• Extra 5 minutes on every class to correct for missed sessions• Plagiarism — read aloud from handout 1• Assignments on web. The first is very easy, the rest not, so startimmediately .• E-mail list: [email protected]; instructor uses to reachstudents.• All students have MultiLab accounts, although you may use anycomputer you like to do assignments.• textbook — It is essential that you read ahead.2 Overview of compilers: Chapter 1• A compiler language is an example of a software tool.ImplementationUse (client)SpecProgrammerCompilerLanguage1CS541 Spring 2014 2• The compiler’s job.librariescompilerlinkerbyte codeP−codeinterpreterhardwarecompilerBasicPerl, Pascal, Java, .NETmachine−specificmachine−independentjust−in−timeslowload−decode−executefastload−decode−executemachine coderelocatable obect codeProgram insource languageresolved machine coderelocatable object codeexecutable image• Output type• pure machine code: specific to a given architecture, no runtimelinking. Example: Linux kernel.• augmented machine code: specific to a given architecture andenvironment. Example: C programs written for Linux, whichmay make OS calls.• virtual machine code, interpreted or compiled on the fly duringexecution. Examples: Java (JVM), C# (.NET). Advantages:portability, code size Our assignments use this output type.• Output format• Assembler: good for cross-compilation; avoids having thecompiler resolve all references. Modular compilation. Ourassignments use this output format.• Relocatable binary: defers resolving external references.Modular compilation. Very common; used by Java and C.• Absolute binary: all references resolved.• The organization of a compiler. Figure 1.4 page 15• Scanner: reads the source program and constructs a stream oftokens, removing comments, and processing directives such aslisting.CS541 Spring 2014 3• Example: if (a < 39) { is an input string of characters.The associated output tokens are if:reserved, (:symbol,a:identifier, <:operator, 39:integer, ):symbol, {:symbol.• The scanner can discover and report errors, such as 39f.• We describe tokens by regular expressions.• We recognize tokens by using a deterministic finiteautomaton (DFA). That automaton is built for us by ascanner generator tool such as lex, flex, or jflex. Ourassignments use jflex.• Parser: reads the token stream and creates an abstract syntaxtree (AST), verifying syntax and possibly repairing syntaxerrors.• Example: given the tokens above, the tree fragment wouldhave a root statement:if, with three children: thecondition, the “then” part, and the “else” part. Thecondition would be expression:<, with left childidentifier:a, right child integer:39.• The parser can discover and report errors, such as ]instead of ) in the example.• We describe the syntax by a context-free grammar (CFG).• The table that drives the scanner is built for us by a parsergenerator tool such as yacc, bison, or javaCUP. Ourassignments use javaCUP.• Type checker: navigates through the AST and verifies thatvariables are declared and that types are used consistently.• For instance, if a in the example is not of a numeric type,the type checker can report an error.• It can also modify the AST, for instance, introducingtype-conversion nodes, if, for instance, a is a short integer,in which case it might be converted to a regular integer.• Translator: navigates through the AST and generates eitherintermediate representation (IR) some other representation ofexecutable code. Our IR will be assembler for Java bytecode.• Optimizer: Analyzes the IR to improve the code. There aremany forms of optimization, such as simplifying expressions,moving code, re-using values, eliminating trivial arithmetic,replacing sequences of instructions.• Code generator: Maps the IR to target machine code. OurCS541 Spring 2014 4assignments use Jasper to generate the target machine code:Java bytecode.3 Programming language considerations• Successful designers of programming languages often have strongbackgrounds in constructing compilers. If it can’t be compiled, it’snot very useful.• Many features of modern languages require special care.• formal procedures (requiring closures)• passing by name (obsolete since Algol 60; requires thunks)• dynamic-sized arrays (requires runtime type descriptors)• nested name scopes (require static chains)• anonymous functions, first-class functions (requiring closures)• multiple-yield iterators (require special stack manipulation)• automatic reclamation of object store (requires garbagecollection).4 Computer architecture considerations• Lecture 2, 1/21/2014• How many registers? What operations use them? How manyregister classes?• Some operations can be very expensive: virtual method dispatch,dynamic heap access, reflective programming, exceptions, threads• The effect of memory architecture, such as paging and caches, isdifficult to present to programmers but is significant.5 Specialty compilers• Debugging support, including participation in an integrateddevelopment environment (IDE).• Highly optimizing compilers.• Retargetable compilers.CS541 Spring 2014 56 The ac (adding calculator) language:Chapter 2• Components• Types: integer and float• Keywords: f, i, p• Variables: lowercase Roman, excluding keywords• Context-free grammar (CFG), expressed in Backus-Naur Form(BNF) Figure 2.1 page 33• Parse tree for f b i a a = 5 b = a + 3.2 p b $Figure 2.4 page 37• This parse requires that we specify the syntax of tokens.Figure 2.3 page 367 The scanner• Translates a stream of characters (as above) into a stream of tokens.• A token has a type (such as operator or reserved) and a semanticvalue (such as plus or print).• It’s a matter of choice whether each operator has its own type, inwhich case there is no need for semantic values.• Likewise, one can choose that reserved words each have their owntype, or that they are of type reserved with a semantic value (theirspelling), or that they are of type id with a semantic value.• Hard-coded example Figure 2.5 page 40 uses peek() andadvance()• Production-quality scanners are constructed automatically fromregular expressions. We will discuss them in the next chapter.CS541


View Full Document

UK CS 541 - CS541 class notes

Download CS541 class notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view CS541 class notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view CS541 class notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?