DOC PREVIEW
UA CSC 453 - Handout

This preview shows page 1-2-3-4-5 out of 14 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSc 453Compilers and Systems Software1 : Compiler OverviewDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergWhat does a compiler do?What’s a Compiler???At the very basic level a compiler translates a computerprogram from source code to some kind of executable code:errorsVMasmsourceCompilerx.classx.sx.javax.cOften the source code is simply a text file and the executablecode is a resulting assembly language program: gcc -S x.creads the C source file x.c and generates an assembly codefile x.s. Or the output can be a virtual machine code:javac x.java produces x.class.What’s a Language Translator???A compiler is really a special case of a language translator.A translator is a program that transforms a “program” P1written in a language L1into a program P2written in anotherlanguage L2.Typically, we desire P1and P2to be semantically equivalent,i.e. they should behave identically.Example Language Translatorssource language translator target languageLATEXlatex2html−→ htmlPostscriptps2ascii−→ textFORTRANf2c−→ CC++cfront−→ CCgcc−→ assembly.classSourceAgain−→ Javax86 binaryfx32−→ Alpha binaryWhat’s a Language???A formal language is a notation for precisely communicatingideas.By formal we mean that we know exactly which “sentences”belong to the language and that every sentence has awell-defined meaning.A language is defined by specifying its syntax and semantics.The syntax describes how words can be formed into sentences.The semantics describes what those sentences mean.Example LanguagesEnglish is a natural, not a formal language. The sentenceMany missiles have many warheads.has multiple possible meanings.Programming languages: FORTRAN, LISP, Java, C++,. . .Text processing languages: LATEX, troff,. . .\begin{frame}\frametitle{Example Languages}\begin{itemize}\item English is a \redtxt{natural}, not a formallanguage. The sentence\end{itemize}\end{frame}Specification languages: VDM, Z, OBJ,. . .Compiler InputText File Common on Unix.Syntax Tree A structure editor uses its knowledge of the sourcelanguage syntax to help the user edit & run theprogram. It can send a syntax tree to the compiler,relieving it of lexing & parsing.Compiler OutputAssembly Code Unix compilers do this. Slow, but easy for thecompiler.Object Code .o-files on Unix. Faster, since we don’t have to callthe assembler.Executable Code Called a load-and-go-compiler.Abstract Machine Code Serves as input to an interpreter. Fastturnaround time.C-code Good for portability.Compiler TasksStatic Semantic Analysis Is the program (statically) correct? Ifnot, produce error messages to the user.Code Generation The compiler must produce code that can beexecuted.Symbolic Debug Information The compiler should produce adescription of the source program needed by symbolicdebuggers. Try man gdb.Cross References The compiler may produce cross-referencinginformation. Where are identifiers declared &referenced?Profiler Information Where does my program spend most of itsexecution time? Try man gprof .The structure of a compilerCompiler PhasesS Y N T H E S I SSemantic AnalysisSyntactic AnalysisLexical AnalysisIntermediate CodeGenerationCode OptimizationMachine CodeGenerationA N A L Y S I SCompiler Phases – Lexical analysisThe lexer reads the source file and divides the text into lexicalunits (tokens), such as:Reserved words BEGIN, IF,. . .identifiers x, StringTokenizer,. . .special characters +, ∗, −, ^,. . .numbers 30, 3.14,. . .comments (* text *),strings "text".Lexical errors (such as ’illegal character’, ’undelimitedcharacter string’, ’comment without end’) are reported.Lexical Analysis of EnglishThe sentenceThe boy’s cowbell won’t play.would be translated to the list of tokensthe, boy+possessive, cowbell, will, not, playLexical Analysis of JavaThe sentencex = 3.14 * (9.0+y);would be translated to the list of tokens<ID,x>, EQ, <FLOAT,3.14>, STAR, LPAREN,<FLOAT,9.0>, PLUS, <ID,y>, RPAREN, SEMICOLONCompiler Phases – Syntactic analysisSyntax analysis (parsing) determines the structure of asentence.The compiler reads the tokens produced during lexical analysisand builds anabstract syntax tree (AST), which reflects thehierarchical structure of the input program.Syntactic errors are reported to the user:’missing ;’,’BEGIN without END’Syntactic Analysis of EnglishThe sentenceThe boy plays cowbell.would be parsed into the treecowbellNPSVPDet N VNNPTheboy playsS=sentence, NP=noun phrase, VP=verb phrase,Det=determiner, N=noun, V=verb.Syntactic Analysis of JavaThe sentencex = 3.14 * (9.0+y);would be parsed into the ASTxAssignMULFLOAT3.14ADDFLOAT IDy9.0IDCompiler Phases – Semantic analysisThe AST produced during syntactic analysis is decorated withattributes, which are then evaluated. The attributes canrepresent any kind of information such as expression types.The compiler also collects information useful for future phases.Semantic errors are reported:’identifier not declared’,’integer type expected’.Semantic Analysis of EnglishIn the sentenceThe boy plays his cowbell.we determine that his refers to the boy.Semantic Analysis of JavaIn the sentencestatic float luftballons = 10;void P() {int luftballons = 99;System.out.println(luftballons);}the compiler must determinewhich luftballons the print-statement refers to,that float luftballons=10 has the wrong type.Compiler Phases – IR GenerationFrom the decorated AST this phase generates intermediatecode(IR).The IR adds an extra a level of abstraction between the highlevel AST and the very low level assembly code we want togenerate. This simplifies code generation.A carefully designed IR allows us to build compilers for anumber of languages and machines, by mixing and matchingfront-ends and back-ends.IR Generation of EnglishFrom the sentenceEvery boy has a cowbell.we could generate∀x; boy(x) ⇒ has-cowbell(x)IR Generation of JavaFrom the sentencex = 3.14 * (9.0+y);the compiler could generate the (stack-based) IR codepusha x, pushf 3.14, pushf 9.0,push y, add, mul, assignCompiler Phases – Code OptimizationThe (often optional) code optimization phase transforms anIR program into an equivalent but more efficient program.Typical transformations includecommon subexpression elimination only compute anexpression once, even if it occurs more thanonce in the source),inline expansion insert a procedure’s code at the call site toreduce call overhead,algebraic transformations A := A


View Full Document

UA CSC 453 - Handout

Download Handout
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Handout and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Handout 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?