DOC PREVIEW
UA CSC 453 - Lecture Notes

This preview shows page 1-2-3-4-5 out of 15 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSc 453 — Compilers and Systems Software13 : Intermediate Code IChristian CollbergDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergOctober 18, 20091Introduction2 Compiler PhasesWe are here!ASTasmSemanticAnalyserInterm.Code GenIRMachineCode GenIRASTLexertokenssourceerrorsParsererrorsOptimizeerrors3 Intermediate Representations• Some compilers use the AST as the only intermediate representation. Optimizations (code improve-ments) are performed directly on the AST, and machine code is generated directly from the AST.• The AST is OK for machine-independent optimizations, such as inlining (replacing a procedure callwith the called procedure’s code).• The AST is a bit too high-level for machine code generation and machine-dependent optimizations.14 Intermediate Representations. . .• For this reason, some compilers generate a lower level (simpler, closer to machine code) representationfrom the AST. This representation is used during code generation and code optimization.5 Intermediate CodeAdvantages of:1. Fitting many front-ends to many back-ends,2. Different development teams for front- and back-end,3. Debugging is simplified,4. Portable optimization.Requirements:1. Architecture independent,2. Language independent,3. Easy to generate,4. Easy to optimize,5. Easy to produce machine code from.6 Mix-and-Match CompilersINTERMEDIATE REPRESENTATIONPascalPascal68k−compilerAdaMips−compiler Mips−compilerENDRONTFAda Pascal Modula−2 C++BACKSparcMips68000IBM/370END7 UNCOL• A representation which is both architecture and language independent is known as an UNCOL, aUniversal Compiler Oriented Language.• UNCOL is the holy grail of compiler design – many have search for it, but no-one has found it.Problems:1. Programming language semantics differ from one language to another,2. Machine architectures differ.28 Intermediate Code. . .• There are several different types of intermediate representations:1. Tree-Based.2. Graph-Based.3. Tuple-Based.4. Linear representations.• All representations contain the same information. Some are easier to generate, some are easy togenerate simple machine code from, some are easy to generate good code from.• IR — Intermediate Representation.9Postfix Notation10 Postfix Notation*assignba 2 a 2+*Infix: b := (a ∗ 2) + (a ∗ 2)Postfix: b a 2 * a 2 * + :=• Postfix notation is a parenthesis free notation for arithmetic expression. It is essentially a linearizedrepresentation of an abstract syntax tree.• In postfix notation an operator appears after its operands.• Very simple to generate, very compact, easy to generate straight-forward machine code from, difficultto generate good machine code from.11Tree & DAG Representations12 Tree & DAG Representation• Trees make good intermediate representations. We can represent the program as a sequence of ex-pression trees. Each assignment, procedure call, or jump becomes one individual tree in the forest.• Common Subexpression Elimination (CSE): Even if the same (sub-) expression appears more thanonce in a procedure, we should only compute its value once, and save the result for future reference.• One way of doing this is to build a graph representation, rather than a tree. In the following slides wesee how the expression a ∗ 2 gets two subtrees in the tree representation and one subtree in the DAGrepresentation.313 Tree & DAG Representation. . .b := (a ∗ 2) + (a ∗ 2)*assignba 2 a 2+*Linearized TreeNrOp Arg1Arg21 ident a2 int 23 mul 1 24 ident a5 int 26 mul 4 57 add 3 68 ident b9 assign 8 714 Tree & DAG Representation. . .b := (a ∗ 2) + (a ∗ 2)*assignb +a 2Linearized DagNrOp Arg1Arg21 ident a2 int 23 mul 1 24 add 3 35 ident b6 assign 5 415 Sequence of Expression TreesX := 20;WHILE X < 10 DOX := X-1;A[X] := 10;IF X = 4 THENX := X - 2;ENDIF;ENDDO;Y := X + 5;XX:=20gotoB3:=XY+5gotoEXIT>=10B4X:=X−1X[]A X:=10X4B6<>:=X−2gotoB2EXIT:B4:B3:B5:B6:B2:16 Building DAGs. . .• Repeatedly add subtrees to build DAG. Only add subtrees not already in DAG. Store subtrees in ahash table. This is the value-number algorithm.4•For every insertionof a subtree, check if(X OP Y ) ∈ DAG.X YOPPROCEDURE InsertNode (OP : Operator; L, R : Node) : Node;BEGINV := hashfunc (OP, L, R);N := HashTab.Lookup (V , OP, L, R);IF N = NullNode THENN := NewNode (OP, L, R);HashTab.Insert (V , N );END;RETURN N ;END InsertNode;17 Building DAGs – Example34625A+BAB+AB134625A+BAB(A+B)*CC+ABC*134625+ABC*+(A+B)*C+(A+B)A+BAB(A+B)*CC118 Building DAGs• From an expression/expression tree such as the one on the left we might generate the machine code(for some fictitious architecture) on the right:a ∗ (b + c)41235+a*cbLOAD b, r0LOAD c, r1ADD r0, r1, r2LOAD a, r3MUL r2, r3, r4• Can we generate better code from a DAG than a tree?19 Building DAGs. . .Example Expression:[(a + b) ∗ c + {(a + b) + e} ∗ (e + f)] ∗ [(a + b) ∗ c]5Tree Representation:++a bc*+e f+a b+e*+a bc**20 Building DAGs. . .[(a + b) ∗ c + {(a + b) + e} ∗ (e + f)] ∗ [(a + b) ∗ c]DAG Representation:e*++a bc* *+ +f21 Building DAGs. . .• Generating machine code from the tree yields 21 instructions.Code from TreeLOAD a, r0 ; aLOAD b, r1 ; bADD r0, r1, r2 ; a + bLOAD c, r0 ; cMUL r0, r2, r3 ; (a + b) ∗ cLOAD a, r0 ; aLOAD b, r1 ; bADD r0, r1, r2 ; a + bLOAD e, r0 ; eADD r2, r0, r4 ; (a + b) + eLOAD f, r0 ; fLOAD e, r1 ; eADD r0, r1, r0 ; f + eMUL r4, r0, r4 ; {(a + b) + e}∗; (e + f)ADD r3, r4, r4; Code for (a + b) ∗ c into r3MUL r4, r3, r022 Building DAGs. . .• Generating machine code from the DAG yields only 12 instructions.Code from DAGLOAD a, r0 ; aLOAD b, r1 ; bADD r0, r1, r2; a + bLOAD c, r0 ; cMUL r0, r2, r3 ; (a + b) ∗ cLOAD e, r4 ; eADD r4, r2, r1 ; (a + b) + eLOAD f, r0 ; fADD r0, r4, r0 ; f + eMUL r1, r0, r0ADD r0, r3, r0MUL r0, r3, r0623Tuple Codes24 Three-Address Code• Another common representation is three-address code. It is akin to assembly code, but uses aninfinite number of temporaries (registers) to store the results of operations.• There are three common realizations of three-address code: quadruples, triples and indirect triples.Types of 3-Addr Statements:x := y op z Binary arithmetic or logical operation. Example: Mul, And.x := op yUnary arithmetic, conversion, or logical operation. Example: Abs, UnaryMinus, Float.x := yCopy statement.25 Three-Address Code. . .goto LUnconditional jump.if x relop y goto LConditional jump. relop is one of <,>,<=, etc. If x relop


View Full Document

UA CSC 453 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?