UA CSC 453 - Study Notes - D466317

Home> Schools> University of Arizona> Computer Science Main (CSC) > CSC 453> Study Notes

UA CSC 453 - Study Notes

Course Csc 453- Compilers and Systems Software

Pages 13

Download Save

Unformatted text preview:

CSc 453 — Compilers and Systems Software19 : Code Generation IChristian CollbergDepartment of Computer ScienceUniversity of [email protected]° 2009 Christian CollbergOctober 27, 20091Introduction2 Compiler PhasesWe are here!ASTasmSemanticAnalyserInterm.Code GenIRMachineCode GenIRASTLexertokenssourceerrorsParsererrorsOptimizeerrors13Intermediate CodeGenerationSemantic Analysis,Intermediate CodeSeparation intoBasic Blocks; Flow analysis;Next−use infocomputation;X is definedhere:X is usedhere:Control−Flow GraphAssemblyCodeAssemblerMachineCodePeepholeOptimizationRegisterAssignmentRegisterAllocationRegisterSpillingInstructionSelectionInstructionSchedulingLexing, Parsing4 Code Generation Issues• The purpose of the code generation phase of the compiler is to transform the intermediate code pro-duced by the front end into some other code that can be executed.• Often the the code generator will produce assembly code or object code which (after assembly andlinking) can be directly executed by the hardware.• Alternatively, the code generator can generate C-code and use the native C-compiler as the “real”back-end.• Or, the code generator can generate code for a “virtual machine”, and use an interpreter to executethe code.• We expect the code generator to produce code that is as efficient as possible.5 Code Generation Issues. . .CodeTreeDAGt1 := a + bt2 := c + t1d := t2TuplesInstructionSelectionRegisterAllocationCode GenerationC−CodeGenerationC−CodeC−CompilerGet NextInstructionExecuteInstructionInterpretationIntermediate CodeAbsoluteMachine CodeAssembly CodeRelocatable26 Code Generation Issues. . .• The input to the code generator can be any one of the intermediate representations we’ve discussed:Trees, Tuples, Graphs,. . . The work of the code generator consists of several (interdependent) tasks:Instruction– selection: Which instructions should be generated?– scheduling: In which order should they be generated?Register– allocation: Which variables should be kept in registers?– assignment: In which registers should they be stored?– spilling: Which registers should be spilled when?7Architectures8 Machine Architectures—Instruction Sets3-Register: add R1, R2, R3[R1 := R2 + R3] (MIPS,VAX,· · · ).Register-Address: add R, Addr[R := R + Addr] (VAX,x86,MC68k)2-Register: add R1, R2[R1 := R1 + R2] (VAX,x86,MC68k)2-Address: add Addr1, Addr2[Addr1 := Addr1 + Addr2] (VAX)3-Address: add Addr1, Addr2, Addr3[Addr1 := Addr2 + Addr3] (VAX)9 Machine Architectures—Register ClassesGeneral One set of register that can hold any type of data (VAX, Alpha).Integer+Float Separate integer and floating point register sets (Sparc, MIPS).Integer+Float+Address Separate integer, floating point, and address register sets (MC68k).10 Machine Architectures—Addressing ModesImmediate: #XThe value of the constant X. (All architectures.)Register Direct: RThe contents of register R. (All.)Register Indirect: (R)The contents of the memory address in register R. (All.)3Register Indirect with increment: (R+) The contents of the memory address in register R. R is in-cremented by the size of the instruction (i.e. if MOVE.W (R+),Addr moves two bytes, then R would beincremented by 2). (VAX, MC68k.)Register Ind. with Displacement: d(R) The contents of the memory address R+d, where R is a registerand d a (small) constant. (All architectures.)11 Machine Architectures—Instruction Cost• The Cost of an instruction is the number of machine cycles it takes to execute it.• On RISCs, most instructions take 1 cycle to execute. Loads, stores, branches, multiplies, and dividesmay take longer.• On CISCs, the number of cycles required to execute an instruction Instr Op1, Op2is cost(Instr)+cost(Op1)+cost(Opcost(Opi) is the number of cycles required to compute the addressing mode Opi.12A Simple Example13 Example — Source• A straight-forward code generator considers one tuple at a time, without looking at other tuples. Thecode generator is simple, but the generated code is sub-optimal.The Source Program:int A[5], i, x;main(){for(i=1;i<=5;i++)x=x*A[i]+A[i];}14 Example — Intermediate Codeint A[5], i, x;main(){for(i=1;i<=5;i++) x=x*A[i]+A[i];}The Tuple Code(1) i := 1(2) T0 := i(3) IF T0<6 GOTO (5)(4) GOTO (17)(5) T1 := i(6) T2 := A[T1](7) T3 := x(8) T4 := T2*T3(9) T5 := i(10) T6 := A[T5](11) T7 := T4+T6(12) x := T7(13) T8 := i(14) T9 := T8+1(15) i := T9(16) GOTO (2)415 Example – Unoptimized MIPS Code(1) i := 1li $2,0x1 # $2 := 1sw $2,i # i := $2L2: (2) T0 := ilw $2,i # $2 := i(3) IF i < 6 GOTO (5)slt $3,$2,6 # $3 := i < 6bne $3,$0,L5 # IF $36=0 GOTO L5(4) GOTO (17)j L3 # GOTO L316L5: (5) T1 := ilw $2,i # $2 := CONT(i)(6) T2 := A[T1]move $3,$2 # $3 := $2sll $2,$3,2 # $2 := $3 * 4la $3,A # $3 := ADDR(A)addu $2,$2,$3 # $2 := $2 + $3lw $2,0($2) # $2 := CONT(A[i])(7) T3 := xlw $3,x # $3 := CONT(x);(8) T4 := T2 * T3mult $3,$2 # $lo := $3 * $2mflo $4 # $4 := $lo17(9) T5 := ilw $2,i # $2 := CONT(i)(10) T6 := A[T5]move $3,$2 # $3 := $2sll $2,$3,2 # $2 := $3 * 4la $3,A # $3 := ADDR(A)addu $2,$2,$3 # $2 := $2 + $3lw $3,0($2) # $2 := CONT(A[i])(11) T7 := T4 + T6addu $2,$4,$3 # $2 := $4 + $3(12) x := T7sw $2,x # x := $218(13) T8 := i5lw $3,i # $3 := CONT(i)(14) T9 := T8 + 1addu $2,$3,1 # $2 := $3 + 1move $3,$2 # $3 := $2(15) i := T9sw $3,i # i := $3(16) GOTO (2)j L2 # GOTO L2L3:19Common Sub-expression Elimination20 Example — After CSE• The generated code becomes a lot faster if we perform Common Sub-Expression Elimination (CSE)and keep the index variable i in a register ($6) over the entire loop:(1) i := 1li $6,0x1 # $6 := 1L2: (2) T0 := i(3) IF i < 6 GOTO (5)slt $3,$6,6 # $3 := i < 6bne $3,$0,L5 # IF $36=0 GOTO L5(4) GOTO (17)j L3 # GOTO L321• A[T1] is computed once, and the result is kept in register $5 until it’s needed the next time.L5:(5) T1 := i(6) T2 := A[T1]move $3,$6 # $3 := $6sll $2,$3,2 # $2 := $3 * 4la $3,A # $3 := ADDR(A)addu $2,$2,$3 # $2 := $2 + $3lw $5,0($2) # $5 := CONT(A[i])(7) T3 := xlw $3,x # $3 := CONT(x);(8) T4 := T2 * T3mult $3,$5 # $lo := $3 * $5mflo $4 # $4 := $lo22• After the loop we need to store $6 back into i.6(9) T5 := i(10) T6 := A[T5](11) T7 := T4 + T6addu $2,$4,$5 # $2 := $4 + $5(12) x := T7sw $2,x # x := $2(13) T8 := i(14) T9 := T8 + 1(15) i := T9addu $6,$6,1 # $6 := $6 + 1(16) GOTO (2)j L2 # GOTO L2L3:sw $6,i # i := $623More Optimization24 Example — More Register Allocation• Since x and ADDR(A) seem to be used a lot in the loop, we keep

View Full Document


School:
Email:
New Password:
Confirm Password:

UA CSC 453 - Study Notes

Sign up for free to view:

Please select your school