The Compilation ProcessIntermediate Representations/FormatsIntroduction to OptimizationAssembly Code and AssemblersOptimization: Register AllocationSeparate Compilation and LinkingShared Libraries and Dynamic LinkingGene rating Code and Running ProgramsStephen A. EdwardsColumbia UniversityFall 2008Part IThe Compilation ProcessA Long K’s Journey into Byte†Compiler front endSource code↓Parser/Semantic AnalysisASTCompiler back end↓Intermediate code generationIR↓OptimizationAssembly CodeAssembler½↓AssembleRelocatable Object CodeLinker½↓LinkExecutableLoader½↓ReloateIn-memory image†Apologies to O’NeillCompiler Frontends and BackendsThe front end focuses on analysis:ÏLexical analysisÏParsingÏStatic semantic checkingÏAST generationThe back end focuses on synthesis:ÏTranslation of the AST into intermediate codeÏOptimizationÏGeneration of assembly codePortable CompilersBuilding a compiler a large undertaking; most try to leverage it bymaking it portable.Instead ofC MIPSC++ SPARCFORTRAN x86Objective C AlphaAda-95 68kPascal PPCPortable CompilersUse a common intermediate representation.C MIPSC++ SPARCFORTRAN IR x86Objective C AlphaAda-95 68kPascal PPC| {z }Language-specificFrontends|{z }Processor-specificBackendsPart IIInterm ediate Representations/FormatsStack-Based IR: Java Bytecodeint gcd(int a, int b) {while (a != b) {if (a > b)a -= b;elseb -= a;}return a;}# javap -c GcdMethod int gcd(int, int)0 goto 193 iload_1// Push a4 iload_2 // Push b5 if_icmple 15 // if a <= b goto 158 iload_1 // Push a9 iload_2 // Push b10 isub // a - b11 istore_1 // Store new a12 goto 1915 iload_2// Push b16 iload_1 // Push a17 isub // b - a18 istore_2 // Store new b19 iload_1 // Push a20 iload_2 // Push b21 if_icmpne 3 // if a != b goto 324 iload_1 // Push a25 ireturn // Return aStack-Based IRsAdvantages:ÏTrivial translation of expressionsÏTrivial interpretersÏNo problems with exhausting registersÏOften compactDisadvantages:ÏSemantic gap between stack operations and modern registermachinesÏHard to see what communicates with whatÏDifficult representation for optimizationRegister-Based IR: Mach SUIFint gcd(int a, int b) {while (a != b) {if (a > b)a -= b;elseb -= a;}return a;}gcd:gcd._gcdTmp0:sne $vr1.s32 <- gcd.a,gcd.bseq $vr0.s32 <- $vr1.s32,0btrue $vr0.s32,gcd._gcdTmp1// if !(a != b) goto Tmp1sl $vr3.s32 <- gcd.b,gcd.aseq $vr2.s32 <- $vr3.s32,0btrue $vr2.s32,gcd._gcdTmp4// if !(a < b) goto Tmp4mrk 2, 4 // Line number 4sub $vr4.s32 <- gcd.a,gcd.bmov gcd._gcdTmp2 <- $vr4.s32mov gcd.a <- gcd._gcdTmp2// a = a - bjmp gcd._gcdTmp5gcd._gcdTmp4:mrk 2, 6sub $vr5.s32 <- gcd.b,gcd.amov gcd._gcdTmp3 <- $vr5.s32mov gcd.b <- gcd._gcdTmp3// b = b - agcd._gcdTmp5:jmp gcd._gcdTmp0gcd._gcdTmp1:mrk 2, 8ret gcd.a// Return aRegister-Based IRsMost common type of IRAdvantages:ÏBetter representation for register machinesÏDataflow is usually clearDisadvan tages:ÏSlightly harder to synthesize from codeÏLess compactÏMore complicated to interpretPart IIIIntroduction to OptimizationOptimization In Actionint gcd(int a, int b) {while (a != b) {if (a < b) b -= a;else a -= b;}return a;}GCC on SPARCgcd: save %sp, -112, %spst %i0, [%fp+68]st %i1, [%fp+72].LL2: ld [%fp+68], %i1ld [%fp+72], %i0cmp %i1, %i0bne .LL4nopb .LL3nop.LL4: ld [%fp+68], %i1ld [%fp+72], %i0cmp %i1, %i0bge .LL5nopld [%fp+72], %i0ld [%fp+68], %i1sub %i0, %i1, %i0st %i0, [%fp+72]b .LL2nop.LL5: ld [%fp+68], %i0ld [%fp+72], %i1sub %i0, %i1, %i0st %i0, [%fp+68]b .LL2nop.LL3: ld [%fp+68], %i0retrestoreGCC -O7 on SPARCgcd: cmp %o0, %o1be .LL8nop.LL9: bge,a .LL2sub %o0, %o1, %o0sub %o1, %o0, %o1.LL2: cmp %o0, %o1bne .LL9nop.LL8: retlnopTypical OptimizationsÏFolding constant expressions1+3 → 4ÏRemoving dead codeif (0) { . . . } → nothingÏMoving variables from memory to registersld [%fp+68], %i1sub %i0, %i1, %i0st %i0, [%fp+72]→sub %o1, %o0, %o1ÏRemoving unnecessary data movementÏFilling branch delay slots (Pipelined RISC processors)ÏCommon subexpression elimination;Machine-Dependent vs. -Independent OptimizationNo matter what the machine is, folding constants and eliminatingdead code is always a good idea.a = c + 5 + 3;if (0 + 3) {b = c + 8;}→b = a = c + 8;However, many optimizations are processor-specific:ÏRegister allocation depends on how many registers themachine h asÏNot all processors have branch delay slots to fillÏEach processor’s pipeline is a little differentBasic Blocksint gcd(int a, int b) {while (a != b) {if (a < b) b -= a;else a -= b;}return a;}lower→A: sne t, a, bbz E, tslt t, a, bbnz B, tsub b, b, ajmp CB: sub a, a, bC: jmp AE: ret asplit→A: sne t, a, bbz E, tslt t, a, bbnz B, tsub b, b, ajmp CB: sub a, a, bC: jmp AE: ret aThe statements in a basic block all run if the first one does.Starts with a statement following a conditional branch or is abranch target.Usually ends with a control-transfer statement.Control-Flow GraphsA CFG illustrates the flow of control among basic blocks.A: sne t, a, bbz E, tslt t, a, bbnz B, tsub b, b, ajmp CB: sub a, a, bC: jmp AE: ret aA: sne t, a, bbz E, tslt t, a, bbnz B, tsub b, b, ajmp CB: sub a, a, bE: ret a C: jmp APart IVAssembly Code and AssemblersAssembly CodeMost compilers produce assembly code: easy to debug.! gcd on the SPARCCommentgcd:cmpOpcode%o0, %o1Operand (a register)be .LL8nop.LL9:Labelble,a .LL2Conditional branch to a labelsub %o1, %o0, %o1sub %o0, %o1, %o0.LL2:cmp %o0, %o1bne .LL9nop.LL8:retlnopNo operationRole of an AssemblerTranslate opcodes + operand into byte codesgcd:0000Address80A20009Instruction codecmp %o0, %o10004 02800008 be .LL80008 01000000 nop.LL9:000c 24800003 ble,a .LL20010 92224008 sub %o1, %o0, %o10014 90220009 sub %o0, %o1, %o0.LL2:0018 80A20009 cmp %o0, %o1001c 12BFFFFC bne .LL90020 01000000 nop.LL8:0024 81C3E008 retl0028 01000000 nopEncoding Examplesub %o1, %o0, %o1Encoding of “SUB” on the SPARC:10 rd 000100 rs1 0 reserved rs231 29 24 18 13 12 4rd = %o1 = 01001rs1 = %o1 = 01001rs2 = %o0 = 0010010 01001 000100 01001 0 00000000 010001001 0010 0010 0010 0100 0000 0000 1000= 0x92228004Role of an AssemblerTransforming symbolic addresses to concrete ones.Example: Calculating PC-relative branch offsets.000c 24800003LL2 is 3 words awayble,a .LL20010 92224008 sub %o1, %o0, %o10014 90220009 sub %o0, %o1, %o0.LL2:0018 80A20009 cmp %o0, %o1Role of an AssemblerMost assemblers a re “two-pass” because they can’t calculateeverything in a single pass through the code..LL9:000c 24800003 ble,a .LL2Don’t know
View Full Document