Columbia COMS W4115 - Generating Code and Running Programs - D837168

Home> Schools> Columbia University> (COMS) > COMS W4115> Generating Code and Running Programs

Columbia COMS W4115 - Generating Code and Running Programs

Course Coms W4115- Programming Languages and Translators

Pages 50

Download Save

Unformatted text preview:

Generating Code and Running Programs Stephen A Edwards Columbia University Fall 2008 Part I The Compilation Process A Long K s Journey into Byte Compiler front end Compiler back end Assembler Linker Loader Apologies to O Neill Source code Parser Semantic Analysis AST Intermediate code generation IR Optimization Assembly Code Assemble Relocatable Object Code Link Executable Reloate In memory image Compiler Frontends and Backends The front end focuses on analysis Lexical analysis Parsing Static semantic checking AST generation The back end focuses on synthesis Translation of the AST into intermediate code Optimization Generation of assembly code Portable Compilers Building a compiler a large undertaking most try to leverage it by making it portable Instead of C C FORTRAN Objective C MIPS SPARC x86 Alpha Ada 95 68k Pascal PPC Portable Compilers Use a common intermediate representation C MIPS C FORTRAN SPARC IR Objective C x86 Alpha Ada 95 68k Pascal PPC z Language specific Frontends z Processor specific Backends Part II Intermediate Representations Formats Stack Based IR Java Bytecode javap c Gcd int gcd int a int b while a b if a b a b else b a return a Method int gcd int int 0 goto 19 3 iload 1 Push a 4 iload 2 Push b 5 if icmple 15 if a b goto 15 8 9 10 11 12 iload 1 iload 2 isub istore 1 goto 19 Push a Push b a b Store new a 15 16 17 18 iload 2 iload 1 isub istore 2 Push b Push a b a Store new b 19 iload 1 20 iload 2 21 if icmpne 3 Push a Push b if a b goto 3 24 iload 1 25 ireturn Push a Return a Stack Based IRs Advantages Trivial translation of expressions Trivial interpreters No problems with exhausting registers Often compact Disadvantages Semantic gap between stack operations and modern register machines Hard to see what communicates with what Difficult representation for optimization Register Based IR Mach SUIF int gcd int a int b while a b if a b a b else b a return a gcd gcd gcdTmp0 sne vr1 s32 gcd a gcd b seq vr0 s32 vr1 s32 0 btrue vr0 s32 gcd gcdTmp1 if a b goto Tmp1 sl vr3 s32 gcd b gcd a seq vr2 s32 vr3 s32 0 btrue vr2 s32 gcd gcdTmp4 if a b goto Tmp4 mrk 2 4 Line number 4 sub vr4 s32 gcd a gcd b mov gcd gcdTmp2 vr4 s32 mov gcd a gcd gcdTmp2 a a b jmp gcd gcdTmp5 gcd gcdTmp4 mrk 2 6 sub vr5 s32 gcd b gcd a mov gcd gcdTmp3 vr5 s32 mov gcd b gcd gcdTmp3 b b a gcd gcdTmp5 jmp gcd gcdTmp0 gcd gcdTmp1 mrk 2 8 ret gcd a Return a Register Based IRs Most common type of IR Advantages Better representation for register machines Dataflow is usually clear Disadvantages Slightly harder to synthesize from code Less compact More complicated to interpret Part III Introduction to Optimization Optimization In Action int gcd int a int b while a b if a b b a else a b return a GCC on SPARC GCC O7 on SPARC gcd gcd LL2 LL4 LL5 LL3 save sp 112 sp st i0 fp 68 st i1 fp 72 ld fp 68 i1 ld fp 72 i0 cmp i1 i0 bne LL4 nop b LL3 nop ld fp 68 i1 ld fp 72 i0 cmp i1 i0 bge LL5 nop ld fp 72 i0 ld fp 68 i1 sub i0 i1 i0 st i0 fp 72 b LL2 nop ld fp 68 i0 ld fp 72 i1 sub i0 i1 i0 st i0 fp 68 b LL2 nop ld fp 68 i0 ret restore cmp be nop LL9 bge a sub sub LL2 cmp bne nop LL8 retl nop o0 o1 LL8 LL2 o0 o1 o0 o1 o0 o1 o0 o1 LL9 Typical Optimizations Folding constant expressions 1 3 4 Removing dead code if 0 nothing Moving variables from memory to registers ld sub st fp 68 i1 i0 i1 i0 sub i0 fp 72 o1 o0 o1 Removing unnecessary data movement Filling branch delay slots Pipelined RISC processors Common subexpression elimination Machine Dependent vs Independent Optimization No matter what the machine is folding constants and eliminating dead code is always a good idea a c 5 3 if 0 3 b c 8 b a c 8 However many optimizations are processor specific Register allocation depends on how many registers the machine has Not all processors have branch delay slots to fill Each processor s pipeline is a little different Basic Blocks A sne t a b bz E t int gcd int a int b while a b if a b b a else a b return a lower A sne bz slt bnz sub jmp B sub C jmp E ret t E t B b C a A a a b t a b t b a a b slt t a b bnz B t split sub b b a jmp C B sub a a b C jmp A E ret a The statements in a basic block all run if the first one does Starts with a statement following a conditional branch or is a branch target Usually ends with a control transfer statement Control Flow Graphs A CFG illustrates the flow of control among basic blocks A sne t a b bz E t slt t a b bnz B t A sne t a b bz E t slt t a b bnz B t sub b b a jmp C sub b b a jmp C B sub a a b B sub a a b C jmp A E ret a E ret a C jmp A Part IV Assembly Code and Assemblers Assembly Code Most compilers produce assembly code easy to debug gcd on the SPARC Comment gcd Operand a register Opcode o0 o1 cmp be LL8 nop LL9 Label ble a LL2 Conditional branch to a label sub sub LL2 cmp bne nop LL8 retl nop o1 o0 o1 o0 o1 o0 o0 o1 LL9 No operation Role of an Assembler Translate opcodes operand into byte codes Instruction code Address gcd cmp o0 o1 0000 80A20009 0004 02800008 be LL8 0008 01000000 nop LL9 000c 24800003 ble a LL2 0010 92224008 sub o1 o0 o1 0014 90220009 sub o0 o1 o0 LL2 0018 80A20009 cmp o0 o1 001c 12BFFFFC bne LL9 0020 01000000 nop LL8 0024 81C3E008 retl 0028 01000000 nop Encoding Example sub o1 o0 o1 Encoding of SUB on the SPARC 10 31 rd 29 000100 24 rs1 18 0 13 reserved 12 rd o1 01001 rs1 o1 01001 rs2 o0 00100 10 01001 000100 01001 0 00000000 01000 1001 0010 0010 0010 0100 0000 0000 1000 0x92228004 rs2 4 Role of an Assembler Transforming symbolic addresses to concrete ones Example Calculating PC relative branch offsets LL2 is 3 words away 000c 24800003 0010 92224008 0014 90220009 ble a LL2 sub o1 o0 o1 sub o0 o1 o0 LL2 0018 80A20009 cmp o0 o1 Role of an Assembler Most assemblers are two …

View Full Document

Columbia COMS W4115 - Generating Code and Running Programs

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

Columbia COMS W4115 - Generating Code and Running Programs

Sign up for free to view:

Please select your school