Code Shape I Expressions & ArraysCode Shape • What if x is 2 and z is 3? • What if y+z is evaluated earlier? The “best” shape for x+y+z depends on contextual knowledge → There may be several conflicting options x + y + z x + y → t1 t1+ z → t2 x + z → t1 t1+ y → t2 y + z → t1 t1+ z → t2 + z y x + + z y x + + y z x + + x z y Addition is commutative & associative for integersCode Shape Another example -- the case statement • Implement it as cascaded if-then-else statements → Cost depends on where your case actually occurs → O(number of cases) • Implement it as a binary search → Need a dense set of conditions to search → Uniform (log n) cost • Implement it as a jump table → Lookup address in a table & jump to it → Uniform (constant) cost Compiler must choose best implementation strategy No amount of massaging or transforming will convert one into anotherGenerating Code for Expressions The key code quality issue is holding values in registers • When can a value be safely allocated to a register? → When only 1 name can reference its value → Pointers, parameters, aggregates & arrays all cause trouble • When should a value be allocated to a register? → When it is both safe & profitable Encoding this knowledge into the IR • Use code shape to make it known to every later phase • Assign a virtual register to anything that can go into one • Load or store the others at each reference Relies on a strong register allocatorGenerating Code for Expressions The concept • Use a simple treewalk evaluator • Bury complexity in routines it calls > base(), offset(), & val() • Implements expected behavior > Visits & evaluates children > Emits code for the op itself > Returns register with result • Works for simple expressions • Easily extended to other operators • Does not handle control flow expr(node) { int result, t1, t2; switch (type(node)) { case ×,÷,+,- : t1← expr(left child(node)); t2← expr(right child(node)); result ← NextRegister(); emit (op(node), t1, t2, result); break; case IDENTIFIER: t1← base(node); t2← offset(node); result ← NextRegister(); emit (loadAO, t1, t2, result); break; case NUMBER: result ← NextRegister(); emit (loadI, val(node), none, result); break; } return result; }Generating Code for Expressions Example: Produces: + x y expr(“x”) → loadI @x ⇒ r1 loadAO r0, r1 ⇒ r2 expr(“y”) → loadI @y ⇒ r3 loadAO r0, r3 ⇒ r4 expr(“+”) → NextRegister() → r5 emit(add,r2,r4,r5) → add r2, r4 ⇒ r5 expr(node) { int result, t1, t2; switch (type(node)) { case ×,÷,+,- : t1← expr(left child(node)); t2← expr(right child(node)); result ← NextRegister(); emit (op(node), t1, t2, result); break; case IDENTIFIER: t1← base(node); t2← offset(node); result ← NextRegister(); emit (loadAO, t1, t2, result); break; case NUMBER: result ← NextRegister(); emit (loadI, val(node), none, result); break; } return result; }Generating Code for Expressions Example: Generates: - × x y 2 loadI @x ⇒ r1loadAO r0, r1 ⇒ r2loadI 2⇒ r3loadI @y ⇒ r4loadAO r0,r4 ⇒ r5mult r3, r5 ⇒ r6sub r2, r6 ⇒ r7expr(node) { int result, t1, t2; switch (type(node)) { case ×,÷,+,- : t1← expr(left child(node)); t2← expr(right child(node)); result ← NextRegister(); emit (op(node), t1, t2, result); break; case IDENTIFIER: t1← base(node); t2← offset(node); result ← NextRegister(); emit (loadAO, t1, t2, result); break; case NUMBER: result ← NextRegister(); emit (loadI, val(node), none, result); break; } return result; }Extending the Simple Treewalk Algorithm More complex cases for IDENTIFIER • What about values in registers? → Modify the !IDENTIFIER case → Already in a register ⇒ return the register name → Not in a register ⇒ load it as before, but record the fact → Choose names to avoid creating false dependences • What about parameter values? → Many linkages pass the first several values in registers → Call-by-value ⇒ just a local variable with “funny” offset → Call-by-reference ⇒ needs an extra indirection • What about function calls in expressions? → Generate the calling sequence & load the return value → Severely limits compiler’s ability to reorder operationsExtending the Simple Treewalk Algorithm Adding other operators • Evaluate the operands, then perform the operation • Complex operations may turn into library calls • Handle assignment as an operator Mixed-type expressions • Insert conversions as needed from conversion table • Most languages have symmetric & rational conversion tables +IntegerRealDoubleComplexIntegerIntegerRealDoubleComplexRealRealRealDoubleComplexDoubleDoubleDoubleDoubleComplexComplexComplexComplexComplexComplexTypical Addition TableHandling Assignment (just another operator) lhs ← rhs Strategy • Evaluate rhs to a value (an rvalue) • Evaluate lhs to a location (an lvalue) → lvalue is a register ⇒ move rhs → lvalue is an address ⇒ store rhs • If rvalue & lvalue have different types → Evaluate rvalue to its “natural” type → Convert that value to the type of *lvalue Unambiguous scalars go into registers Ambiguous scalars or aggregates go into memory Let hardware sort out the addresses !Handling Assignment What if the compiler cannot determine the rhs’s type ? • This is a property of the language & the specific program • If type-safety is desired, compiler must insert a run-time check • Add a tag field to the data items to hold type information Code for assignment becomes more complex evaluate rhs!if type(lhs) ≠ rhs.tag! then !
View Full Document