Local Optimizations15-745 Optimizing CompilersSpring 2006Peter LeeRemindersTask 0 due on ThursdayFind a partnerRead up onintermediate representations (4.1-4.4)local optimizations (12.1, 12.3-4)Extent of optimizationsLocal optimizations work entirely on individual basic blocks, in isolationGlobal optimizations consider whole procedure bodiesInterprocedural optimizations work across procedure call boundariesWhole program optimizations, ...Constant FoldingConstant foldingEvaluation of integer and boolean expressions whose operands are constantsin some languages, must not remove division by zero or integer overflowSome compilers will also treat floatsEasily applied to linearized code, in conjunction with constant propagation, copy propagation, and value numberingAlgebraic SimplificationsInteger/boolean simplificationsi+0 = 0+i = i-0 = i0-i = -ii*1 = 1*i = i/1 = ii*0 = 0*i = 0-(-i) = ii+(-j) = i-jb or true = true or b = trueb or false = false or b = bb and true = true and b = bb and false = false and b = falseAddress Arithmetic SimplificationsAssociativityGenerally speaking, re-associating expressions is not safe with respect to overflowsE.g., (i-j) + (i-j) = 2*i - 2*jBut address arithmetic is assumed to be safe with respect to overflow, and so re-association can be appliedFurthermore, the compiler can control the form of address arithmetic that appearsAddress arithmetic exampleThe classic example (in Pascal):var a : array[lo1..hi1, lo2..hi2] of eltype;i, j : integer; ...do j = lo2 to hi2 begin a[i,j] := b + a[i,j]endAddress arithmeticCompiler generates the following address computation for a[i,j]:base_a + ((i-lo1) * (hi2-lo2+1) + j-lo2) * wFirst, re-associate to get:-(lo1 * (hi2-lo2+1) - lo2) * w + base_a +(hi2-lo2+1) * i * w +j * wall constant!loop invariant!w is a power of 2 (hence shift)Simplifying address arithmeticBecause the compiler uses a specific idiom for most address computations, it is usually possible to find a set of simplification rules (and an order for applying them) that will successfully simply themUsually expressed as IR tree transform rulestry each rule, in orderwhen one applies, apply it, and then start overFrom Muchnick1.c1+c2 ! sum(c1,c2)2.t+c ! c+t3.c1*c2 ! prod(c1,c2)4.t*c ! c*t5.c1-c2 ! diff(c1,c2)6.t-c ! -c + t7.t1+(t2+t3) ! (t1+t2)+t38.t1*(t2*t3) ! (t1*t2)*t39.(c1+t)+c2 ! sum(c1,c2)+t10.(c1*t)*c2 ! prod(c1,c2)*t11. (c1+t)*c2 ! prod(c1,c2)+(c2*t)12. c1*(c2+t) ! prod(c1,c2)+(c1*t)13. (t1+t2)*c ! (c*t1)+(c*t2)14. c* (t1+t2) ! (c*t1)+(c*t2)15. (t1-t2)*c ! (c*t1)-(c*t2)16. c* (t1-t2) ! (c*t1)-(c*t2)17. (t1+t2)*t3 ! (t1*t3)+(t2*t3)18. t1*(t2+t3) ! (t1*t2)+(t1*t3)19. (t1-t2)*t3 ! (t1*t3)-(t2*t3)20. t1*(t2-t3) ! (t1*t2)-(t1*t3)- sum and prod are the standard ops on ints- t, t1, t2, and t3 are arbitrary sub-trees- rules are attempted in orderValue NumberingValue numberingA classical local optimizationJohn Cocke and Jack Schwartz, “Programming Languages and their Compilers”, unpublished manuscript, 1970A very simple idea (reinvented many times):each value gets its own “number”, which is essentially a hash valuearrange for common subexpressions to evaluate to the same numberTerminology alert!Classical compiler optimization literature uses the words value and variable almost interchangeablyThis is because in the IR, we generally arrange for all values of interest to be assigned to a variable (i.e., a temp)So, one could say “variable numbering” instead of “value numbering”The hashGiven a statement of the formt = a bwe compute a value number for t by hashing on a, , and b.The value numbers of a and b are used to compute the value number of t if is commutative, the same hash should be returned for either argument orderExamplea = x + yb = x + yif !z goto L1x = !zc = x * yif x * y goto L2Hash on (+,x,y)No hit, so store a=x+y in tableHash on (+,x,y)Hit, so replace x+y with lhs of hash table entry a = x + yb = aif !z goto L1x = !zc = x * yif x * y goto L2Examplea = x + yb = aif !z goto L1x = !zc = x * yif x * y goto L2a = x + yb = at1 = !zif t1 goto L1x = !zc = x * yt2 = x * yif t2 goto L2Ensure all values are named...a = x + yb = at1 = !zif t1 goto L1x = t1c = x * yt2 = x * yif t2 goto L2a = x + yb = at1 = !zif t1 goto L1x = t1c = x * yt2 = cif t2 goto L2eliminate !zelim. x*yAnother examplei = readj = i + 1k = il = k + 1For this copy, k gets i’s value number......so the value numbers for j and l are also the same!i = readj = i + 1k = il = jWe’ll see later that common subexpression elim. fails here.CSE and value numbering are different!Another examplei = readj = i + 1k = ik = k * 2l = k + 1For this copy, k gets i’s value number......so the value numbers for j and l are different, and hence no optimization...but here k gets a new value number...No aliasing!Even for simple local optimizations we must be careful about aliasingTemps are “safe” in that they are non-addressable; unlike values in the heapexam1 (int *q) { int a, k; … k = a + 5; f(a, &k); *q = 13; m = a + 5; …}In this C program, this “a + 5” is redundant only if the previous two statements leave a and k unmodifiedValue numbering is local...j = i + 1k = i......l = k + 1......j = i + 1...Simple value numbering is valid only a a local optimizationWe’ll be able to extend it to global after applying the SSA transformation (later in
View Full Document