✬✫✩✪Scalar Optimization1✬✫✩✪Organization1. What kind of optimizations are useful?2. Program analysis for determining opportunities for optimization:dataflow analysis:• lattice algebra• solving equations on lattices• applications to dataflow analysis3. Speeding up dataflow analysis:• exploitation of structure• sparse representations: control dependence, SSA form, sparsedataflow evaluator graphs2✬✫✩✪Optimizations performed by most compilers• Constant propagation: replace constant-valued variables withconstants• Common sub-expression elimination: avoid re-computing valueif value has been computed earlier in program• Loop invariant removal: move computations into less frequentlyexecuted portions of program• Strength reduction: replace expensive operations (likemultiplication) with simpler operations (like addition)• Dead code removal: eliminate unreachable code and code thatis irrelevant to output of program3✬✫✩✪Optimization example:DO J = 1,100DO I = 1,100A[I,J] = 1If we assume column-major order of storage, and 4 bytes per arrayelement,Address of A[I,J] = BaseAddress(A) + (J-1)*100*4 + (I-1)*4= BaseAddress(A) + J*400 + I*4 - 404• Only the term I ∗ 4 depends on I => rest of computation isinvariant in the inner loop and can be hoisted out of it.• Further hoisting of invariant code is possible since only the subtermJ ∗ 400 depends on J .• Since I and J are incremented by 1 each time through the loop,expressions like I ∗ 4 and J ∗ 400 can be strength reduced byreplacing them with additions.4✬✫✩✪Pseudo-code for original loop nest:DO J = 1,100DO I = 1,100t = BaseAddress(A) + J*400 + I*4 - 404STORE(t,1)Pseudo-code for optimized loop nest:t0 = BaseAddress(A) - 404DO J = 1,100t0 = t0 + 400t1 = t0DO I = 1,100t1=t1+4STORE(t,1)5✬✫✩✪Terminology• A definition of a variable is a statement that may assign tothat variable. Definitions of x:(i) x = 3(ii) ... F(x,y).. (call by reference)In second example, invocation of F may write to x,so to be safe, we declare invocation to be adefinition of x.• A use of a variable is a statement that may read thevalue of that variable. Uses of x:(i)y=x+3(ii) ...F(x,z)...6✬✫✩✪Aliasing: occurs in a program when two or more names refer to thesame storage location.Examples:(i)procedure f(vax:x,y).......f(z,z) ... f(a,b)...Within f, reference parameters x and y may be aliases!(ii)..x:=3;y := @x;*y := 5;....x...x and *y are aliases for the same location!7✬✫✩✪Our position:We will not perform analysis for variables that may be aliased such(reference parameters, local variables whose addresses have beentaken, etc.).This implies we can determine syntactically where all definitionsand uses of a variable are.More refined approach: perform alias analysis.8✬✫✩✪For the next few slides, we will focus on constant propagation toillustrate the general approach to dataflow analysis.Examples:(i)... ...x := 1; x:= 1;y := x + 2; y := 3;if x > z then y:= 5; fi; => if 1 > z then y:= 5; fi;...y... ...y...Constant propagation may simplify control flow as well:(ii)... ...x := 1; x:= 1;y := x + 2; y := 3; <-- dead codeif y > x then y:= 5; fi; => if true then y := 5; <-- simplify ...y... ...5...9✬✫✩✪Why do opportunities for constant propagation arise in programs?• constant declarations for modularity• macros• procedure inlining: small methods in OO languages• machine-specific values10✬✫✩✪Overview of algorithm:1. Build the control flow graph of program.makes flow of control in program explicit2. Perform “symbolic evaluation” to determine constants.3. Replace constant-valued variable uses by their values andsimplify expressions and control flow.11✬✫✩✪Step1: build the control flow graph (CFG).Example:...x:=1;y := x+2;if (y > x) then y := 5; fi;...y...12✬✫✩✪111313551x := 1STARTy:= x + 2y > xy := 5merge...y....control flow graph (CFG)state vectorson CFG edges13✬✫✩✪Algorithm for building CFG: easy, only complication being break’sand GOTO’s (need to identify jump targets)Basic Block: straight-line code without any branches or merging ofcontrol flowNodes of CFG: statements(or basic blocks)/switches/mergesEdges of CFG: represent possible control flow sequence14✬✫✩✪Symbolic evaluation of CFG for constant propagationPropagate values from following lattice:false true .... -1 0 1 ... definitely not a constantmay or may not be constantjoin(T,0) = Tjoin(0,-1) = Tmeet(0,-1) = meet(T,1) = 1Two operators:• join(a,b): lowest value above both a and b (also written asa ∪ b)• meet(a,b): highest value below both a and b (also written asa ∩ b)Symbolic interpretation of expressions: EVAL(e, Vin):ifanyargument of e is (or ⊥) in Vin, return (or ⊥ respectively);otherwise, evaluate e normally and return that value15✬✫✩✪1. Associate one state vector with each edge of the CFG,initializing all entries to ⊥. Initialize work-list to empty.2. Set each entry of state vector on edge out of START to , andplace this edge on the worklist.16✬✫✩✪3. while worklist is not empty doGet edge from worklist;Let state vector on edge be Vin;//Symbolically evaluate target node of the edge,//using the state vectors on its inputs,//and propagate result state vector to output edge of node;if (target node is assignment statement x:= e)Propagate Vin[EVAL(e,Vin)/x] to output edge;else if (target node is switch(p)){if EVAL(p,Vin) is T, Propagate Vin to all outputs of switch;else if EVAL(p,Vin) is true, Propagate Vin to true side of switch;else Propagate Vin to false side of switch;}else //target node is mergePropagate join of state vectors on all inputs to output;If this changes the output state vector, enqueue output edgeon worklist;od;17✬✫✩✪Running example:111313551x := 1STARTy:= x + 2y > xy := 5merge...y....control flow graph (CFG)state vectorson CFG edges18✬✫✩✪Algorithm can quite subtle:x := 1START...........px := x +1merge..x....First time through loop, use of x in loop is determined to beconstant 1. Next time through loop, it reaches final value .19✬✫✩✪Complexity of algorithm:Height of lattice = 2 => each state vector can change value 2 ∗ Vtimes.So while loop in algorithm is executed at most 2 ∗ E ∗ V times.Cost of each iteration: O(V ).So overall algorithm takes O(EV2) time.20✬✫✩✪Questions:• Can we use same work-list based algorithm with differentlattices to
View Full Document