PRE and Loop Invariant Code Motion
15-745 Spring 2007

Common Subexpression Elimination
Find computations that are always performed at least twice on an execution path and eliminate all but the first.
Usually limited to algebraic expressions
• put them in some canonical form
Almost always improves performance
• except when?

CSE Limitation
CSE searches only for "totally" redundant expressions.
• An expression is totally redundant if it is recomputed along all paths leading to the redundant expression.
• An expression is partially redundant if it is recomputed along some, but not all, paths.
[figure: two small flowgraphs computing x+y, one where the recomputation is fully redundant and one where it is only partially redundant]

Loop-Invariant Code Motion
Moves computations that produce the same value on every iteration of a loop outside of the loop.
When is a statement loop invariant?
• when all its operands are loop invariant...

Loop Invariance
An operand is loop-invariant if
1. it is a constant,
2. all of its defs (use the ud-chain) are located outside the loop, or
3. it has a single def (ud-chain again) which is inside the loop and that def is itself loop invariant.
We can use an iterative algorithm to compute the loop-invariant statements (see the sketch below).

Aside: The loop pre-header
Several optimizations require us to move statements "before the header of the loop".
To do this we create a new block, called the pre-header, which has the loop header as its successor.
All edges which formerly entered the header of L from outside of L instead enter the pre-header.
[figure: loop L before and after inserting a pre-header block in front of the header]

Loop Invariant Code Motion
Naïve approach: move all loop-invariant statements to the pre-header.
This is not always valid for statements which define variables.
If statement s defines v, we can only move s if
• s dominates all uses of v in the loop
• s dominates all loop exits
Why?

Loop Invariant Code Motion
Loop-invariant expressions are a form of partially redundant expressions. Why?
[figure: loop example with the statements x ← y * z and a ← b * c]
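The three loop-invariance conditions above suggest a simple iterative marking pass over the loop body. Here is a minimal Python sketch of that idea (an illustration, not the lecture's code); the statement tuples, the defs_of ud-chain map, integer-encoded constants, and negative indices for out-of-loop definitions are all assumed encodings:

def loop_invariant_statements(loop_stmts, defs_of):
    # loop_stmts: list of (dest, operands) pairs for the statements inside the loop.
    # defs_of:    maps (stmt_index, operand_name) -> set of defining statement indices
    #             (the ud-chain); negative indices mean "defined outside the loop".
    invariant = set()
    changed = True
    while changed:
        changed = False
        for i, (_dest, operands) in enumerate(loop_stmts):
            if i in invariant:
                continue

            def operand_ok(v):
                if isinstance(v, int):                # condition 1: a constant
                    return True
                defs = defs_of[(i, v)]
                if all(d < 0 for d in defs):          # condition 2: all defs outside the loop
                    return True
                # condition 3: a single in-loop def that is itself invariant
                return len(defs) == 1 and next(iter(defs)) in invariant

            if all(operand_ok(v) for v in operands):
                invariant.add(i)                      # statement is loop invariant
                changed = True
    return invariant

A marked statement may then be hoisted to the pre-header only if it also satisfies the dominance conditions above.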
Partial Redundancy Elimination
Moves computations that are at least partially redundant to their optimal computation points, and eliminates totally redundant ones.
Encompasses both CSE and loop-invariant code motion.
[figure: a := x + y computed on two converging paths]

Optimal Computation Point
Optimal?
• the result is used and never recalculated
• the expression is placed as late as possible
Why?
[figure: candidate placements of a := x + y]

PRE Example
[figure: example flowgraph with blocks entry, B1 (z = a + 1; test x > 3), B2 (a = x * y; test y < 5), B3 (test z < 7), B4 (b = x * y), B5, B6, B7 (c = x * y), exit]
What expression is partially redundant?
What are the optimal computation points?

Critical Edge Splitting
In order for PRE to work well, we must split critical edges.
A critical edge is an edge that connects a block with multiple successors to a block with multiple predecessors.
[figure: the B2-B6 portion of the example, before and after inserting the new blocks B2a and B3a that split the critical edges]

PRE History
PRE was first formulated as a bidirectional data flow analysis by Morel and Renvoise in 1979.
Knoop, Rüthing, and Steffen came up with a way to do it using several unidirectional analyses in 1992 (they called their approach lazy code motion).
• this is a much simpler method
• but it is still very complicated

PRE Example
[figure: the example flowgraph again, now including the new blocks B2a and B3a]
We must compute several data flow properties in order to find the optimal placements for partially redundant expressions.

Local Transparency (TRANSloc)
An expression's value is locally transparent in a block if there are no assignments in the block to variables within the expression
• i.e., the expression is not killed

Block   TRANSloc
entry   {a+1, x*y}
B1      {a+1, x*y}
B2      {x*y}
B2a     {a+1, x*y}
B3      {a+1, x*y}
B3a     {a+1, x*y}
B4      {a+1, x*y}
B5      {a+1, x*y}
B6      {a+1, x*y}
B7      {a+1, x*y}
exit    {a+1, x*y}

Local Anticipatability (ANTloc)
An expression's value is locally anticipatable in a block if
• there is a computation of the expression in the block, and
• the computation can be safely moved to the beginning of the block

Block   ANTloc
entry   {}
B1      {a+1}
B2      {x*y}
B2a     {}
B3      {}
B3a     {}
B4      {x*y}
B5      {}
B6      {}
B7      {x*y}
exit    {}

Global Anticipatability (ANT)
An expression's value is globally anticipatable on entry to a block if
• every path from this point includes a computation of the expression, and
• it would be valid to place a computation of the expression anywhere along these paths
This is like liveness, only for expressions.

ANTout(i) = ∩_{j ∈ succ(i)} ANTin(j)
ANTin(i)  = ANTloc(i) ∪ (TRANSloc(i) ∩ ANTout(i))
ANTout(exit) = {}

Block   ANTin    ANTout
entry   {a+1}    {a+1}
B1      {a+1}    {}
B2      {x*y}    {x*y}
B2a     {x*y}    {x*y}
B3      {}       {}
B3a     {x*y}    {x*y}
B4      {x*y}    {}
B5      {x*y}    {x*y}
B6      {}       {}
B7      {x*y}    {}
exit    {}       {}

Earliest (EARL)
An expression's value is earliest on entry to a block if
• no path from the entry to the block evaluates the expression to produce the same value as evaluating it at the block's entry would
Intuition: at this point, if we compute the expression, we are computing something completely new.
This says nothing about the usefulness of computing the expression.

EARLin(i)  = ∪_{j ∈ pred(i)} EARLout(j)
EARLout(i) = ¬TRANSloc(i) ∪ (¬ANTin(i) ∩ EARLin(i))
EARLin(entry) = U
(U is the set of all expressions; ¬X denotes the complement of X with respect to U)

Block   EARLin       EARLout
entry   {a+1, x*y}   {x*y}
B1      {x*y}        {x*y}
B2      {x*y}        {a+1}
B2a     {a+1}        {a+1}
B3      {x*y}        {x*y}
B3a     {x*y}        {}
B4      {a+1}        {a+1}
B5      {a+1}        {a+1}
B6      {x*y}        {x*y}
B7      {a+1}        {a+1}
exit    {a+1, x*y}   {a+1, x*y}

Delayedness (DELAY)
An expression is delayed on entry to a block if
• it is both anticipatable and earliest

DELAYin(i)  = (ANTin(i) ∩ EARLin(i)) ∪ ∩_{j ∈ pred(i)} DELAYout(j)
DELAYout(i) = ¬ANTloc(i) ∩ DELAYin(i)
DELAYin(entry) = ANTin(entry) ∩ EARLin(entry)

For the example, ANTin(i) ∩ EARLin(i) is non-empty only at:

Block   ANTin(i) ∩ EARLin(i)
entry   {a+1}
B2      {x*y}
B3a     {x*y}

Block   DELAYin   DELAYout
entry   {a+1}     {a+1}
B1      {a+1}     {}
B2      {x*y}     {}
B2a     {}        {}
B3      {}        {}
B3a     {x*y}     {x*y}
B4      {}        {}
B5      {}        {}
B6      {}        {}
B7      {}        {}
exit    {}        {}
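Each of these properties is an ordinary unidirectional bit-vector problem, so a simple round-robin iteration suffices. Below is a minimal Python sketch (an illustration, not the lecture's code) of solving the backward ANT system given above; the block names, the succ map, and the antloc/transloc tables passed in as dicts of sets are assumed encodings. The same loop structure, with a union over predecessors instead of an intersection over successors, handles EARL and DELAY.

def solve_ant(blocks, succ, antloc, transloc, exit_block, universe):
    # Backward "must" problem: start every set at the universe (the top element)
    # and shrink to the greatest fixed point.
    ant_in  = {b: set(universe) for b in blocks}
    ant_out = {b: set(universe) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == exit_block or not succ.get(b):
                new_out = set()                    # boundary: ANTout(exit) = {}
            else:
                new_out = set(universe)
                for s in succ[b]:                  # ANTout(i) = ∩ ANTin(j) over j in succ(i)
                    new_out &= ant_in[s]
            # ANTin(i) = ANTloc(i) ∪ (TRANSloc(i) ∩ ANTout(i))
            new_in = antloc[b] | (transloc[b] & new_out)
            if new_in != ant_in[b] or new_out != ant_out[b]:
                ant_in[b], ant_out[b] = new_in, new_out
                changed = True
    return ant_in, ant_out

Seeded with the ANTloc and TRANSloc tables above and universe = {"a+1", "x*y"}, this should reproduce the ANT table.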
Latestness (LATE)
An expression is latest on entry to a block if
• it is the optimal point for computing the expression, and
• on every path from the block's entry to the exit, any other optimal computation point occurs after a computation of the expression in the original flowgraph
i.e., there is no "later" placement for this expression.

LATEin(i) = DELAYin(i) ∩ (ANTloc(i) ∪ ¬ ∩_{j ∈ succ(i)} DELAYin(j))

Block   LATEin
entry   {}
B1      {a+1}
B2      {x*y}
B2a     {}
B3      {}
B3a     {x*y}
B4      {}
B5      {}
B6      {}
B7      {}
exit    {}

Isolatedness (ISOL)
An optimal placement in a block for the computation of an expression is isolated iff
• on every path from a successor of the block to the exit block, every original computation is preceded by the optimal placement point

ISOLout(i) = ∩_{j ∈ succ(i)} ISOLin(j)
ISOLin(i)  = LATEin(i) ∪ (¬ANTloc(i) ∩ ISOLout(i))
ISOLout(exit) = {}
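Since the LATE equation has no recurrence (it reads only DELAYin and ANTloc), it can be computed in a single pass once DELAY is known. Here is a minimal Python sketch in the same style as the earlier one (the dict-of-sets encoding is an assumption); the empty intersection over a block with no successors is taken to be the universe:

def late_in(blocks, succ, delay_in, antloc, universe):
    result = {}
    for b in blocks:
        # ∩ DELAYin(j) over successors j; the empty intersection is the universe.
        meet = set(universe)
        for s in succ.get(b, []):
            meet &= delay_in[s]
        # LATEin(i) = DELAYin(i) ∩ (ANTloc(i) ∪ ¬meet)
        result[b] = delay_in[b] & (antloc[b] | (set(universe) - meet))
    return result

On the example this yields LATEin = {a+1} at B1 and {x*y} at B2 and B3a, matching the table above.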

