Carnegie Mellon
Lecture 2
Introduction to Data Flow Analysis
I. Introduction
II. Example: Reaching definition analysis
III. Example: Liveness analysis
IV. A General Framework (Theory in next lecture)
Reading: Chapter 9.2
M. Lam CS243: Intro to Data Flow

I. Compiler Organization

Flow Graph
• Basic block = a maximal sequence of consecutive instructions such that
  – flow of control only enters at the beginning
  – flow of control can only leave at the end (no halting or branching except perhaps at end of block)
• Flow graphs
  – Nodes: basic blocks
  – Edges: Bi --> Bj iff Bj can follow Bi immediately in execution

What is Data Flow Analysis?
• Data flow analysis:
  – Flow-sensitive: sensitive to the control flow in a function
  – Intraprocedural analysis
• Examples of optimizations:
  – Constant propagation
  – Common subexpression elimination
  – Dead code elimination
[Figure: flow graph containing the statements e = b + c, a = 243, e = d + 3, g = a, a = b + c, d = 7, annotated with the questions: Value of x? Which "definition" defines x? Is the definition still meaningful (live)?]

Static Program vs. Dynamic Execution
• Statically: finite program
• Dynamically: can have infinitely many possible execution paths
• Data flow analysis abstraction:
  – For each point in the program: combines information of all the instances of the same program point.
• Example of a data flow question:
  – Which definition defines the value used in statement "b = a"?

Reaching Definitions
• Every assignment is a definition
• A definition d reaches a point p if there exists a path from the point immediately following d to p such that d is not killed (overwritten) along that path.
• Problem statement
  – For each point in the program, determine if each definition in the program reaches the point
  – A bit vector per program point, vector-length = #defs
[Figure: example flow graph with blocks B0, B1, B2 containing d0: y = 3, d1: x = 10, d2: y = 11, if e, d3: x = 1, d4: y = 2, d5: z = x, d6: x = 4]

Data Flow Analysis Schema
• Build a flow graph (nodes = basic blocks, edges = control flow)
• Set up a set of equations between in[b] and out[b] for all basic blocks b
  – Effect of code in basic block: transfer function f_b relates in[b] and out[b], for the same b
  – Effect of flow of control: relates out[b1] and in[b2] if b1 and b2 are adjacent
• Find a solution to the equations

Effects of a Statement
• f_s: a transfer function of a statement
  – abstracts the execution with respect to the problem of interest
• For a statement s (d: x = y + z)
    out[s] = f_s(in[s]) = Gen[s] ∪ (in[s] - Kill[s])
  – Gen[s]: definitions generated: Gen[s] = {d}
  – Propagated definitions: in[s] - Kill[s], where Kill[s] = set of all other defs to x in the rest of the program
[Figure: block B0 with d0: y = 3, d1: x = 10, d2: y = 11; in[B0] passes through f_d0, f_d1, f_d2 to give out[B0]]

Effects of a Basic Block
• Transfer function of a statement s:
    out[s] = f_s(in[s]) = Gen[s] ∪ (in[s] - Kill[s])
• Transfer function of a basic block B: composition of the transfer functions of the statements in B
    out[B] = f_B(in[B]) = f_d1(f_d0(in[B]))
           = Gen[d1] ∪ ((Gen[d0] ∪ (in[B] - Kill[d0])) - Kill[d1])
           = (Gen[d1] ∪ (Gen[d0] - Kill[d1])) ∪ (in[B] - (Kill[d0] ∪ Kill[d1]))
           = Gen[B] ∪ (in[B] - Kill[B])
  – Gen[B]: locally exposed definitions (available at end of basic block)
  – Kill[B]: set of definitions killed by B
[Figure: block B0 with d0: y = 3, d1: x = 10; in[B0] passes through f_d0, then f_d1, to give out[B0]; f_B = f_d1 ⋅ f_d0]

Effects of the Edges (acyclic)
• Join node: a node with multiple predecessors
• Meet operator (∧): ∪
    in[b] = out[p1] ∪ out[p2] ∪ ... ∪ out[pn], where p1, ..., pn are all predecessors of b

Cyclic Graphs
• Equations still hold
  – out[b] = f_b(in[b])
  – in[b] = out[p1] ∪ out[p2] ∪ ... ∪ out[pn], where p1, ..., pn are all predecessors of b
• Find: fixed point solution

Reaching Definitions: Iterative Algorithm
input: control flow graph CFG = (N, E, Entry, Exit)

// Boundary condition
out[Entry] = ∅

// Initialization for iterative algorithm
For each basic block B other than Entry
    out[B] = ∅

// Iterate
While (changes to any out[] occur) {
    For each basic block B other than Entry {
        in[B] = ∪ (out[p]), for all predecessors p of B
        out[B] = f_B(in[B])    // out[B] = gen[B] ∪ (in[B] - kill[B])
    }
}

Summary of Reaching Definitions
                            Reaching Definitions
Domain                      Sets of definitions
Transfer function f_b(x)    forward: out[b] = f_b(in[b])
                            f_b(x) = Gen_b ∪ (x - Kill_b)
                            Gen_b: definitions in b
                            Kill_b: killed defs
Meet operation              in[b] = ∪ out[predecessors]
Boundary condition          out[entry] = ∅
Initial interior points     out[b] = ∅

III. Live Variable Analysis
• Definition
  – A variable v is live at point p if the value of v is used along some path in the flow graph starting at p.
  – Otherwise, the variable is dead.
• Problem statement
  – For each basic block, determine if each variable is live in that basic block
  – Size of bit vector: one bit for each variable

Effects of a Basic Block (Transfer Function)
• Observation: trace uses back to the definitions
• Direction: backward: in[b] = f_b(out[b])
• Transfer function for statement s: x = y + z
  – generate live variables: Use[s] = {y, z}
  – propagate live variables: out[s] - Def[s], where Def[s] = {x}
  – in[s] = Use[s] ∪ (out[s] - Def[s])
• Transfer function for basic block b:
  – in[b] = Use[b] ∪ (out[b] - Def[b])
  – Use[b]: set of locally exposed uses in b (uses not covered by definitions in b)
  – Def[b]: set of variables defined in b

Across Basic Blocks
• Meet operator (∧):
  – out[b] = in[s1] ∪ in[s2] ∪ ... ∪ in[sn], where s1, ..., sn are all successors of b
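
The backward liveness equations can be sketched as a small iterative solver. This is a minimal illustration under stated assumptions, not code from the lecture: the three-block flow graph, the encoding of statements as (target, operands) pairs, and the names `use_def` and `liveness` are all hypothetical. Initializing every in[b] to ∅ is also an assumption, chosen to mirror the reaching-definitions initialization.

```python
from typing import Dict, List, Optional, Set, Tuple

# A statement is (target, operands); target is None when nothing is defined.
Stmt = Tuple[Optional[str], List[str]]


def use_def(stmts: List[Stmt]) -> Tuple[Set[str], Set[str]]:
    """Compute (Use[b], Def[b]) for a single basic block."""
    use: Set[str] = set()
    defs: Set[str] = set()
    for target, operands in stmts:
        # A use is locally exposed only if no earlier statement in b defines it.
        use |= {v for v in operands if v not in defs}
        if target is not None:
            defs.add(target)
    return use, defs


def liveness(blocks: Dict[str, List[Stmt]], succ: Dict[str, List[str]]):
    """Iterate in[b] = Use[b] ∪ (out[b] - Def[b]) until nothing changes."""
    use: Dict[str, Set[str]] = {}
    defs: Dict[str, Set[str]] = {}
    for b, stmts in blocks.items():
        use[b], defs[b] = use_def(stmts)

    # Assumed initialization: every in[b] and out[b] starts empty.
    live_in: Dict[str, Set[str]] = {b: set() for b in blocks}
    live_out: Dict[str, Set[str]] = {b: set() for b in blocks}

    changed = True
    while changed:
        changed = False
        for b in blocks:
            # Meet: out[b] is the union of in[s] over all successors s of b.
            out_b: Set[str] = set().union(*(live_in[s] for s in succ[b]))
            # Backward transfer function of block b.
            in_b = use[b] | (out_b - defs[b])
            if in_b != live_in[b] or out_b != live_out[b]:
                live_in[b], live_out[b] = in_b, out_b
                changed = True
    return live_in, live_out


# Hypothetical flow graph (not the one from the slides):
#   B0: a = 1; b = 2        -> B1
#   B1: c = a + b; a = c    -> B1 (loop back) or B2
#   B2: return a
blocks = {
    "B0": [("a", []), ("b", [])],
    "B1": [("c", ["a", "b"]), ("a", ["c"])],
    "B2": [(None, ["a"])],
}
succ = {"B0": ["B1"], "B1": ["B1", "B2"], "B2": []}

live_in, live_out = liveness(blocks, succ)
for b in blocks:
    print(b, "in:", sorted(live_in[b]), "out:", sorted(live_out[b]))
```

In this sketch c never appears in any in[] or out[] set: its only use follows its definition inside B1, so that use is not locally exposed. Plain Python sets stand in for the bit vectors mentioned in the problem statement; a bit-vector encoding would replace the set operations with bitwise OR and AND-NOT.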