Unformatted text preview:

Terminology, Principles, and Concerns, IV COMP 512 Rice University Houston, Texas Fall 2008 Copyright 2008, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Lab 0 is available on the web site, due 1/26/2009. With examples from LIVE and global block positioning Last Lecture •! Computing dominance information >! Quick introduction to global data-flow analysis !! That is compile-time reasoning about the runtime flow of values !! Round-robin iterative algorithm to find MOP solution to DOM •! Using immediate dominators to improve on SVN >! For a node n, start LVN with the hash table from IDOM(n) !! Includes results from each predecessor in dominator tree >! Use scoped hash table and SSA names to simplify algorithm >! Some predecessor information at each node in CFG COMP 512, Fall 2008! 2!This Lecture Examples of Global Analysis and Transformation •! Computing live variables >! Classic backwards global data-flow problem >! Used in SSA construction, in register allocation •! Using live information to eliminate useless stores >! Simple demonstration of the use of LIVE •! Single-procedure block placement algorithm (Pettis & Hansen) >! Arrange the blocks to maximize fall-through branches >! Improves code locality as a natural consequence COMP 512, Fall 2008! 3!Computing Live Information A value v is live at p if " a path from p to some use of v along which v is not re-defined Data-flow problems are expressed as simultaneous equations Annotate each block with sets LIVEOUT and LIVEIN LIVEOUT(b) = #s!succ(b) LIVEIN(s) LIVEIN(b) = UEVAR(b) # (LIVEOUT(b) $ VARKILL(b)) LIVEOUT(nf) = % where UEVAR(b) is the set of names used in block b before being defined in b VARKILL(b) is the set of names defined in b § 9.2.1 in EaC Domain of LIVEOUT is variables COMP 512, Fall 2003! 4!Note that LIVE is a backwards data-flow problemComputing Live Information The compiler can solve these equations with a simple algorithm The world’s quickest introduction to data-flow analysis ! WorkList & { all blocks } while ( WorkList ! Ø) remove a block b from WorkList Compute LIVEOUT(b) Compute LIVEIN(b) if LIVEIN(b) changed then add pred (b) to WorkList The Worklist Iterative Algorithm Why does this work? •! LIVEOUT, LIVEIN ' 2Names •! UEVAR & VARKILL are constants for b •! Equations are monotone •! Finite # of additions to sets (! will reach a fixed point ! Speed of convergence depends on the order in which blocks are “removed” & their sets recomputed COMP 512, Fall 2008! 5!The worklist should be implemented as a set so that it does not contain duplicate entries. COMP 512, Fall 2003! 6!Using Live Information: Eliminating Unneeded Stores Transformation: Eliminating unneeded stores •! Value in a register, have seen last definition, never again used •! The store is dead (except for debugging) •! Compiler can eliminate the store The Plan: •! Solve for LIVEIN and LIVEOUT •! Walk through each block, bottom to top >! Compute local LIVE incrementally >! If target of STORE operation is not in LIVE, delete the STORE •! If all STOREs to a local variable are eliminated, can delete the space for it from the activation record |LIVEOUT| = |variables|Using LIVE Information: Eliminating Unneeded Stores Safety •! If x ! LIVE(s) at some STORE s, its value is not used along any path from s to the exit node of the CFG >! Its value is not read and is, therefore, dead •! Relies on the correctness of LIVE Profitability •! Assumes that not executing a STORE costs less than executing it Opportunity •! Linear search, block-by-block, for STORE operations >! Could build a list of them while computing initial UEVAR set COMP 512, Fall 2008! 7!Block Placement The order of blocks in memory matters •! Bad placement can increase working set size (TLB & page misses) •! Fall-through and branch-taken paths differ in cost & locality The plan •! Discover which paths execute frequently •! Rearrange blocks to keep those paths in contiguous memory Finding hot paths •! Need execution profile information COMP 512, Fall 2008! 8!COMP 512, Fall 2006 ! 9!Block Placement Targets branches with unequal execution frequencies •! Make likely case the “fall through” case •! Move unlikely case out-of-line & out-of-sight Potential benefits •! Longer branch-free code sequences •! More executed operations per cache line •! Denser instruction stream ( fewer cache misses •! Moving unlikely code ( denser page use & fewer page faults COMP 512, Fall 2006 ! 10!Block Placement Moving infrequently executed code B1 B4 B2 B3 B1 B4 B2 B3 1000 1000 1 1 Unlikely path gets fall through (cheap) case Likely path gets an extra branch Would like this to become B1 B4 B3 B2 Long distance Long distance In another page, ... This branch goes away Denser instruction stream *COMP 512, Fall 2006 ! 11!Block Placement Overview 1.! Build chains of frequently executed paths >! Work from profile data >! Edge profiles are better than node profiles >! Combine blocks with a simple greedy algorithm 2.! Lay out the code so that chains follow short forward branches Gathering profile data •! Instrument the executable •! Statistical sampling •! Infer edge counts from performance count data While precision is desirable, a good approximation will probably work well. COMP 512, Fall 2006 ! 12!Block Placement The Idea •! Form chains that should be placed to form straight-line code First step: Build the chains 1. Make each block a degenerate chain & set its priority to # blocks 2. P & 1 3. ) edge e = <x,y> in the CFG, in order by decreasing frequency if x is the tail of a chain, a, and y is the head of a chain, b then merge a and b else set priority(y) to min(priority(y),P ++) Point is to place targets after their sources, to make forward branches PA-RISC predicted most forward branches as taken, backward as not takenCOMP 512, Fall 2006 ! 13!Block Placement Second step: Lay out the code WorkList & chain containing the entry node, n0 While (WorkList ! Ø) Pick the chain c with lowest priority(c) from WorkList Place it next in the code ) edge <c,z > leaving c add z to WorkList Intuitions •! Entry node first •! Tries to make edge from chain i to chain j a forward branch >! Predicted as taken on target machine >! Edge remains only if it


View Full Document

Rice COMP 512 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?