Unformatted text preview:

1UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCEEmery BergerUniversity of Massachusetts, AmherstAdvanced CompilersCMPSCI 710Spring 2003Basic Loop OptimizationsUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE2Topics Last time Optimizations using SSA form Constant propagation & dead code elimination Loop invariant code motion This time Loop optimizations Induction variable Linear test replacement Loop unrolling Scalar replacementUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE3Easy Detectionof Loop Induction Variables Pattern match & check: Search for “i = i + b” in loop i is induction variable if no other assignment to i in loop Pros & Cons:+ Easy!- Does not catch all loop induction variablese.g., “i = a * c + 2”UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE4Taxonomy of Induction Variables basic induction variable: only definition in loop is assignmentj = j ± c, where c is loop invariant mutual induction variable: definition is linear function of other induction variable i‘: i = c1 * i‘± c2 i = i‘/ c1 ± c2 family of basic induction variable j: set of induction variables i such that i is always assigned linear function of jUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE5Strength Reduction Replace “expensive” op by “cheaper” one E.g., replace multiply by addition Apply to induction variable families Especially: array indexingUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE6Strength Reduction Algorithm Let i be induction variable in the family of basic induction variable j: i = c1 * j + c2 Create new variable i’ Initialize in pre-header: i’ = c1*j + c2 Track value of j: after j = j + c3, add i’ = i’ + (c1 * c3)2UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE7Strength Reduction ExampleUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE8Candidates for Strength Reduction Induction variable IV multiplied by invariant Recursively: IV * IV, IV mod constant, IV + IVUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE9Strength Reduction AlgorithmUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE10Strength Reduction Examples basic induction variable: only definition in loop is assignmentj = j ± c, where c is loop invariant mutual induction variable: definition is linear function of other induction variable i‘: i = c1 * i‘± c2 i = i‘/ c1 ± c2 family of basic induction variable j: set of induction variables i such that i is always assigned linear function of jUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE11Linear Test Replacement Eliminates induction variable! After strength reduction, loop test is often last use of induction variable Algorithm: If only use of IV is loop test and its own increment, and test is always computed i.e., only one exit from loop Replace test with equivalent one: E.g., “i comp k” ⇒ “i_50 comp k*50”UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE12Linear Test Replacement Example3UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE13Loop Unrolling To reduce loop overhead, we can unroll loops Advantages:+ Execute fewer total instructions+ More fodder for common subexpressionelimination, strength reduction, etc.+ Move consecutive access closer together Disadvantages:- Code bloat- Still updating through memoryUUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE14Scalar Replacement Problem: register allocators never keep a[i] in register Idea: trick allocator Locate patterns of consistent reuse Replace load with copy into temporary Replace store with copy from temporary May need copies at end of loop E.g., when reuse spans > 1 iteration Advantages: Decreases number of loads and stores Keeps reused values in registers Big performance impact (2x, 3x!)UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE15Scalar Replacement Example Scalar replacement exposes the reuse of a[i] Traditional scalar analysis – inadequate Use dependence analysis to understand array references (later)UUNIVERSITYNIVERSITYOFOFMMASSACHUSETTS, ASSACHUSETTS, AAMHERSTMHERST• • DDEPARTMENTEPARTMENTOF OF CCOMPUTER OMPUTER SSCIENCECIENCE16Next Time Common Subexpression Elimination Read ACDI: Ch. 12, pp. 343-355 Ch. 13, pp.


View Full Document

UMass Amherst CMPSCI 710 - Basic Loop Optimizations

Download Basic Loop Optimizations
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Basic Loop Optimizations and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Basic Loop Optimizations 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?