DOC PREVIEW
Progressive Register Allocation for Irregular Architectures

This preview shows page 1-2-22-23 out of 23 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 23 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Progressive Register Allocation for Irregular ArchitecturesIrregular ArchitecturesFewer Registers  More SpillsRegister Usage RestrictionsMemory OperandsRegister Allocation ChallengesPrevious WorkOur GoalsMulticommodity Network Flow (MCNF)Modeling Usage ConstraintsModeling Spills and MovesModeling StoresRegister Allocation as MCNFSolving an MCNFLagrangian Relaxation: IntuitionSolution ProcedureSlide 17EvaluationBehavior of SolverProven OptimalityComprehensive ResultsProgressive NatureContributions2005 International Symposium on Code Generation and OptimizationProgressive Register Allocation for Irregular ArchitecturesDavid [email protected] Copen [email protected] 23, 20052005 International Symposium on Code Generation and Optimization2Irregular Architectures•Few registers•Register usage restrictions–address registers, hardwired registers...•Memory operands•Examples:–x86, 68k, ColdFire, ARM Thumb, MIPS16, V800, various DSPs...eaxebxecxedxesiediebpesp2005 International Symposium on Code Generation and Optimization3Fewer Registers  More Spills•Used gcc to compile >10,000 functions from Mediabench, Spec95, Spec2000, and micro-benchmarks•Recorded which functions spilledPercent of functions that spill05101520253035404550PPC (32) 68k (16) x86 (8)Percent2005 International Symposium on Code Generation and Optimization4Register Usage Restrictions•Instructions may prefer or require a specific subset of registers–x86 multiply instructionimul %edx,%eax // 2 byte instructionimul %edx,%ecx // 3 byte instruction–x86 divide instructionidivl %ecx // eax = edx:eax/ecx2005 International Symposium on Code Generation and Optimization5Memory Operands•Load/store not always needed to access variables allocated to memory–depends upon instruction–still less efficient than register accessaddl 8(%ebp), %eax vsmovl 8(%ebp), %edxaddl %edx, %eax2005 International Symposium on Code Generation and Optimization6Register Allocation Challenges•Optimize spill code–with few registers, spilling unavoidable•Model register usage restrictions•Exploit memory operands–affects spilling decisions2005 International Symposium on Code Generation and Optimization7Previous WorkMethod Models Irregular FeaturesFast OptimalGraph ColoringInteger Programming[Goodwin and Wilken 96][Kong and Wilken 98][Fu and Wilken 2002]Separated IP[Appel and George 01]PBQP[Scholz and Eckstein 02]/ /2005 International Symposium on Code Generation and Optimization8Our Goals•Expressive–Explicitly represent architectural irregularities and costs•Proper model–An optimum solution results in optimal register allocation•Progressive solution algorithm–more computation  better solution–decent feasible solution obtained quickly–competitive with current allocators2005 International Symposium on Code Generation and Optimization9Multicommodity Network Flow (MCNF)a ba b2224444instructioncrossbarsourcesink2005 International Symposium on Code Generation and Optimization10Modeling Usage Constraintsint foo(int a, int b, int c){ a = a*b; return a/c;}aabimuleax edx ecx memb1-1idiveax edx ecx memcc1not quite right…2005 International Symposium on Code Generation and Optimization11Modeling Spills and Movesint foo(int a, int b, int c){ a = a*b; return a/c;}aimuleax edx ecx memb1-1eax edx ecx memeax edx ecx memcb3 3 3aidiveax edx ecx memc1eax edx ecx memeax edx ecx mem2005 International Symposium on Code Generation and Optimization12Modeling Stores•Simple approach flawed–doesn’t model memory persistency•Solution: antivariables–flow only through memory–eviction cost = store cost–evict only once2005 International Symposium on Code Generation and Optimization13Register Allocation as MCNF•Variables  Commodities•Variable Usage  Network Design•Nodes  Allocation Classes (Reg/Mem)•Registers Limits  Node Capacities•Spill Costs  Edge Costs•Variable Definition  Source•Variable Last Use  Sink2005 International Symposium on Code Generation and Optimization14Solving an MCNF•Integer solution NP-complete•Use standard IP solvers–commercial solvers (CPLEX) are impressive•Exploit structure of problem–variety of MCNF specific solvers•empirically faster than IP solvers•Lagrangian Relaxation technique2005 International Symposium on Code Generation and Optimization15Lagrangian Relaxation: Intuition•Relaxes the hard constraints –only have to solve single commodity flow•Combines easy subproblems using a Lagrangian multiplier–an additional price on each edgeabab01Example:edges have unit capacityabab0+11with price, solution to single commodity flow can be solution to multicommodity flow2005 International Symposium on Code Generation and Optimization16Solution Procedure•Compute prices using iterative subgradient optimization–converge to optimal prices•At each iteration, greedily construct a feasible solution using current prices–allocate most expensive vars first–can always find an allocation2005 International Symposium on Code Generation and Optimization17Solution Procedure•Advantages+have feasible solution at each step+iterative nature  progressive+Lagrangian relaxation theory provides means for computing a lower bound+Can compute optimality bound•Disadvantages–No guarantee of optimality of solution2005 International Symposium on Code Generation and Optimization18Evaluation•Replace gcc’s local allocator•Optimize for code size–easy to statically evaluate•Evaluate on MediaBench, MiBench, SpecInt95, SpecInt2000–consider only blocks where local allocation is interesting (enough variables to spill)2005 International Symposium on Code Generation and Optimization19Behavior of Solver2005 International Symposium on Code Generation and Optimization20Proven Optimality0%10%20%30%40%50%60%70%80%90%100%1 Iter10 Iters100 Iters1000 Iters1 Iter10 Iters100 Iters1000 Iters1 Iter10 Iters100 Iters1000 Iters1 Iter10 Iters100 Iters1000 Iters5-10conflicts (355 blocks)10-15conflicts(23 blocks)15-20conflicts(7 blocks)>= 20conflicts(5 blocks)>25%Within 20%Within 15%Within 10%Within 5%Optimal2005 International Symposium on Code Generation and Optimization21Comprehensive Results-15.00%-10.00%-5.00%0.00%5.00%10.00%15.00%20.00%1 Iter10 Iters100 Iters1000 Iters1 Iter10 Iters100 Iters1000 Iters1 Iter10 Iters100 Iters1000 Iters1 Iter10 Iters100 Iters1000 Iters5-10 conflicts(355 blocks)10-15 conflicts(23 blocks)15-20 conflicts(7 blocks)>= 20


Progressive Register Allocation for Irregular Architectures

Download Progressive Register Allocation for Irregular Architectures
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Progressive Register Allocation for Irregular Architectures and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Progressive Register Allocation for Irregular Architectures 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?