1Progressive RegisterAllocation for IrregularArchitecturesDavid Koes2/1/2005Outline• Register Allocation forIrregular Architectures• Previous Work• Progressive Register AllocationIrregular Architectures• Few registers• Register usage restrictions– address registers, hardwired registers...• Memory operands• Examples:– x86, 68k, ColdFire,ARM Thumb, MIPS16,V800, various DSPs...eaxebxecxedxesiediebpespIrregular Architectures Register Allocation• Graph coloring register allocation used– gcc, ORC, SUIF, GHS, IMPACT, IBM• Assertion:– Graph coloring is the wrong representationfor performing register allocation onirregular architectures2Fewer Registers → More Spills• Used gcc to compile>10,000 functionsfrom Mediabench,Spec95, Spec2000,and micro-benchmarks• Recorded for whichfunctions graphcoloring failedPPC (32 registers)68k (16 registers)x86 (8 registers)3Graph Coloring and Spills• Graph coloring solves the registersufficiency problem– Even if P=NP, suboptimal if spills necessary• No optimization of spill code• Many spills may slow down allocator– has to rebuild interference graphRegister Usage RestrictionsExample68kMOVE (ptr),tmp1EOR #3,tmp1MOVEQ #32,tmp2or MOVEA #32,tmp2ADD tmp2,ptrMOVE tmp1,D0address registerdata registerany registerRegister Usage Restrictions• Instructions may require or prefer a specific subset ofregisters– 68k address/data registers• MOVEA #1,A0 // 4 byte instruction• MOVEQ #1,D0 // 2 byte instruction– x86 div instruction• Graph coloring assumes all colors are equallyapplicable– no principled way to express preferences– requirements may be mutually exclusiveMemory Operands• A variable allocated to memory may notrequire load/store to access– depends upon instruction– still less efficient than register access• Graph coloring (usually) spills variableswhich make graph easier to color– may not be an efficient variable to spill4Graph Coloring Wrong Approach forIrregular Architectures• Solves wrong problem– focuses attention on preventing spills– doesn’t optimize spill code• No representation of irregular features• Variables assigned single register– complicates meeting usage restrictions– live range splitting partial solutionOutline• Register Allocationfor Irregular Architectures• Previous Work– Graph coloring improvements– Integer Programming– Separated IP– PBQP• Progressive Register AllocationGraph Coloring Improvements• Spill code optimization– better heuristics [Bernstein et al 89]– partial spilling [Bergner et al 97]• Register usage constraints– modified interference graph [Briggs 92]– weighted interference graph/modifiedheuristics [Smith and Holloway 01]Outline• Register Allocation Overview• ...for Irregular Architectures• Previous Work– Graph coloring improvements– Integer Programming– Separated IP– PBQP• A New Hope5Integer Programming (IP)• Minimize/maximize linear function• Subject to linear constraints• Solution must be integer• Example! Maximize z = x1+ x2subject to2x1+ 3x2" 12,x1" 4, x2" 351,4:Solution21===zxxRegister Allocation as IP• Simplified example1,,01subject to233min!!"++++#cbacbacbammmmmmmmma <-b <-c <- a + b <- a + c <- bmvar is a decision variable0 means var is in register1 means var is in memoryIP: Good News• IP can precisely model registerallocation [Goodwin and Wilken 96]– including irregular architecture features[Kong and Wilken 98]– can exploit structure of register allocationproblem to improve compile time[Fu and Wilken 2002]• Can solve problem without integerconditions in polynomial timeIP: Bad News• With integer conditions problem isNP-complete• No polynomial guarantee• Does not get feasible solution quickly– can’t just impose time limits and get ausable, if suboptimal, solution6IP: Results• SPEC92 (integer)• x86, models many irregular features• 61% reduction in runtime spill codeoverhead• >15 minutes on 2.4% of SPEC92functionsT. Kong and K. Wilken, “Precis e Regis ter Allocation for Irregular Architectures ”1998Outline• Register Allocation Overview• ...for Irregular Architectures• Previous Work– Graph coloring improvements– Integer Programming– Separated IP– PBQP• A New HopeSeparated IP• Separate allocation and assignment[Appel and George 01]• Use IP to optimally insert spill code– also model some x86 features• Result never has more than k livevariables at any point– not necessarily k-colorable– insert moves at every program pointSeparated IP: Second Pass• Second pass performs assignment andremoves moves– use heuristic solution [Park and Moon 98]– optimal solution (IP) not tractable– left as an open problem in paper7Separated IP: ResultsOverall 9.5% improvement in execution speedA. W. Appel, L. George. “Optimal spilling for CISC machines with few registers.”Separated IP: Limitations• Can still be prone to exponential blow-upin first pass– may not provide intermediate solution• Second pass not optimal• Claims to be faster than full IP solution– different compilers, benchmarks, sourcelanguages, and target architecturesOutline• Register AllocationIrregular Architectures• Previous Work– Graph coloring improvements– Integer Programming– Separated IP– PBQP• Progressive Register AllocationPartitioned Boolen QuadraticOptimization Problem Formulation• Similar to IP [Scholz and Eckstein 02]– minimize quadratic function– decision variables 0-1– constraints incorporated into function8• Advantages– Can fully model irregular features– Fast, polynomial approximation performswell in practice• Disadvantages– Approximation algorithm not bounded– No iterative way to improve upon solutionPartitioned Boolen QuadraticOptimization Problem FormulationPBQP: Results• Caramel 20xx DSP– very irregular register requirements• Geometric mean improvement– Optimal: 5.85%– Approximation: 3.93%ComparisonOptimalPolynomialrunning timeModelsirregularfeaturesOptimizesspill codeMethodno/yesyes/noyesyesPBQPnonoyesyesSeparated IPyesnoyesyesIntegerProgrammingnoyessome, withheuristicswithheuristicsGraphColoringOutline• Register AllocationIrregular Architectures• Previous Work• Progressive Register Allocation– MCNF Formulation– Solution Procedure– Results9Problem Formulation Goals• Explicitly represent architecturalirregularities and costs• An optimum solution results in optimalregister allocation•
View Full Document