Unformatted text preview:

CSE 5317Lecture'18:'Op-miza-on1'Apr'2010Nate'NystromUniversi ty'of'Texas'at'ArlingtonFriday, April 2, 2010OptimizationCompiler'has'primary'responsib il ity'fo r'performanceOp-miza-ons◾set'of'transforma-ons'intended'to'improve'performance'of'the'gener ated'codeIssues:◾safety◾profitability◾risk2Friday, April 2, 2010SafetyMust'be'careful'to'preserve'meaning'of'the'program◾two'programs'are'equivalent'if'they'produce'iden-cal'resultsLarge'part'of'analysis'goes'toward'proving'op-miza-ons'are'safe3Friday, April 2, 2010Profitability“Op-miza-on”'is'a'misnomer“Code'improvement”'is'probab ly'a'beMer'termWill'a'given'op-miza-on'improve'performance?By'what'metric?◾-me,'space,'...OSen'not'clear:◾liMle'compara-ve'data'that'is'believable◾papers'give'conflic-ng'advice4Friday, April 2, 2010ProfitabilityDetermining'whether'an'op-miza-on'is'profitable'is'hard:◾linearV-me'heuris-cs'for'hard'problems◾unforeseen'consequences'and'poorly'understood'interac-ons◾“obvious'wins”'have'nonVobvious'downsides◾mul-ple'ways'to'achieve'same'end5Friday, April 2, 2010InliningReplace'func-on'call'with'body'of'the'func-on6Node%m%=%...;int%a%=%length(m);Node%m%=%...;Node%n%=%m;int%a%=%0;while%(n%!=%null)%{%%%%n%=%n.next;%%%%a++;}int%length(Node%n)%{%%%%int%len%=%0;%%%%while%(n%!=%null)%{%%%%%%%%n%=%n.next;%%%%%%%%len++;%%%%}%%%%return%len;}Friday, April 2, 2010Is inlining profitable?Holler:'almost'always'helpsHall:'occasional ly'helps,'but'has'lots'of'problemsMacFarland:'introduces'cache'missesThe'reality'is'somewhere'in'the'middle◾Waterman:'programVspecific'heuris-cs'win7Friday, April 2, 2010RiskTransforma-ons'can'hurt'ability'to'generate'good'code◾loop'unrol li ng,'inlining'can'increase'icache'misses◾unrollin g'introduces'redundant'computa-on◾redundancy'elimina-on'can'i ncrease'register'pressureOne'op-miza-on'can'enable'another,'or'can'undo'the'benefits'of'anotherOrd ering'of'o p-miza-ons'can'affect'performance8Friday, April 2, 2010Some optimizationsAc-va-on'record'mergingAdjacency'analysisAnchor'poin-ngCarry'op-miza-onCommon'subexpression'elimina-onConstant'foldingDead'code'elimina-onDead'space'reclama-onDetec-on'of'parallelismInser-ng'forced'copiesInstruc-on'scheduli ngLinear'func-on'test'replacementLive'range'shrinkingLoop'unrollingLook'fissionLoop'fusionLoop'unswitchingOperator'simplifica-onOperator'strength'reduc-onPeephole'op-miza-onProcedure'integra-onSpecial'case'code'genera-onRegister'alloca-onRegister'assignmentReassocia-onShadow'variable'introduc-onStack'height'reduc-onTest'elisionVariable'fol di ngZero'itera-on'test9This'list'is'c.'1982.''Even'more'op-miza-ons'now!Friday, April 2, 2010Classic taxonomyMachine'independent'transforma-ons◾deliberately'ignore'machineVspecific'constraints◾applicable'across'broad'range'of'machines◾decrease'ra-on'of'overhead'to'real'work◾reduce'running'-me'or'space◾example:'dead'code'elimina-onMachine'dependent'transforma-ons◾explici tly'consider'machineVspecific'constraints◾capitalize'on'machineVspecific'proper-es◾improve'mapping'from'IR'to'this%machine◾might'use'an'exo-c'i nstruc-on'(e. g.,'register'window'shiSing)◾example:'instruc-on'scheduling10Friday, April 2, 2010Classic taxonomyBut'dis-nc-on'is'not'always'clear◾e.g.,'replacing'mul-ple'with'shiSs'and'ad ds◾e.g.,'elimina-ng'a'redundant'expressionRedundancy'elimina-on'might'fit'into'either'category◾simple'machine'independent'versionvs.'versions'that'consider'register'pressure11Friday, April 2, 2010Better taxonomy (Cooper)EffectsVbased'classifica-on◾five'machineVindependent'ways'to'speed'up'code◾eliminate'a'redundant'computa-on◾move'code'to'a'place'where'it'executes'less'oSen◾eliminate'dead'code◾specialize'a'computa-on'based'on'context◾enable'another'transforma-on◾three'machineVdependent'ways'to'speed'up'code◾manage'or'hide'latency◾take'advantage'of'special'hardware'features◾manage'finite'resourcesThis'covers'most'scalar'op-miza-ons12Friday, April 2, 2010Machine independentredundancy◾redundancy'elimina-on'(CSE)◾par-al'redundancy'elimina-on'(PRE)◾consolida-oncode'mo-on◾loopVinvariant'co de'mo-on◾PRE◾consolida-on◾global'scheduling◾constant'propaga-ondead'code◾dead'code'elimina-on'(DCE)◾par-al'DCE◾constant'propaga-on◾algebraic'iden--esspecializa-on◾replica-on◾strength'reduc-on◾method'caching◾heapV>stack'alloca-on◾tail'recursion'elimina-oncreate'opportuni-es◾reassocia-on◾replica-on13Friday, April 2, 2010Machine dependenthide'latency◾scheduling◾blocking'references◾prefetching◾code'lay out◾data'packingmanage'resources◾alloca-on'(registers,'tlb'slots)◾scheduling◾data'packing◾coloring'memory'loca-onsspecial'features◾instruc-on'selec-on◾peephole'op-miza-on14Friday, April 2, 2010COMP 512, Fall 2008! 13!What about Fortran H? (Lecture 2) •! Eliminate a redundant computation >! Commoning •! Move code to a place where it executes less often >! Backward motion •! Eliminate dead code >! (must have done it, but don’t talk about it) •! Specialize a computation based on context >! Strength reduction •! Enable another transformation >! Reassociation •! Manage or hide latency •! Take advantage of special hardware features •! Manage finite resources >! Register allocation } Didn’t really talk about these, if they did them * COMP 512, Fall 2008! 14!Scope of Optimization (another axis) Local •! Handles individual basic blocks >! Maximal length sequence of straight line code >! Each of the bi’s is a basic block •! Basic blocks are easy to analyze •! Can prove strongest results •! Code quality suffers at block boundaries Local methods •! Value numbering, instruction scheduling b4 b5 b6 b1 b3 b2 * Local◾handles'individual'basic'blocks◾maximal'length'sequence'of'straight'line'code◾basic'blocks'are'easy'to'analyze'(no'branches)◾can'prove'strongest'results◾code'quality'suff ers'at'block'boundariesLocal'methods◾value'nu mbering◾instruc-on'scheduling◾peephole'op-miza-onScope of optimization15Friday, April 2, 2010COMP 512, Fall 2008! 15!Superlocal •! Handles extended basic blocks >! Sequence of blocks where each has a unique predecessor >! Use results for bi to help with bj •! Analysis & transformation over larger region •! Fewer rough edges •! Can make it efficient by reusing results Superlocal methods •! Value numbering, instruction scheduling Scope of Optimization b4 b5 b6 b1 b3 b2 * COMP 512, Fall 2008!


View Full Document

UT Arlington CSE 5317 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?