UT Arlington CSE 5317 - Lecture 11 - IR


CSE 5317
Lecture 11: IR
23 Feb 2010
Nate Nystrom
University of Texas at Arlington

Where are we now?

[Figure: compiler pipeline: source code, front end, IR, middle end, IR, back end, target code; errors]

Front end:
◾ produce intermediate representation (IR) of the program
Middle end:
◾ transform IR into equivalent, more efficient program
Back end:
◾ transform IR into target code

Why IR?

Why use an intermediate representation?
◾ breaks compiler into manageable pieces
◾ good software engineering
◾ allows compiler to make multiple passes over the program
◾ supports multiple languages and multiple front and back ends
◾ enables machine-independent optimization
◾ general techniques, multiple passes

Properties of IR

Important properties of IR
◾ ease, cost of generation
◾ ease, cost of manipulation
◾ level of abstraction
◾ freedom of expression
◾ size of typical procedure
Subtle design decisions in IR have far-reaching effects on speed and effectiveness of the compiler
Level of exposed detail is a crucial consideration

Multiple IRs

Compiler often supports multiple IRs
Usually two levels
◾ high-level IR (language independent, but closer to language)
◾ low-level IR (machine independent, but closer to machine)
The project: 3 levels
◾ ASTs, IR trees (Piglet), 3-address code (Kanga)

Why multiple IRs?

Different IRs are better for different analyses and transformations
High-level IR
◾ good for method inlining and specialization
◾ good for loop optimizations
Low-level IR
◾ good for low-level optimizations: code motion, strength reduction, common subexpression elimination

Why does the choice of IR matter?

Could do low-level optimizations on high-level IR, but translation to low-level IR introduces more opportunities to optimize.
Example:
◾ a[i+1] + b[i+1]
    --> (common subexpression elimination)
  t = i+1
  a[t] + b[t]
    --> (translation to low IR)
  t = i+1
  *(a + t*4) + *(b + t*4)
    --> (common subexpression elimination)
  t = i+1
  u = t*4
  *(a + u) + *(b + u)

Choices of IR

Representations talked about in the literature:
◾ abstract syntax trees (AST)
◾ linear (operator) form of tree
◾ directed acyclic graphs (DAG)
◾ control flow graphs (CFG)
◾ program dependence graphs (PDG)
◾ static single assignment form (SSA)
◾ stack code
◾ three-address code
◾ assembly code
◾ hybrids

Categories

Broadly, IRs fall into three categories
◾ Structural
  ◾ graphically oriented
  ◾ examples: trees, directed acyclic graphs
  ◾ heavily used in source-to-source translators
  ◾ nodes, edges tend to be large
◾ Linear
  ◾ pseudo code for some abstract machine
  ◾ large variation in level of abstraction
  ◾ simple, compact data structures
  ◾ easier to rearrange
  ◾ example: three-address code
◾ Hybrids
  ◾ combination of graphs and linear code
  ◾ attempt to have "best of both worlds"
  ◾ example: control-flow graphs

Abstract syntax trees

Represent program as trees with a node for each syntactic construct
Ex: x - 2 * y
Preserves much of the program structure
Not language independent
Can be difficult to manipulate

Intermediate representations

[Figure: source code, front end, IR, opt., IR, back end, target code]

The front end produces an intermediate representation (IR) for the program.
The optimizer transforms the code in IR form into an equivalent program that may run more efficiently.
The back end transforms the code in IR form into native code for the target machine.
The IR encodes knowledge that the compiler has derived about the source program.

CMSC 430 Lecture 8, Page 1

Intermediate representations

Advantages
• compiler can make multiple passes over program
• break the compiler into manageable pieces
• support multiple languages and architectures using multiple front & back ends
• enables machine-independent optimization
Desirable properties
• easy & inexpensive to generate and manipulate
• contains sufficient information
Examples
• abstract syntax tree (AST)
• directed acyclic graph (DAG)
• control flow graph (CFG)
• three address code
• stack code

Intermediate representations

Broadly speaking, IRs fall into three categories:
Structural
• structural IRs are graphically oriented
• examples: trees, directed acyclic graphs
• heavily used in source to source translators
• nodes, edges tend to be large
Linear
• pseudo-code for some abstract machine
• large variation in level of abstraction
• simple, compact data structures
• easier to rearrange
Hybrids
• combination of graphs and linear code
• attempt to take best of each
• examples: control-flow graph

Abstract syntax tree

An abstract syntax tree (AST) is the procedure's parse tree with the nodes for most non-terminal symbols removed.

            -
           / \
     <id,x>   *
             / \
      <num,2>   <id,y>

This represents "x-2*y".
For ease of manipulation, can use a linearized (operator) form of the tree:

    x 2 y * -

in postfix form.

DAGs

An AST with a unique node for each value
Ex:
    x := 2 * y + sin(2*x)
    z := x / 2
Uses less memory than AST
Easy to identify shared values
Difficult to manipulate, esp. to maintain sharing

Directed acyclic graph

A directed acyclic graph (DAG) is an AST with a unique node for each value.

[Figure: DAG for the two statements below, built from the nodes ←, +, *, sin, /, <id,x>, <id,y>, <id,z>, <num,2>]

    x ← 2*y+sin(2*x)
    z ← x/2

Control flow graph

The control flow graph (CFG) models the transfers of control in the procedure.
• nodes in the graph are basic blocks: maximal-length straight-line blocks of code
• edges in the graph represent control flow: loops, if-then-else, case, goto
Example

    if (x=y)
    then s1
    else s2
    s3

becomes

         x=y
        /   \
      s1     s2
        \   /
         s3

Three address code

Three address code generally allows statements of the form:

    x ← y op z

with a single operator and, at most, three names.
Complex expressions like

    x-2*y

are simplified to

    t1 ← 2*y
    t2 ← x-t1

Advantages
• compact form (direct naming)
• names for intermediate values
Register transfer language (RTL)
• only load/store instructions access memory
• all other operands are registers
• version of three address code for RISC

Three address code

Typical statement types
1. assignments: x ← y op z
2. assignments: x ← op y
3. assignments: x ← y[i]
4. assignments: x ← y
5. branches: goto L
6. conditional branches: if x relop y goto L
7. procedure calls: param x and call p
8. address and pointer assignments
Can represent three address code using quadruples

    x-2*y

    (1)  load   t1  y
    (2)  loadi  t2  2
    (3)  mult   t3  t2  t1
    (4)  load   t4  x
    (5)  sub    t5  t4  t3
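The x-2*y example that runs through these slides can be sketched in code. The following is a minimal illustration, not the course's or the project's implementation: the class names `Id`, `Num`, `BinOp` and the functions `postfix` and `lower` are hypothetical, chosen only for this sketch. It builds the AST for x-2*y, prints its linearized postfix form, and lowers it to three address code by walking the tree bottom-up and naming each intermediate value with a fresh temporary.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Tuple, Union


# Hypothetical AST node types for this sketch (not from the slides).
@dataclass(frozen=True)
class Id:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Id, Num, BinOp]


def postfix(e: Expr) -> str:
    """Linearized (operator) form of the tree: operands first, then operator."""
    if isinstance(e, Id):
        return e.name
    if isinstance(e, Num):
        return str(e.value)
    return f"{postfix(e.left)} {postfix(e.right)} {e.op}"


def lower(e: Expr) -> Tuple[List[str], str]:
    """Translate an expression tree to three address code.

    Returns the list of statements and the name holding the final value.
    Each statement has one operator and at most three names.
    """
    code: List[str] = []
    temps = count(1)  # fresh temporaries t1, t2, ...

    def walk(n: Expr) -> str:
        if isinstance(n, Id):
            return n.name
        if isinstance(n, Num):
            return str(n.value)
        left = walk(n.left)
        right = walk(n.right)
        t = f"t{next(temps)}"
        code.append(f"{t} <- {left} {n.op} {right}")
        return t

    result = walk(e)
    return code, result


# AST for x - 2*y: '-' at the root, '*' as its right child.
tree = BinOp("-", Id("x"), BinOp("*", Num(2), Id("y")))

if __name__ == "__main__":
    print(postfix(tree))
    statements, result = lower(tree)
    for s in statements:
        print(s)
```

The sketch leaves leaves (identifiers and constants) as direct operands, which matches the two-statement form shown on the slide; a lower-level variant in the style of the quadruples would instead emit an explicit load for every leaf.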

