CMU CS 15745 - Lecture - D2782002

School name Carnegie Mellon University

Course CS 15745 - Optimizing Compilers for Modern Architectures

Loop-invariant Code Motion
15-745 Optimizing Compilers
Spring 2006
Peter Lee

Reminders
- Task 1 test programs are due
- Task 1 due in one week
- See the Internals doc and Advice Column
- Read 7.1-4 (control-flow analysis) and 13.2 (loop-invariant code motion)

Loops
- Loops are extremely important
    - the “90-10” rule
- Loop optimization involves
    - understanding control-flow structure
    - sensitivity to side-effecting operations
    - extra care in some transformations such as register spilling

Classical loop optimizations
- Hoisting of loop-invariant computations
    - pre-compute before entering the loop
- Elimination of induction variables
    - e.g., replace the multiplication in p = i*w + b with the update p = p + w, when w and b are loop-invariant
- Elimination of null and array-bounds checks
    - use laws of arithmetic to prove integer range
- Loop unrolling
    - to reduce number of control transfers
- Loop permutation
    - to improve cache memory performance

Finding loops
- To optimize loops, we need to find them!
- Specifically:
    - loop-header node(s): nodes in a loop that have immediate predecessors not in the loop
    - back edge(s): control-flow edges to previously executed nodes
    - all nodes in the loop body

Control-flow analysis
- L3 has only well-structured control-flow constructs
- Finding L3 loops is easy
    - the translator can mark every header node and back edge when creating the IR
- But many languages have goto and other complex control, so loops can be hard to find
- Determining the control structure of a program is called control-flow analysis

Task note
- We will describe here the classical approach to control-flow analysis for imperative, first-order languages
- This is a general approach, suitable even for languages with goto
- But for L3, it is much easier simply to have the translator identify any loops it creates

Terminology alert
- dominators and dominator trees
- back edge
- loop header
- natural loop

Dominators
- a dom b: node a dominates b if every possible execution path from entry to b includes a
- a sdom b: a strictly dominates b if a dom b and a ≠ b
- a idom b: a immediately dominates b if a sdom b and there is no c such that a sdom c and c sdom b

Some properties
- idom(n) is unique
- The dom relation is a partial ordering: reflexive, antisymmetric, and transitive

Back edges and loop headers
[Figure: control-flow graph with nodes entry, B1-B6, exit. Block contents: k = false, i = 1, j = 2; i <= n; j = j * 2, k = true, i = i + 1; …k…; print j; i = i + 1]
- A control-flow edge from node b to a is a back edge if a dom b
- Furthermore, in that case node a is a loop header

Natural loop
- Consider a back edge from node n to node h
- The natural loop of n → h is the set of nodes L such that for all x ∈ L:
    - h dom x, and
    - there is a path from x to n not containing h

[Figure: A simple example... Nodes B1-B4, nested loops]
[Figure: What about this case? Nodes B1-B4, loop with “continue”]
[Figure: What about this case? Nodes B1-B4, conditional in loop]

Nested loops
- Normally we will want to focus attention on the inner-most loops
- This requires identifying not only the loops, but the nesting structure

Dominator trees
- Observe: Every node has at most one immediate dominator
- Therefore: the immediate dominator relation defines a tree structure
    - node n is the parent of node m if n idom m
[Figure: the control-flow graph (entry, B1-B6, exit) beside its dominator tree. Annotations: back edges point to ancestors in the tree; natural loop]

Limitations of natural loops
- The notion of natural loop is only approximate
- Specifically, consider the case of two natural loops with the same header:
[Figure: two control-flow graphs over nodes B1-B4, one for each of the following loops]
    while (...) if (p) {...} else {...}
    while (...) if (i<j) {...; i++;} else if (i>j) {...; i--;}
- What if p is loop invariant?

Nested loops?
- In general, when there is a shared header, we will consider this a single loop
[Figure: control-flow graph over nodes A, B, C, D and its loop-nest tree, with root A,B,C,D and child B,C]
- In a loop-nest tree, each node represents the blocks of a loop, and parent nodes are enclosing loops
- The leaves of the tree are the inner-most loops

Computing dominators
- Observe: if a dom b, then
    - a = b, or
    - b has exactly one immediate predecessor p and a dom p (in particular, a may be p itself), or
    - b has more than one immediate predecessor, all of which are dominated by a
[Figure: node b with immediate predecessors p1 and p2, all dominated by a]

$$\mathrm{dom}(b) = \{b\} \cup \bigcap_{p \in \mathrm{pred}(b)} \mathrm{dom}(p)$$

Simple algorithm
    dom(entry) = {entry}
    dom(n) = all nodes, for each n ≠ entry
    changed = true
    while (changed) {
      changed = false
      for each n ≠ entry {
        old = dom(n)
        dom(n) = {n} ∪ ⋂ dom(p), over all p ∈ pred(n)
        if dom(n) ≠ old then changed = true
      }
    }
    return dom

Computing idom
    idom(n) =
      D = all nodes s such that s sdom n
      for each x ∈ D {
        for each y ∈ D - {x} {
          if y ∈ sdom(x) then D = D - {y}
        }
      }
      return D

Better algorithms
- Computing dominators is a classic problem in the study of algorithms
- The idom algorithm presented here runs in O(e·n²), for a graph with n nodes and e edges
- Lengauer and Tarjan, in 1979, presented algorithms that run in O(e·log(n)) or better

Loop optimizations: Hoisting of loop-invariant computations

Loop-invariant computations
- A definition
      t = x ⊕ y
  in a loop is loop-invariant if each operand (x and y) satisfies one of the following:
    - it is a constant, or
    - all of its reaching definitions are outside the loop, or
    - only one definition reaches it, and that definition is loop-invariant

Hoisting
- In order to “hoist” a loop-invariant computation out of a loop, we need a place to put it
- We could copy it to all immediate predecessors of the loop header...
- ...But we can avoid code duplication by inserting a new block, called the pre-header
[Figure: pre-headers. Two loops over nodes A, B, C, D, with new pre-header blocks A’ and B’ inserted before the loop headers]

Hoisting conditions
- For a loop-invariant definition
      d: t = x ⊕ y
  we can hoist d into the loop’s pre-header if
    1. d’s block dominates all loop exits at which t is live-out, and
    2. there is only one definition of t in the loop, and
    3. t is not live-out of the pre-header

We need to be careful...
- All hoisting conditions must be satisfied!

OK:
    L0: t = 0
    L1: i = i + 1
        t = a * b
        M[i] = t
        if i<N goto L1
    L2: x = t

Violates 1, 3:
    L0: t = 0
    L1: if i>=N goto L2
        i = i + 1
        t = a * b
        M[i] = t
        goto L1
    L2: x = t

Violates 2:
    L0: t = 0
    L1: i = i + 1
        t = a * b
        M[i] = t
        t = 0
        M[j] = t
        if i<N goto L1
    L2:

Next time...
- Induction-variable elimination
- Bounds-checking
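
As a rough illustration of the loop-finding steps in this lecture, the following Python sketch runs the simple iterative dominator algorithm, picks out back edges (an edge n → h where h dom n), and collects the natural loop of each back edge by walking predecessors backward from n until h is reached. The function names and the small graph `CFG` are assumptions made for this example, not part of the lecture; the graph is chosen so that one loop nests inside another. The sketch also assumes every node is reachable from the entry node.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[str]]  # node -> list of successor nodes


def predecessors(succ: Graph) -> Dict[str, Set[str]]:
    """Invert the successor map into a predecessor map."""
    pred: Dict[str, Set[str]] = {n: set() for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    return pred


def compute_dominators(succ: Graph, entry: str) -> Dict[str, Set[str]]:
    """Iterate dom(n) = {n} | intersection of dom(p) over preds p, to a fixed point."""
    nodes = set(succ)
    pred = predecessors(succ)
    dom = {n: set(nodes) for n in nodes}  # start from "all nodes"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def back_edges(succ: Graph, dom: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """An edge n -> h is a back edge if h dominates n."""
    return sorted((n, h) for n in succ for h in succ[n] if h in dom[n])


def natural_loop(succ: Graph, edge: Tuple[str, str]) -> Set[str]:
    """Nodes that can reach n without passing through h, plus h itself."""
    n, h = edge
    pred = predecessors(succ)
    loop = {h}          # seeding with h stops the backward walk at the header
    stack = [n]
    while stack:
        x = stack.pop()
        if x not in loop:
            loop.add(x)
            stack.extend(pred[x])
    return loop


# Hypothetical graph: outer loop A..D (back edge D -> A), inner loop B, C (back edge C -> B).
CFG: Graph = {
    "entry": ["A"],
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["B"],
    "D": ["A", "exit"],
    "exit": [],
}

if __name__ == "__main__":
    dom = compute_dominators(CFG, "entry")
    for edge in back_edges(CFG, dom):
        print(edge, sorted(natural_loop(CFG, edge)))
```

On this graph the two back edges are C → B and D → A, and their natural loops are {B, C} and {A, B, C, D}, so the first loop is nested inside the second. Recomputing the predecessor map in each call keeps the functions independent; a real compiler pass would more likely compute it once and share it.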
