DOC PREVIEW
CMU CS 15745 - Software Pipelining

This preview shows page 1-2-3-4-5 out of 15 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

15-745 © Seth Copen Goldstein 2000-5 115-745Software Pipelining15-745 Optimizing CompilerSpring 2006Peter LeeCopyright © Seth Copen Goldstein 2000-5(some slides borrowed from M. Voss)15-745 © Seth Copen Goldstein 2000-5 2Trace Schedulingx>10?a=b+cd=a-3 a=b+cf=a+3a=b+cx>10?d=a-3 f=a+3a=e*fd=a-3a=b+cx=x+1a=e*fd=a-3……a=b+cx=x+1d=a-315-745 © Seth Copen Goldstein 2000-5 3Software Pipelining• Software pipelining is an IS technique that reordersthe instructions in a loop.– Possibly moving instructions from one iteration tothe previous or the next iteration.– Very large improvements in running time are possible.• The first serious approach to software pipelining waspresented by Aiken & Nicolau.– Aiken’s 1988 Ph.D. thesis.– Impractical as it ignores resource hazards (focusingonly on data-dependence constraints).• But sparked a large amount of follow-on research.15-745 © Seth Copen Goldstein 2000-5 4Goal of SP• Increase distance between dependentoperations by moving destination operation to alater iterationA: a ! ld [d]B: b ! a * aC: st [d], bD: d ! d + 4Assume all have latency of 2BA C D15-745 © Seth Copen Goldstein 2000-5 5Can we decrease the latency?• Lets unrollA: a ! ld [d]B: b ! a * aC: st [d], bD: d ! d + 4A1: a ! ld [d]B1: b ! a * aC1: st [d], bD1: d ! d + 4DCBA B1A1 C1 D115-745 © Seth Copen Goldstein 2000-5 6Rename variablesA: a ! ld [d]B: b ! a * aC: st [d], bD: d1 ! d + 4A1: a1 ! ld [d1]B1: b1 ! a1 * a1C1: st [d1], b1D1: d ! d1 + 4DCBA B1A1 C1 D115-745 © Seth Copen Goldstein 2000-5 7ScheduleA: a ! ld [d]B: b ! a * aC: st [d], bD: d1 ! d + 4A1: a1 ! ld [d1]B1: b1 ! a1 * a1C1: st [d1], b1D1: d ! d1 + 4ABCDA1B1C1D1B1CA1BDAC1D115-745 © Seth Copen Goldstein 2000-5 8Unroll Some MoreA: a ! ld [d]B: b ! a * aC: st [d], bD: d1 ! d + 4A1: a1 ! ld [d1]B1: b1 ! a1 * a1C1: st [d1], b1D1: d2 ! d1 + 4A2: a2 ! ld [d2]B2: b2 ! a2 * a2C2: st [d2], b2D2: d ! d2 + 4ABCDA1B1C1D1A2B2C2D2C1B1A1DCA2BD1AC2B2D215-745 © Seth Copen Goldstein 2000-5 9Unroll Some MoreA: a ! ld [d]B: b ! a * aC: st [d], bD: d1 ! d + 4A1: a1 ! ld [d1]B1: b1 ! a1 * a1C1: st [d1], b1D1: d2 ! d1 + 4A2: a2 ! ld [d2]B2: b2 ! a2 * a2C2: st [d2], b2D2: d ! d2 + 4ABCDA1B1C1D1A2B2C2D3C2B2A2D1C1B1A1DA3CD2BAC3B3D3D2A3B3C315-745 © Seth Copen Goldstein 2000-5 10One More TimeABCDA1B1C1D1A2B2C2D3C3B3A3D2B4C2B2A2D1C1B1A1DCD3BAC4A4D4D2A3B3C3A4B4C415-745 © Seth Copen Goldstein 2000-5 11Can RearrangeABCDA1B1C1D1A2B2C2D3C3B3A3D2B4C2B2A2D1C1B1A1DCD3BAC4A4D4D2A3B3C3A4B4C415-745 © Seth Copen Goldstein 2000-5 12RearrangeABCDA1B1C1D1A2B2C2D3B3C2B2A2D1C1B1A1DD2CBAC3A3D3D2A3B3C3A: a ! ld [d]B: b ! a * aC: st [d], bD: d1 ! d + 4A1: a1 ! ld [d1]B1: b1 ! a1 * a1C1: st [d1], b1D1: d2 ! d1 + 4A2: a2 ! ld [d2]B2: b2 ! a2 * a2C2: st [d2], b2D2: d ! d2 + 415-745 © Seth Copen Goldstein 2000-5 13RearrangeABCDA1B1C1D1A2B2C2D3B3C2B2A2D1C1B1A1DD2CBAC3A3D3D2A3B3C3A: a ! ld [d]B: b ! a * aC: st [d], bD: d1 ! d + 4A1: a1 ! ld [d1]B1: b1 ! a1 * a1C1: st [d1], b1D1: d2 ! d1 + 4A2: a2 ! ld [d2]B2: b2 ! a2 * a2C2: st [d2], b2D2: d ! d2 + 415-745 © Seth Copen Goldstein 2000-5 14SP LoopA: a ! ld [d]B: b ! a * aD: d1 ! d + 4A1: a1 ! ld [d1]D1: d2 ! d1 + 4C: st [d], bB1: b1 ! a1 * a1A2: a2 ! ld [d2]D2: d ! d2 + 4B2: b2 ! a2 * a2C1: st [d1], b1D3: d2 ! d1 + 4C2: st [d2], b2D2A2B1CC2B2A2A2D1C1B1B1A1DD2CBAD2D3CPrologBodyEpilog15-745 © Seth Copen Goldstein 2000-5 15Goal of SP• Increase distance between dependentoperations by moving destination operation to alater iterationABCdependenciesin initial loopABCiteration i i+1 i+2after SP15-745 © Seth Copen Goldstein 2000-5 16ExampleAssume operating on a infinite wide machineA0A1 B0A2 B1 C0A3 B2 C1B3 C2C3A0A1 B0AiBi-1Ci-2BiCi-1Ci15-745 © Seth Copen Goldstein 2000-5 17ExampleAssume operating on a infinite wide machineA0A1 B0AiBi-1Ci-2BiCi-1CiPrologepilogloop body15-745 © Seth Copen Goldstein 2000-5 18for (i=0; i<N; i++){AiBiCi}Dealing with exit conditionsi=0if (i >= N) goto doneA0B0if (i+1 == N) goto lasti=1A1if (i+2 == N) goto epilogi=2loop:AiBi-1Ci-2i++if (i < N) goto loopepilog:BiCi-1last:cidone:15-745 © Seth Copen Goldstein 2000-5 19Loop Unrolling V. SPFor SuperScalar• Loop Unrolling reduces loop overhead• Software Pipelining reduces fill/drain• Best is if you combine themSoftware PipeliningLoop Unrolling# ofoverlappediterationsTime15-745 © Seth Copen Goldstein 2000-5 20Aiken/Nicolau SchedulingStep 1Perform scalar replacement to eliminate memoryreferences where possible.for i:=1 to N do a := j " V[i-1] b := a " f c := e " j d := f " c e := b " d f := U[i] g: V[i] := b h: W[i] := d j := X[i]for i:=1 to N do a := j " b b := a " f c := e " j d := f " c e := b " d f := U[i] g: V[i] := b h: W[i] := d j := X[i]15-745 © Seth Copen Goldstein 2000-5 21Aiken/Nicolau SchedulingStep 2Unroll the loop and compute the data-dependencegraph (DDG).DDG for rolled loop:for i:=1 to N do a := j " b b := a " f c := e " j d := f " c e := b " d f := U[i] g: V[i] := b h: W[i] := d j := X[i]abcdefghj15-745 © Seth Copen Goldstein 2000-5 22Aiken/Nicolau SchedulingStep 2, cont’dDDG for unrolled loop:for i:=1 to N do a := j " b b := a " f c := e " j d := f " c e := b " d f := U[i] g: V[i] := b h: W[i] := d j := X[i]a1b1c1d1e1j1f1g1h1a2b2a3b3g2j2f2c2d2e2h1c315-745 © Seth Copen Goldstein 2000-5 23Aiken/Nicolau SchedulingStep 3Build a tableau of iteration number vs cycle time.acfj fj fj fj fj fjbdegh a cb dg a eh b cg a d b eh g a c b d g a eh b c g d eh iteration 1 2 3 4 5 6123456789101112131415cyclea1b1c1d1e1j1f1g1h1a2b2a3b3g2j2f2c2d2e2h1c315-745 © Seth Copen Goldstein 2000-5 24Aiken/Nicolau SchedulingStep 4Find repeating patterns of instructions.acfj fj fj fj fj fjbdegh a cb dg a eh b cg a d b eh g a c b d g a eh b c g d eh iteration 1 2 3 4 5 6123456789101112131415cycle15-745 © Seth Copen Goldstein 2000-5 25Aiken/Nicolau SchedulingStep 4Find repeating patterns of instructions.acfj fj fj fj fj fjbdegh a cb dg a eh b cg a d b eh g a c b d g a eh b c g d eh iteration 1 2 3 4 5 6123456789101112131415cycle15-745 © Seth Copen Goldstein 2000-5 26Aiken/Nicolau SchedulingStep 5“Coalesce” the


View Full Document

CMU CS 15745 - Software Pipelining

Documents in this Course
Lecture

Lecture

14 pages

Lecture

Lecture

19 pages

Lecture

Lecture

8 pages

Lecture

Lecture

5 pages

Lecture

Lecture

6 pages

lecture

lecture

17 pages

Lecture 3

Lecture 3

12 pages

Lecture

Lecture

17 pages

Lecture

Lecture

18 pages

lecture

lecture

14 pages

lecture

lecture

8 pages

lecture

lecture

5 pages

Lecture

Lecture

19 pages

lecture

lecture

10 pages

Lecture

Lecture

20 pages

Lecture

Lecture

8 pages

Lecture

Lecture

7 pages

lecture

lecture

59 pages

Lecture

Lecture

10 pages

Task 2

Task 2

2 pages

Handout

Handout

18 pages

Load more
Download Software Pipelining
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Software Pipelining and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Software Pipelining 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?