DOC PREVIEW
MIT 6 375 - Study Notes

This preview shows page 1-2-3-26-27-28 out of 28 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 28 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1Pipelining a blockInelastic PipelineWhen is this rule enabled?Pipelining a blockFolded pipelineSuperfolded pipeline One Bfly-4 caseSuperfolded pipeline: stage function fCode for the Superfolded pipeline stage functionFolded pipeline: stage function fSlide 11Slide 12Why are the areas so similarElastic pipeline Use FIFOs instead of pipeline registersWhat behavior do we want?One-Element FIFOConcurrency when the FIFOs do not permit concurrent enq and deqInelastic vs Elastic PipelinesThe tensionLanguage notesSlide 21Syntax: Vector of RegistersMaking guards explicitImplicit guards (conditions)Guards vs If’sStatic vs dynamic expressionsGeneralization: n-stage pipelineSlide 28Simple Inelastic and Folded PipelinesArvind Computer Science & Artificial Intelligence LabMassachusetts Institute of TechnologyFebruary 14, 2011 L04-1http://csg.csail.mit.edu/6.375Pipelining a blockinQ outQf2f1f3CombinationalCinQ outQf2f1f3PipelinePinQ outQfFolded PipelineFPClock?Area? Throughput?Clock: C < P  FP Area: FP < C < P Throughput: FP < C < PFebruary 14, 2011L04-2http://csg.csail.mit.edu/6.375Inelastic PipelinexsReg1inQf1 f2 f3sReg2 outQrule sync-pipeline (True); if (inQ.notEmpty()) begin sReg1 <= tagged Valid f1(inQ.first()); inQ.deq();end else sReg1 <= tagged Invalid; case (sReg1) matches tagged Valid .sx1: sReg2 <= tagged Valid f2(sx1); tagged Invalid: sReg2 <= tagged Invalid; endcase case (sReg2) matches tagged Valid .sx2: outQ.enq(f3(sx2)); endcaseendruleBoth Registers hold values of Maybe typeFebruary 14, 2011L04-3http://csg.csail.mit.edu/6.375When is this rule enabled?rule sync-pipeline (True); if (inQ.notEmpty()) begin sReg1 <= tagged Valid f0(inQ.first()); inQ.deq(); end else sReg1 <= tagged Invalid; case (sReg1) matches tagged Valid .sx1: sReg2 <= tagged Valid f1(sx1); tagged Invalid: sReg2 <= tagged Invalid; endcase case (sReg2) matches tagged Valid .sx2: outQ.enq(f2(sx2)); endcaseendruleinQ sReg1 sReg2 outQNE V V NFNE V V FNE V I NFNE V I FNE I V NFNE I V FNE I I NFNE I I FE V V NFE V V FE V I NFE V I FE I V NFE I V FE I I NFE I I FyesNoYesYesYesNoYesyesyesNoYesYesYesNoYes1yesYes1 = yes but no changeinQ sReg1 sReg2 outQFebruary 14, 2011L04-4http://csg.csail.mit.edu/6.375Pipelining a blockinQ outQf2f1f3CombinationalCinQ outQf2f1f3PipelinePinQ outQfFolded PipelineFPClock?Area? Throughput?Clock: C < P  FP Area: FP < C < P Throughput: FP < C < PFebruary 14, 2011L04-5http://csg.csail.mit.edu/6.375Folded pipelinerule folded-pipeline (True); if (stage==0) begin sxIn= inQ.first(); inQ.deq(); end else sxIn= sReg; sxOut = f(stage,sxIn); if (stage==n-1) outQ.enq(sxOut); else sReg <= sxOut; stage <= (stage==n-1)? 0 : stage+1;endrulexsReginQfoutQstagenotice stage is a dynamic parameter now!no for-loopNeed type declarations for sxIn and sxOutFebruary 14, 2011L04-6http://csg.csail.mit.edu/6.375Superfolded pipeline One Bfly-4 casef will be invoked for 48 dynamic values of stageeach invocation will modify 4 numbers in sRegafter 16 invocations a permutation would be done on the whole sRegFebruary 14, 2011L04-7http://csg.csail.mit.edu/6.375Superfolded pipeline: stage function ffunction Vector#(64, Complex) stage_f (Bit#(2) stage, Vector#(64, Complex) stage_in); begin for (Integer i = 0; i < 16; i = i + 1) begin Bit#(2) stage Integer idx = i * 4; let twid = getTwiddle(stage, fromInteger(i)); let y = bfly4(twid, stage_in[idx:idx+3]); stage_temp[idx] = y[0]; stage_temp[idx+1] = y[1]; stage_temp[idx+2] = y[2]; stage_temp[idx+3] = y[3]; end //Permutation for (Integer i = 0; i < 64; i = i + 1) stage_out[i] = stage_temp[permute[i]]; endreturn(stage_out);Bit#(2+4) (stage,i)should be done only when i=15 February 14, 2011L04-8http://csg.csail.mit.edu/6.375Code for the Superfolded pipeline stage functionFunction Vector#(64, Complex) f (Bit#(6) stagei, Vector#(64, Complex) stage_in); let i = stagei `mod` 16; let twid = getTwiddle(stagei `div` 16, i); let y = bfly4(twid, stage_in[i:i+3]); let stage_temp = stage_in; stage_temp[i] = y[0]; stage_temp[i+1] = y[1]; stage_temp[i+2] = y[2]; stage_temp[i+3] = y[3]; let stage_out = stage_temp; if (i == 15) for (Integer i = 0; i < 64; i = i + 1) stage_out[i] = stage_temp[permute[i]]; return(stage_out); endfunctionOne Bfly-4 caseFebruary 14, 2011L04-9http://csg.csail.mit.edu/6.375Folded pipeline: stage function fThe Twiddle constants can be expressed in a table or in a case or nested case expressionstage getTwiddle0 getTwiddle1getTwiddle2twidThe rest of stage_f, i.e. Bfly-4s and permutations (shared)The rest of stage_f, i.e. Bfly-4s and permutations (shared)sxFebruary 14, 2011L04-10http://csg.csail.mit.edu/6.375802.11a Transmitter [MEMOCODE 2006] Dave, Gerding, Pellauer, ArvindDesign Lines of RelativeBlock Code (BSV) AreaController 49 0%Scrambler 40 0%Conv. Encoder 113 0%Interleaver 76 1%Mapper 112 11%IFFT 95 85%Cyc. Extender 23 3%Complex arithmetic libraries constitute another 200 lines of codeFebruary 14, 2011L04-11http://csg.csail.mit.edu/6.375802.11a Transmitter Synthesis results (Only the IFFT block is changing)IFFT Design Area (mm2)Throughput Latency(CLKs/sym)Min. Freq RequiredPipelined 5.25 04 1.0 MHzCombinational 4.91 04 1.0 MHzFolded(16 Bfly-4s)3.97 04 1.0 MHzSuper-Folded(8 Bfly-4s)3.69 06 1.5 MHzSF(4 Bfly-4s) 2.45 12 3.0 MHzSF(2 Bfly-4s) 1.84 24 6.0 MHzSF (1 Bfly4) 1.52 48 12 MHZTSMC .18 micron; numbers reported are before place and route.The same source codeAll these designs were done in less than 24 hours!February 14, 2011L04-12http://csg.csail.mit.edu/6.375Why are the areas so similarFolding should have given a 3x improvement in IFFT areaBUT a constant twiddle allows low-level optimization on a Bfly-4 blocka 2.5x area reduction!February 14, 2011L04-13http://csg.csail.mit.edu/6.375Elastic pipelineUse FIFOs instead of pipeline registersxfifo1inQf1 f2 f3fifo2outQrule stage1 (True); fifo1.enq(f1(inQ.first()); inQ.deq(); endrulerule stage2 (True);fifo2.enq(f2(fifo1.first());  fifo1.deq(); endrulerule stage3 (True); outQ.enq(f3(fifo2.first());  fifo2.deq(); endruleFiring conditions? Can tokens be left inside the pipeline?Easier to write?No Maybe types?Can all three rules fire concurrently?February 14, 2011L04-14http://csg.csail.mit.edu/6.375What behavior do we want?xfifo1inQf1 f2 f3fifo2outQMaximize concurrency - Fire maximum number of rulesinQ fifo1 fifo2


View Full Document

MIT 6 375 - Study Notes

Documents in this Course
IP Lookup

IP Lookup

15 pages

Verilog 1

Verilog 1

19 pages

Verilog 2

Verilog 2

23 pages

Encoding

Encoding

21 pages

Quiz

Quiz

10 pages

IP Lookup

IP Lookup

30 pages

Load more
Download Study Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Study Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Study Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?