DOC PREVIEW
MIT 6 375 - Simple Inelastic and Folded Pipeline

This preview shows page 1-2-3-4-5 out of 14 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1Simple Inelastic and Folded PipelinesFolded PipelinesArvind Computer Science & Artificial Intelligence LabMassachusetts Institute of TechnologyFebruary 14, 2011 L04-1http://csg.csail.mit.edu/6.375Pipelining a blockf2f1f3CombinationalCinQ outQCinQ outQf2f1f3PipelinePinQ outQfFolded PipelineFPClock?Area? Throughput?Clock: C < P  FPArea: FP < C < P Throughput: FP < C < PFebruary 14, 2011L04-2http://csg.csail.mit.edu/6.3752Inelastic Pipelinef1f2f3Both Registers hold values of xsReg1inQf1f2f3sReg2 outQrule sync-pipeline (True);if (inQ.notEmpty())begin sReg1 <= tagged Valid f1(inQ.first()); inQ.deq();endelse sReg1 <= taggedInvalid;hold values of Maybe typecase (sReg1) matchestagged Valid .sx1: sReg2 <= tagged Valid f2(sx1);tagged Invalid: sReg2 <= tagged Invalid; endcasecase (sReg2) matches tagged Valid .sx2: outQ.enq(f3(sx2)); endcaseendruleFebruary 14, 2011L04-3http://csg.csail.mit.edu/6.375When is this rule enabled?rule sync-pipeline (True);if (inQ.notEmpty())begin sReg1 <= tagged Valid f0(inQ.first()); inQ.deq(); endelse sReg1 <= tagged Invalid;ggg;case (sReg1) matchestagged Valid .sx1: sReg2 <= tagged Valid f1(sx1);tagged Invalid: sReg2 <= tagged Invalid; endcasecase (sReg2) matches tagged Valid .sx2: outQ.enq(f2(sx2)); endcaseendruleinQ sReg1 sReg2 outQNEVVNFNEVVFEVVNFEVVFyesNoYes inQ sReg1 sReg2 outQNEVINFNEVIFNEIVNFNEIVFNEIINFNEIIFEVINFEVIFEIVNFEIVFEIINFEIIFFebruary 14, 2011L04-4http://csg.csail.mit.edu/6.3753Pipelining a blockf2f1f3CombinationalCinQ outQCinQ outQf2f1f3PipelinePinQ outQfFolded PipelineFPClock?Area? Throughput?Clock: C < P  FPArea: FP < C < P Throughput: FP < C < PFebruary 14, 2011L04-5http://csg.csail.mit.edu/6.375Folded pipelinefrule folded-pipeline (True);if (stage==0) begin sxIn= inQ.first(); inQ.deq(); endlIRxsReginQoutQstage else sxIn= sReg; sxOut = f(stage,sxIn);if (stage==n-1) outQ.enq(sxOut);else sReg <= sxOut;stage <= (stage==n-1)? 0 : stage+1;endrulenotice stageis a dynamic parameter now!no for-loopNeed type declarations for sxIn and sxOutFebruary 14, 2011L04-6http://csg.csail.mit.edu/6.3754Superfolded pipelineOne Bfly-4 casef will be invoked for 48 dynamic f will be invoked for 48 dynamic values of stage each invocation will modify 4 numbers in sReg after 16 invocations a permutation would be done on the whole sRegwould be done on the whole sRegFebruary 14, 2011L04-7http://csg.csail.mit.edu/6.375Superfolded pipeline: stage function ffunction Vector#(64, Complex) stage_f(Bit#(2) stage, Vector#(64, Complex) stage_in);beginfor (Integer i = 0; i < 16; i = i + 1)begin Bit#(2) stageInteger idx = i * 4;let twid = getTwiddle(stage, fromInteger(i));let y = bfly4(twid, stage_in[idx:idx+3]);stage_temp[idx] = y[0]; stage_temp[idx+1] = y[1];stage_temp[idx+2] = y[2]; stage_temp[idx+3] = y[3];endend//Permutationfor (Integer i = 0; i < 64; i = i + 1)stage_out[i] = stage_temp[permute[i]];endreturn(stage_out);February 14, 2011L04-8http://csg.csail.mit.edu/6.3755Code for the Superfolded pipeline stage functionFunction Vector#(64, Complex) f(Bit#(6) stagei, Vector#(64, Complex) stage_in);leti=stagei`mod`16;leti= stageimod 16;let twid = getTwiddle(stagei `div` 16, i);let y = bfly4(twid, stage_in[i:i+3]);let stage_temp = stage_in;stage_temp[i] = y[0];stage_temp[i+1] = y[1];stage_temp[i+2] = y[2];stage temp[i+3] = y[3];One Bfly-4 caseg_ p[]y[];let stage_out = stage_temp;if (i == 15)for (Integer i = 0; i < 64; i = i + 1)stage_out[i] = stage_temp[permute[i]];return(stage_out); endfunctionFebruary 14, 2011L04-9http://csg.csail.mit.edu/6.375Folded pipeline: stage function fgetTwiddle0Th t f stagegetTwiddle0 getTwiddle1getTwiddle2twid(shared)The rest of stage_f, i.e. Bfly-4s and permutations (shared)sThe Twiddle constants can be expressed in a table or in a case or nested case expressionsxFebruary 14, 2011L04-10http://csg.csail.mit.edu/6.3756802.11a Transmitter [MEMOCODE 2006] Dave, Gerding, Pellauer, ArvindDesign Lines of RelativeBlock Code (BSV)AreaBlock Code (BSV)AreaController 49 0%Scrambler 40 0%Conv. Encoder 113 0%Interleaver 76 1%Mapper 112 11%Mapper 112 11%IFFT 95 85%Cyc. Extender 23 3%Complex arithmetic libraries constitute another 200 lines of codeFebruary 14, 2011L04-11http://csg.csail.mit.edu/6.375802.11a Transmitter Synthesis results (Only the IFFT block is changing)IFFT Design Area (mm2)Throughput Latency(CLKs/sym)Min. Freq Required(CLKs/sym)Pipelined 5.25 04 1.0 MHzCombinational 4.91 04 1.0 MHzFolded(16 Bfly-4s)3.97 04 1.0 MHzSuper-Folded 3.69 06 1.5 MHzThe All these designs were done in less than 24 hours!(8 Bfly-4s)SF(4 Bfly-4s) 2.45 12 3.0 MHzSF(2 Bfly-4s) 1.84 24 6.0 MHzSF (1 Bfly4) 1.52 48 12 MHZTSMC .18 micron; numbers reported are before place and route.same source codeFebruary 14, 2011L04-12http://csg.csail.mit.edu/6.3757Why are the areas so similarFolding should have given a 3x Folding should have given a 3x improvement in IFFT areaFebruary 14, 2011L04-13http://csg.csail.mit.edu/6.375Elastic pipelineUse FIFOs instead of pipeline registersf1f2f3xfifo1inQf1f2f3fifo2 outQrule stage1 (True); fifo1.enq(f1(inQ.first()); inQ.deq(); endrulerulestage2 (True);Firing conditions? Can tokens be left inside the pipeline?rulestage2 (True);fifo2.enq(f2(fifo1.first());  fifo1.deq(); endrulerule stage3 (True); outQ.enq(f3(fifo2.first());  fifo2.deq(); endruleEasier to write?No Maybe types?Can all three rules fire concurrently?February 14, 2011L04-14http://csg.csail.mit.edu/6.3758What behavior do we want?f1f2f3xfifo1inQf1f2f3fifo2 outQinQ fifo1 fifo2 outQNE NE,NF NE,NF NFNE NE,NF NE,NF FNENE NFNE FNFYes Yes YesYes Yes Norule1 rule2 rule3NENE,NFNE,FNFNE NE,NF NE,F F….February 14, 2011L04-15http://csg.csail.mit.edu/6.375module mkFIFO1 (FIFO#(t));Reg#(t) data <- mkRegU(); #()()One-Element FIFOReg#(Bool) full <-mkReg(False);method Action enq(t x) if (!full);full <= True; data <= x;endmethodmethod Action deq() if (full);full <= False;endmethodthdtfi t()if(fll)nmethod t first() if (full);return (data);endmethodmethod Action clear();full <= False;endmethodendmodulenot emptynot fullrdyenabrdyenabenqdeqFIFOmoduleJanuary 18, 2011L4-16http://csg.csail.mit.edu/SNU9Concurrency when the FIFOs do not permit concurrent enq and deqf1f2f3xfifo1inQf1f2f3fifo2 outQnot emptynot empty&not fullnot empty&not fullnot fullJanuary 18, 2011L4-17http://csg.csail.mit.edu/SNUInelastic vs Elastic PipelinesIn an Inelastic pipeline:pp typically only one rule; the


View Full Document

MIT 6 375 - Simple Inelastic and Folded Pipeline

Documents in this Course
IP Lookup

IP Lookup

15 pages

Verilog 1

Verilog 1

19 pages

Verilog 2

Verilog 2

23 pages

Encoding

Encoding

21 pages

Quiz

Quiz

10 pages

IP Lookup

IP Lookup

30 pages

Load more
Download Simple Inelastic and Folded Pipeline
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Simple Inelastic and Folded Pipeline and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Simple Inelastic and Folded Pipeline 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?