DOC PREVIEW
MIT 6 375 - Elastic Pipelines-Concurrency Issues

This preview shows page 1-2-14-15-29-30 out of 30 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1Inelastic vs Elastic PipelinesProcessor Pipelines and FIFOsSFIFO (glue between stages)Two-Stage PipelineRules for AddFetch & Decode Rule: ReexaminedFetch & Decode Rule: correctedRules for BranchFetch & Decode RuleThe Stall SignalThe findf functionExecute RuleConcurrencyThe tensionThe compiler issueExecution rules Split the execution rule for analysisConcurrency analysis Add RuleWhat concurrency do we want?Concurrency analysis Branch RulesConcurrency analysis Load-Store RulesProperties Required of Register File and FIFO for Instruction PipeliningOne Element Searchable Pipeline SFIFORegister File concurrency propertiesBypass Register FileSince our rules do not really require a Bypass Register File, the overhead of bypassing can be avoided by simply using the “Config Regfile”Concurrency analysis Two-stage PipelineLot of nontrivial analysis but no change in processor code!BypassingThe stall function for the elastic pipelineElastic Pipelines: Concurrency IssuesArvind Computer Science & Artificial Intelligence LabMassachusetts Institute of TechnologyMarch 3, 2010 L09-1http://csg.csail.mit.edu/6.375Inelastic vs Elastic PipelinesIn a Inelastic pipeline:typically only one rule; the designer controls precisely which activities go on in paralleldownside: The rule can get too complicated -- easy to make a mistake; difficult to make changesIn an Elastic pipeline:several smaller rules, each easy to write, easier to make changesdownside: sometimes rules do not fire concurrently when they shouldEasy: cycle-level concurrencyDifficult: precise functional correctness Easy: functional correctness Difficult: precise cycle-level concurrencyMarch 3, 2010 L09-2http://csg.csail.mit.edu/6.375Processor Pipelines and FIFOsfetchexecute iMemrfCPUdecodememorypcwrite-backdMemIt is better to think in terms of FIFOs as opposed to pipeline registers. March 3, 2010 L09-3http://csg.csail.mit.edu/6.375SFIFO (glue between stages)interface SFIFO#(type t, type tr); method Action enq(t); // enqueue an item method Action deq(); // remove oldest entry method t first(); // inspect oldest item method Action clear(); // make FIFO empty method Bool find(tr); // search FIFOendinterface n = # of bits needed to represent the values of type “t“ m = # of bits needed to represent the values of type “tr"not fullnot emptynot emptyrdyenabnnrdyenabrdyenqdeqfirstSFIFOmoduleclearenabfindmboolmore on searchable FIFOs later March 3, 2010 L09-4http://csg.csail.mit.edu/6.375Two-Stage Pipeline fetch & decodeexecutepcrfCPUbumodule mkCPU#(Mem iMem, Mem dMem)(Empty); Reg#(Iaddress) pc <- mkReg(0);RegFile#(RName, Bit#(32)) rf <- mkRegFileFull();SFIFO#(InstTemplate, RName) bu <- mkSFifo(findf); Instr instr = iMem.read(pc); Iaddress predIa = pc + 1; InstTemplate it = bu.first(); rule fetch_decode ...endmodule March 3, 2010 L09-5http://csg.csail.mit.edu/6.375Rules for Add rule decodeAdd(instr matches Add{dst:.rd,src1:.ra,src2:.rb}) bu.enq (EAdd{dst:rd,op1:rf[ra],op2:rf[rb]}); pc <= predIa;endrulerule executeAdd(it matches EAdd{dst:.rd,op1:.va,op2:.vb}) rf.upd(rd, va + vb); bu.deq();endruleimplicit check:implicit check:fetch & decodeexecutepcrfCPUbubu notfullbu notemptyMarch 3, 2010 L09-6http://csg.csail.mit.edu/6.375Fetch & Decode Rule: Reexamined Wrong! Because instructions in bu may be modifying ra or rbstall !fetch & decodeexecutepcrfCPUburule decodeAdd (instr matches Add{dst:.rd,src1:.ra,src2:.rb}) bu.enq (EAdd{dst:rd, op1:rf[ra], op2:rf[rb]}); pc <= predIa;endruleMarch 3, 2010 L09-7http://csg.csail.mit.edu/6.375Fetch & Decode Rule: correctedfetch & decodeexecutepcrfCPUburule decodeAdd (instr matches Add{dst:.rd,src1:.ra,src2:.rb} bu.enq (EAdd{dst:rd, op1:rf[ra], op2:rf[rb]}); pc <= predIa;endrule&&& !bu.find(ra) &&& !bu.find(rb))March 3, 2010 L09-8http://csg.csail.mit.edu/6.375Rules for Branch rule decodeBz(instr matches Bz{condR:.rc,addrR:.addr}) &&& !bu.find(rc) &&& !bu.find(addr)); bu.enq (EBz{cond:rf[rc],tAddr:rf[addr]}); pc <= predIa; endrulerule bzTaken(it matches EBz{cond:.vc,tAddr:.va}) &&& (vc==0)); pc <= va; bu.clear(); endrulerule bzNotTaken (it matches EBz{cond:.vc,tAddr:.va}) &&& (vc != 0)); bu.deq; endrulefetch & decodeexecutepcrfCPUburule-atomicity ensures thatpc update, anddiscard of pre-fetched instrs in bu, are doneconsistentlyMarch 3, 2010 L09-9http://csg.csail.mit.edu/6.375Fetch & Decode Rule function InstrTemplate newIt(Instr instr); case (instr) matches tagged Add {dst:.rd,src1:.ra,src2:.rb}: return EAdd{dst:rd,op1:rf[ra],op2:rf[rb]}; tagged Bz {condR:.rc,addrR:.addr}: return EBz{cond:rf[rc],tAddr:rf[addr]}; tagged Load {dst:.rd,addrR:.addr}: return ELoad{dst:rd,addrR:rf[addr]}; tagged Store{valueR:.v,addrR:.addr}: return EStore{val:rf[v],addr:rf[addr]}; endcase endfunction rule fetch_and_decode (!stallFunc(instr, bu)); bu.enq(newIt(instr)); pc <= predIa;endrule Same as beforeMarch 3, 2010 L09-10http://csg.csail.mit.edu/6.375The Stall SignalBool stall = stallFunc(instr, bu);This need to search the contents of the FIFO is why we need an SFIFO, not just a FIFOfunction Bool stallFunc (Instr instr, SFIFO#(InstTemplate, RName) bu); case (instr) matches tagged Add {dst:.rd,src1:.ra,src2:.rb}: return (bu.find(ra) || bu.find(rb)); tagged Bz {condR:.rc,addrR:.addr}: return (bu.find(rc) || bu.find(addr)); tagged Load {dst:.rd,addrR:.addr}: return (bu.find(addr)); tagged Store {valueR:.v,addrR:.addr}: return (bu.find(v)) || bu.find(addr)); endcaseendfunctionMarch 3, 2010 L09-11http://csg.csail.mit.edu/6.375The findf functionWhen we make a searchable FIFO we need to supply a function that determines if a register is going to be updated by an instruction templatemkSFifo can be parameterized by such a search functionSFIFO#(InstrTemplate, RName) bu <- mkSFifo(findf);function Bool findf (RName r, InstrTemplate it); case (it) matches tagged EAdd{dst:.rd,op1:.v1,op2:.v2}: return (r == rd); tagged EBz {cond:.c,tAddr:.a}: return (False); tagged ELoad{dst:.rd,addr:.a}: return (r == rd); tagged EStore{val:.v,addr:.a}: return (False); endcase endfunctionSame as beforeMarch 3, 2010 L09-12http://csg.csail.mit.edu/6.375Execute Rulerule execute (True); case (it) matches tagged EAdd{dst:.rd,op1:.va,op2:.vb}: begin rf.upd(rd,


View Full Document

MIT 6 375 - Elastic Pipelines-Concurrency Issues

Documents in this Course
IP Lookup

IP Lookup

15 pages

Verilog 1

Verilog 1

19 pages

Verilog 2

Verilog 2

23 pages

Encoding

Encoding

21 pages

Quiz

Quiz

10 pages

IP Lookup

IP Lookup

30 pages

Load more
Download Elastic Pipelines-Concurrency Issues
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Elastic Pipelines-Concurrency Issues and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Elastic Pipelines-Concurrency Issues 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?