February 27, 2008 http://csg.csail.mit.edu/6.375 L09-1

Modeling Processors: Concurrency Issues

Arvind
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology

The Plan

- Two-stage synchronous pipeline ⇐
    - Bypassing issues
- Two-stage asynchronous pipeline
    - Concurrency issues

Some understanding of simple processor pipelines is needed to follow this lecture.

Synchronous Pipeline

[Figure: CPU with fetch & decode and execute stages; labels pc, rf, buReg]

    rule SyncTwoStage (True);
      let instr = iMem.read(pc);
      let stall = stallfuncR(instr, buReg);
      let fetchAction = action
            if (!stall) pc <= predIa;
            buReg <= (stall) ? Invalid : Valid newIt(instr);
          endaction;
      case (buReg) matches
        …
      endcase
    endrule

Synchronous Pipeline

    rule SyncTwoStage (True);
      …
      case (buReg) matches
        tagged Invalid: fetchAction;
        tagged Valid .it: begin
          case (it) matches
            tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin
              rf.upd(rd, va+vb); fetchAction;
            end
            tagged EBz {cond:.cv,addr:.av}:
              if (cv == 0) begin
                pc <= av; buReg <= Invalid;
              end
              else fetchAction;
            tagged ELoad{dst:.rd,addr:.av}: begin
              rf.upd(rd, dMem.read(av)); fetchAction;
            end
            tagged EStore{value:.vv,addr:.av}: begin
              dMem.write(av, vv); fetchAction;
            end
          endcase
        end
      endcase
    endrule

The Stall Function

    function Bool stallfuncR (Instr instr, Maybe#(InstrTemplate) buReg);
      case (buReg) matches
        tagged Invalid: return False;
        tagged Valid .it:
          case (instr) matches
            tagged Add {dst:.rd,src1:.ra,src2:.rb}:
              return (findf(ra,it) || findf(rb,it));
            tagged Bz {cond:.rc,addr:.addr}:
              return (findf(rc,it) || findf(addr,it));
            tagged Load {dst:.rd,addr:.addr}:
              return (findf(addr,it));
            tagged Store {value:.v,addr:.addr}:
              return (findf(v,it) || findf(addr,it));
          endcase
      endcase
    endfunction

The findf function

    function Bool findf (RName r, InstrTemplate it);
      case (it) matches
        tagged EAdd{dst:.rd,op1:.v1,op2:.v2}: return (r == rd);
        tagged EBz {cond:.c,addr:.a}: return (False);
        tagged ELoad{dst:.rd,addr:.a}: return (r == rd);
        tagged EStore{value:.v,addr:.a}: return (False);
      endcase
    endfunction

Bypasses

Bypassing will affect ...

- The newIt function: after decoding, it must read the new register values if available (i.e., the values that are still to be committed in the register file).
- The Stall function: the instruction fetch must not stall if the new value of the register to be read exists. In our specific design we never stall, because the new register value will be available.

The bypassRF function

    function bypassRF(r, tobeCommitted);
      case (tobeCommitted) matches
        tagged Valid {.rd, .v} &&& (r == rd): return (v);
        // Nothing pending, or the pending write targets a different
        // register: fall back to the register file.
        default: return (rf[r]);
      endcase
    endfunction

Modified Decode function

    function InstrTemplate newItBy (instr, tobeCommitted);
      let bRF(x) = bypassRF(x, tobeCommitted);
      case (instr) matches
        tagged Add {dst:.rd,src1:.ra,src2:.rb}:
          return EAdd{dst:rd,op1:bRF(ra),op2:bRF(rb)};
        tagged Bz {cond:.rc,addr:.addr}:
          return EBz{cond:bRF(rc),addr:bRF(addr)};
        tagged Load {dst:.rd,addr:.addr}:
          return ELoad{dst:rd,addr:bRF(addr)};
        tagged Store{value:.v,addr:.addr}:
          return EStore{value:bRF(v),addr:bRF(addr)};
      endcase
    endfunction

Replace each register-file read with a call to bypassRF, which returns the newly written value if one exists.

Synchronous Pipeline with bypassing

    rule SyncTwoStage (True);
      let instr = iMem.read(pc);
      let stall = newstallfuncR(instr, buReg);
      let fetchAction(tobeCommitted) = action
            if (!stall) pc <= predIa;
            buReg <= (stall) ?
                     Invalid : Valid newItBy(instr, tobeCommitted);
          endaction;
      case (buReg) matches
        …
      endcase
    endrule

Synchronous Pipeline with bypassing

    rule SyncTwoStage (True);
      …
      case (buReg) matches
        tagged Invalid: fetchAction(Invalid);
        tagged Valid .it: begin
          case (it) matches
            tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin
              let v = va + vb;
              rf.upd(rd, v); fetchAction(Valid tuple2(rd,v));
            end
            tagged EBz {cond:.cv,addr:.av}:
              if (cv == 0) begin
                pc <= av; buReg <= Invalid;
              end
              else fetchAction(Invalid);
            tagged ELoad{dst:.rd,addr:.av}: begin
              let v = dMem.read(av);
              rf.upd(rd, v); fetchAction(Valid tuple2(rd,v));
            end
            tagged EStore{value:.vv,addr:.av}: begin
              dMem.write(av, vv); fetchAction(Invalid);
            end
          endcase
        end
      endcase
    endrule

The New Stall Function

    function Bool newstallfuncR (Instr instr, Reg#(Maybe#(InstrTemplate)) buReg);
      case (buReg) matches
        tagged Invalid: return False;
        tagged Valid .it:
          case (instr) matches
            tagged Add {dst:.rd,src1:.ra,src2:.rb}:
              return (findf(ra,it) || findf(rb,it));
            …

Previously we stalled when ra matched the destination register of the instruction in the execute stage. Now we bypass that information when we read, so no stall is necessary:
              return (False);

The Plan

- Two-stage synchronous pipeline
    - Bypassing issues
- Two-stage asynchronous pipeline ⇐
    - Concurrency issues

Some understanding of simple processor pipelines is needed to follow this lecture.

Two-stage Pipeline

[Figure: CPU with fetch & decode and execute stages; labels pc, rf, bu]

    rule fetch_and_decode (!stallfunc(instr, bu));
      bu.enq(newIt(instr, rf));
      pc <= predIa;
    endrule

    rule execute (True);
      case (it) matches
        tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin
          rf.upd(rd, va+vb); bu.deq();
        end
        tagged EBz {cond:.cv,addr:.av}:
          if (cv == 0) begin
            pc <= av; bu.clear();
          end
          else bu.deq();
        tagged ELoad{dst:.rd,addr:.av}: begin
          rf.upd(rd, dMem.read(av)); bu.deq();
        end
        tagged EStore{value:.vv,addr:.av}: begin
          dMem.write(av, vv); bu.deq();
        end
      endcase
    endrule

Can these rules fire concurrently? Does it matter?

The tension

- If the two rules never fire in the same cycle, then the machine can hardly be called a pipelined machine.
- If both rules fire in every cycle in which they are enabled, then wrong results would be produced.

The compiler issue

- In some situations correctness of the design is not enough; the design is not done unless the performance goals are met.
- Can the compiler detect all