Instruction set
Processors
Non-pipelined Processor
Non-pipelined processor rule
Processor Pipelines and FIFOs
SFIFO (glue between stages)
Two-Stage Pipeline
Instructions & Templates
Rules for Add
Fetch & Decode Rule: Reexamined
Fetch & Decode Rule: corrected
Rules for Branch
The Stall Signal
Parameterization: The Stall Function
The findf function
Fetch & Decode Rule
Fetch & Decode Rule another style
Execute Rule
Searchable FIFOs and Concurrency Issues
One-Element Searchable FIFO
Two-Element Searchable FIFO
Concurrency requirements
Intra-Rule Requirements
Inter-rule concurrency requirements: case analysis
Numbering the methods worksheet
Numbering the methods

L12-1: Bluespec-6: Modeling Processors

Arvind
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology
March 10, 2005  http://csg.csail.mit.edu/6.375/

L12-2: Instruction set

    typedef enum {R0, R1, R2, …, R31} RName deriving (Bits, Eq);

    typedef union tagged {
        struct {RName dst; RName src1; RName src2;} Add;
        struct {RName cond; RName addr;} Bz;
        struct {RName dst; RName addr;} Load;
        struct {RName value; RName addr;} Store;
    } Instr deriving (Bits, Eq);

    typedef Bit#(32) Iaddress;
    typedef Bit#(32) Daddress;
    typedef Bit#(32) Value;

An instruction set can be implemented using many different microarchitectures.

L12-3: Processors

Non-pipelined processor
Two-stage pipeline
Performance issues

L12-4: Non-pipelined Processor

[Figure: CPU containing pc, rf and a single fetch & execute stage, with iMem and dMem]

    module mkCPU#(Mem iMem, Mem dMem)();
        Reg#(Iaddress) pc <- mkReg(0);
        RegFile#(RName, Bit#(32)) rf <- mkRegFileFull();
        Instr instr = iMem.read(pc);
        Iaddress predIa = pc + 1;
        rule fetch_Execute ...
    endmodule

L12-5: Non-pipelined processor rule

    rule fetch_Execute (True);
        case (instr) matches
            tagged Add {dst:.rd, src1:.ra, src2:.rb}: begin
                rf.upd(rd, rf[ra] + rf[rb]);
                pc <= predIa;
            end
            tagged Bz {cond:.rc, addr:.ra}: begin
                pc <= (rf[rc] == 0) ? rf[ra] : predIa;
            end
            tagged Load {dst:.rd, addr:.ra}: begin
                rf.upd(rd, dMem.read(rf[ra]));
                pc <= predIa;
            end
            tagged Store {value:.rv, addr:.ra}: begin
                dMem.write(rf[ra], rf[rv]);
                pc <= predIa;
            end
        endcase
    endrule

My syntax: rf[r] stands for rf.sub(r).

Assume "magic memory", i.e. memory that responds to a read request in the same cycle, and in which a write updates the memory at the end of the cycle.

L12-6: Processor Pipelines and FIFOs

[Figure: CPU pipeline diagram; labels: fetch, decode, execute, memory, write-back, pc, rf, iMem, dMem]

L12-7: SFIFO (glue between stages)

    interface SFIFO#(type t, type tr);
        method Action enq(t x);    // enqueue an item
        method Action deq();       // remove oldest entry
        method t first();          // inspect oldest item
        method Action clear();     // make FIFO empty
        method Bool find(tr r);    // search FIFO
    endinterface

n = # of bits needed to represent the values of type "t"
m = # of bits needed to represent the values of type "tr"

[Figure: SFIFO module ports. enq: n-bit data, enab, rdy (not full). deq: enab, rdy (not empty). first: n-bit data, rdy (not empty). clear: enab. find: m-bit argument, Bool result.]

More on searchable FIFOs later.

L12-8: Two-Stage Pipeline

[Figure: CPU with pc and rf; fetch & decode stage and execute stage connected by the FIFO bu]

    module mkCPU#(Mem iMem, Mem dMem)(Empty);
        Reg#(Iaddress) pc <- mkReg(0);
        RegFile#(RName, Bit#(32)) rf <- mkRegFileFull();
        SFIFO#(InstTemplate, RName) bu <- mkSFifo(findf);
        Instr instr = iMem.read(pc);
        Iaddress predIa = pc + 1;
        InstTemplate it = bu.first();
        rule fetch_decode ...
    endmodule

L12-9: Instructions & Templates

    typedef union tagged {
        struct {RName dst; RName src1; RName src2;} Add;
        struct {RName cond; RName addr;} Bz;
        struct {RName dst; RName addr;} Load;
        struct {RName value; RName addr;} Store;
    } Instr deriving (Bits, Eq);

    typedef union tagged {
        struct {RName dst; Value op1; Value op2;} EAdd;
        struct {Value cond; Iaddress tAddr;} EBz;
        struct {RName dst; Daddress addr;} ELoad;
        struct {Value data; Daddress addr;} EStore;
    } InstTemplate deriving (Eq, Bits);

    typedef Bit#(32) Iaddress;
    typedef Bit#(32) Daddress;
    typedef Bit#(32) Value;

L12-10: Rules for Add

[Figure: two-stage pipeline (fetch & decode, bu, execute)]

    // implicit check: bu not full
    rule decodeAdd (instr matches tagged Add {dst:.rd, src1:.ra, src2:.rb});
        bu.enq(tagged EAdd {dst: rd, op1: rf[ra], op2: rf[rb]});
        pc <= predIa;
    endrule

    // implicit check: bu not empty
    rule executeAdd (it matches tagged EAdd {dst:.rd, op1:.va, op2:.vb});
        rf.upd(rd, va + vb);
        bu.deq();
    endrule

L12-11: Fetch & Decode Rule: Reexamined

[Figure: two-stage pipeline (fetch & decode, bu, execute)]

    rule decodeAdd (instr matches tagged Add {dst:.rd, src1:.ra, src2:.rb});
        bu.enq(tagged EAdd {dst: rd, op1: rf[ra], op2: rf[rb]});
        pc <= predIa;
    endrule

Wrong! Instructions in bu may be modifying ra or rb, so the rule must stall.

L12-12: Fetch & Decode Rule: corrected

[Figure: two-stage pipeline (fetch & decode, bu, execute)]

    rule decodeAdd (instr matches tagged Add {dst:.rd, src1:.ra, src2:.rb}
                    &&& !bu.find(ra) &&& !bu.find(rb));
        bu.enq(tagged EAdd {dst: rd, op1: rf[ra], op2: rf[rb]});
        pc <= predIa;
    endrule

L12-13: Rules for Branch

[Figure: two-stage pipeline (fetch & decode, bu, execute)]

    rule decodeBz (instr matches tagged Bz {cond:.rc, addr:.addr}
                   &&& !bu.find(rc) &&& !bu.find(addr));
        bu.enq(tagged EBz {cond: rf[rc], tAddr: rf[addr]});
        pc <= predIa;
    endrule

    rule bzTaken (it matches tagged EBz {cond:.vc, tAddr:.va} &&& (vc == 0));
        pc <= va;
        bu.clear();
    endrule

    rule bzNotTaken (it matches tagged EBz {cond:.vc, tAddr:.va} &&& (vc != 0));
        bu.deq();
    endrule

Rule atomicity ensures that the pc update and the discard of pre-fetched instructions in bu are done consistently.

L12-14: The Stall Signal

    Bool stall =
        case (instr) matches
            tagged Add {dst:.rd, src1:.ra, src2:.rb}:
                return (bu.find(ra) || bu.find(rb));
            tagged Bz {cond:.rc, addr:.addr}:
                return (bu.find(rc) || bu.find(addr));
            tagged Load {dst:.rd, addr:.addr}:
                return (bu.find(addr));
            tagged Store {value:.v, addr:.addr}:
                return (bu.find(v) || bu.find(addr));
        endcase;

Need to extend the FIFO interface with the "find" method, where "find" searches the FIFO using the findf function.

L12-15: Parameterization: The Stall Function

    function Bool stallfunc (Instr instr,
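The slides instantiate the buffer as mkSFifo(findf) and say that find searches the FIFO using findf, but neither definition appears in this excerpt. The following is a minimal sketch of what they could look like, not the lecture's own code. It assumes a one-element buffer, assumes that findf reports whether a buffered template will write the register being looked up, and uses Bit#(5) as a stand-in for the RName enum, whose members the slides elide. The names searchf, data, full and sz are hypothetical.

```bsv
// Stand-in for the slides' RName enum (assumption: 32 registers, 5-bit index).
typedef Bit#(5)  RName;
typedef Bit#(32) Iaddress;
typedef Bit#(32) Daddress;
typedef Bit#(32) Value;

// Decoded instruction templates, as on slide L12-9.
typedef union tagged {
   struct {RName dst;  Value op1; Value op2;} EAdd;
   struct {Value cond; Iaddress tAddr;}       EBz;
   struct {RName dst;  Daddress addr;}        ELoad;
   struct {Value data; Daddress addr;}        EStore;
} InstTemplate deriving (Bits, Eq);

// Searchable FIFO interface, as on slide L12-7.
interface SFIFO#(type t, type tr);
   method Action enq(t x);    // enqueue an item
   method Action deq();       // remove oldest entry
   method t      first();     // inspect oldest item
   method Action clear();     // make FIFO empty
   method Bool   find(tr r);  // search FIFO
endinterface

// Assumed search predicate: True when template `it` will write register r.
// Only EAdd and ELoad carry a destination register.
function Bool findf(RName r, InstTemplate it);
   case (it) matches
      tagged EAdd   {dst: .rd, op1: .va, op2: .vb}: return (r == rd);
      tagged EBz    {cond: .vc, tAddr: .ta}:        return False;
      tagged ELoad  {dst: .rd, addr: .a}:           return (r == rd);
      tagged EStore {data: .v, addr: .a}:           return False;
   endcase
endfunction

// Assumed one-element implementation; searchf is the predicate passed in
// at instantiation, e.g. mkSFifo(findf).
module mkSFifo#(function Bool searchf(tr r, t x))(SFIFO#(t, tr))
   provisos (Bits#(t, sz));

   Reg#(t)    data <- mkRegU();       // the single slot
   Reg#(Bool) full <- mkReg(False);   // True when the slot holds an item

   method Action enq(t x) if (!full); // implicit condition: not full
      data <= x;
      full <= True;
   endmethod

   method Action deq() if (full);     // implicit condition: not empty
      full <= False;
   endmethod

   method t first() if (full);        // implicit condition: not empty
      return data;
   endmethod

   method Action clear();             // always ready
      full <= False;
   endmethod

   method Bool find(tr r);            // never matches when the slot is empty
      return full && searchf(r, data);
   endmethod
endmodule
```

With a single slot, enq is ready only when the slot is empty and deq only when it is full, so in this sketch the decode and execute rules cannot both fire in the same cycle. That limitation is one reason to look at larger searchable FIFOs and their concurrency requirements.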