Slides
  Slide 1
  The Plan
  Instruction set
  Tagged Unions: Bit Representation
  Non-pipelined Processor
  Non-pipelined processor rule
  Slide 7
  Two-stage Synchronous Pipeline
  Instructions & Templates
  Fetch & Decode Action: Fills the buReg with a decoded instruction
  Execute Action: Reads buReg and modifies state (rf, dMem, pc)
  Issues with buReg
  Synchronous Pipeline first attempt
  Execute
  Pipeline Hazards
  Synchronous Pipeline corrected
  The Stall Function
  The findf function
  Synchronous Pipelines
  Slide 20
  Processor Pipelines and FIFOs
  SFIFO (glue between stages)
  Two-Stage Pipeline
  Rules for Add
  Fetch & Decode Rule: Reexamined
  Fetch & Decode Rule: corrected
  Rules for Branch
  Fetch & Decode Rule
  The Stall Signal
  Slide 30
  Execute Rule
  Next time -- Bypassing

L07-1: Modeling Processors
Arvind
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology
February 18, 2009
http://csg.csail.mit.edu/6.375

L07-2: The Plan
  - Non-pipelined processor
  - Two-stage synchronous pipeline
  - Two-stage asynchronous pipeline
Some understanding of simple processor pipelines is needed to follow this lecture.

L07-3: Instruction set

    typedef enum {R0, R1, R2, …, R31} RName;

    typedef union tagged {
      struct {RName dst;   RName src1; RName src2;} Add;
      struct {RName cond;  RName addr;}             Bz;
      struct {RName dst;   RName addr;}             Load;
      struct {RName value; RName addr;}             Store;
    } Instr deriving (Bits, Eq);

    typedef Bit#(32) Iaddress;
    typedef Bit#(32) Daddress;
    typedef Bit#(32) Value;

An instruction set can be implemented using many different microarchitectures.

L07-4: Tagged Unions: Bit Representation

    tag  fields
    00   dst   src1  src2     (Add)
    01   cond  addr           (Bz)
    10   dst   addr           (Load)
    11   dst   imm            (AddImm)

    typedef union tagged {
      struct {RName dst;  RName src1; RName src2;} Add;
      struct {RName cond; RName addr;}             Bz;
      struct {RName dst;  RName addr;}             Load;
      struct {RName dst;  Immediate imm;}          AddImm;
    } Instr deriving (Bits, Eq);

The representation is derived automatically; it can be customized with user-written pack and unpack functions.

L07-5: Non-pipelined Processor
[Figure: block diagram labeled CPU, pc, rf, fetch & execute, iMem, dMem]

    module mkCPU#(Mem iMem, Mem dMem)();
      Reg#(Iaddress)            pc <- mkReg(0);
      RegFile#(RName, Bit#(32)) rf <- mkRegFileFull();
      Instr    instr  = iMem.read(pc);
      Iaddress predIa = pc + 1;
      rule fetch_Execute ...
    endmodule

L07-6: Non-pipelined processor rule

    rule fetch_Execute (True);
      case (instr) matches
        tagged Add {dst:.rd, src1:.ra, src2:.rb}: begin
          rf.upd(rd, rf[ra] + rf[rb]);
          pc <= predIa;
        end
        tagged Bz {cond:.rc, addr:.ra}: begin
          pc <= (rf[rc] == 0) ? rf[ra] : predIa;
        end
        tagged Load {dst:.rd, addr:.ra}: begin
          rf.upd(rd, dMem.read(rf[ra]));
          pc <= predIa;
        end
        tagged Store {value:.rv, addr:.ra}: begin
          dMem.write(rf[ra], rf[rv]);
          pc <= predIa;
        end
      endcase
    endrule

Notes:
  - The case arms use pattern matching.
  - Notation (my syntax): rf[r] means rf.sub(r).
  - Assume "magic memory", i.e., it responds to a read request in the same cycle, and a write updates the memory at the end of the cycle.

L07-7: The Plan
  - Non-pipelined processor
  - Two-stage synchronous pipeline
  - Two-stage asynchronous pipeline

L07-8: Two-stage Synchronous Pipeline
[Figure: fetch & decode stage and execute stage connected through buReg; state elements pc, rf, dMem]

    time     t0   t1   t2   t3   t4   t5   t6   t7  . . .
    FDstage  FD1  FD2  FD3  FD4  FD5
    EXstage       EX1  EX2  EX3  EX4  EX5

Actions to be performed in parallel every cycle:
  - Fetch Action: Decodes the instruction at the current pc, fetches operands from the register file, and stores the result in buReg.
  - Execute Action: Performs the action specified in buReg and updates the processor state (pc, rf, dMem).

L07-9: Instructions & Templates

    typedef union tagged {
      struct {RName dst;   RName src1; RName src2;} Add;
      struct {RName cond;  RName addr;}             Bz;
      struct {RName dst;   RName addr;}             Load;
      struct {RName value; RName addr;}             Store;
    } Instr deriving (Bits, Eq);

    typedef union tagged {
      struct {RName dst;  Value op1; Value op2;} EAdd;
      struct {Value cond; Iaddress tAddr;}       EBz;
      struct {RName dst;  Daddress addr;}        ELoad;
      struct {Value data; Daddress addr;}        EStore;
    } InstTemplate deriving (Eq, Bits);

buReg contains instruction templates, i.e., decoded instructions.

L07-10: Fetch & Decode Action: Fills the buReg with a decoded instruction

    function InstTemplate newIt(Instr instr);
      case (instr) matches
        tagged Add {dst:.rd, src1:.ra, src2:.rb}:
          return tagged EAdd {dst:rd, op1:rf[ra], op2:rf[rb]};
        tagged Bz {cond:.rc, addr:.addr}:
          return tagged EBz {cond:rf[rc], tAddr:rf[addr]};
        tagged Load {dst:.rd, addr:.addr}:
          return tagged ELoad {dst:rd, addr:rf[addr]};
        tagged Store {value:.v, addr:.addr}:
          return tagged EStore {data:rf[v], addr:rf[addr]};
      endcase
    endfunction

    buReg <= newIt(instr);

Note: no extra gates!

L07-11: Execute Action: Reads buReg and modifies state (rf, dMem, pc)

    case (buReg) matches
      tagged EAdd {dst:.rd, op1:.va, op2:.vb}: begin
        rf.upd(rd, va + vb);
        pc <= predIa;
      end
      tagged ELoad {dst:.rd, addr:.av}: begin
        rf.upd(rd, dMem.read(av));
        pc <= predIa;
      end
      tagged EStore {data:.vv, addr:.av}: begin
        dMem.write(av, vv);
        pc <= predIa;
      end
      tagged EBz {cond:.cv, tAddr:.av}:
        if (cv != 0)
          pc <= predIa;
        else begin
          pc <= av;
          Invalidate buReg    // pseudocode: what does this mean?
        end
    endcase

L07-12: Issues with buReg
[Figure: fetch & decode stage and execute stage connected through buReg; state elements pc, rf, dMem]

buReg may not always contain an instruction. Why?
  - Start cycle
  - The execute stage may kill the fetched instruction because of a branch misprediction
Maybe type to the rescue …

Can't update buReg in two concurrent actions:
    fetchAction; executeAction
Fold them together.

L07-13: Synchronous Pipeline first attempt
[Figure: CPU with fetch & decode stage, execute stage, buReg, pc, rf]

    rule SyncTwoStage (True);
      let instr  = iMem.read(pc);
      let predIa = pc + 1;
      Action fetchAction =
        action
          buReg <= tagged Valid newIt(instr);
          pc <= predIa;
        endaction;
      case (buReg) matches
        … calls fetchAction or puts Invalid in buReg …
      endcase
    endrule

L07-14: Execute

    case (buReg) matches
      tagged Valid .it: