Slide 1Dealing with ConflictsSFIFOProcessor ExampleDecode Rulesome insight into Concurrent rule firingParallel execution reorders reads and writesCorrectnessUpshotSlide 10Processor: ConcurrenciesConcurrency requirements for Full Pipelining – Reg FileConcurrency requirements for Full Pipelining – FIFOsSlide 14From Analysis to DesignConfigReg and RWireSlide 17Processor ReduxConcurrency: RegFileBypassRegFileProcessor ReduxOne Element SFIFO (Naïve)One Element SFIFO (In-Order d2eQ #1)One Element SFIFO (In-Order e2mQ, m2wQ #2)One Element Searchable SFIFO (Pipelined #3)One Element Searchable SFIFO (Pipelined #4)One Element Searchable SFIFO (Pipelined #4)Slide 28Counter Module InterfaceNaïve Counter ExampleCounter ExampleSlide 32Completion buffer: InterfaceIP-Lookup module with the completion bufferIP Lookup rules with completion bufferNaïve Completion BufferCompletion buffer: Interface RequirementsCompletion BufferMarch 9, 2011 http://csg.csail.mit.edu/6.375 L11-1Implementing for Correct ConcurrencyNirav DaveComputer Science & Artificial Intelligence LabMassachusetts Institute of Technologyhttp://csg.csail.mit.edu/6.375March 9, 2011L11-2http://csg.csail.mit.edu/6.375Dealing with ConflictsWhen do conflicts arise?How do we Analyze them?How do we fix them?How do we make sure we’re okay?March 9, 2011L11-3http://csg.csail.mit.edu/6.375SFIFOinterface SFIFO#(type t, type tr, type v); method Action enq(t); // enqueue an item method Action deq(); // remove oldest entry method t first(); // inspect oldest item method Action clear(); // make FIFO empty method Maybe#(v) find(tr); // search FIFOendinterface n = # of bits needed to represent the values of type “t“ m = # of bits needed to represent the values of type “tr“ v = # of bits needed to represent the values of type “v“not fullnot emptynot emptyrdyenabnnrdyenabrdyenqdeqfirstSFIFOmoduleclearenabfindmboolVMarch 9, 2011L11-4http://csg.csail.mit.edu/6.375Processor Examplefetchexecute iMemrfCPUdecodememorypcwrite-backdMem5 – stage Processor. 1 element FIFOs in between stagesLet’s add bypassingMarch 9, 2011L11-5http://csg.csail.mit.edu/6.375Decode Rulerule decode (!newStallFunc(instr, d2eQ, e2mQ, m2wQ)); let fetInst = f2dQ.first(); f2dQ.deq(); match {.ra, .rb} = getRARB(fetInst); let va0 = rf[ra]; let va1 = fromMaybe (m2wQ.find(ra), va0); let va2 = fromMaybe (e2mQ.find(ra), va1); let vb0 = rf[rb]; let vb1 = fromMaybe (m2wQ.find(rb), vb0); let vb2 = fromMaybe (e2mQ.find(rb), vb1); let newInst = case (fetInst) match Add: return (DAdd .va2 .vb2); … endcase; d2eQ.enq(newInst);endruleWhen do we want it to execute?Decode is also correct correct anytime it’s allowed to execute Search through each place in designMarch 9, 2011L11-6http://csg.csail.mit.edu/6.375some insight intoConcurrent rule firingThere are more intermediate states in the rule semantics (a state after each rule step) In the HW, states change only at clock edges RulesHWRi Rj RkclocksrulestepsRiRjRkhttp://csg.csail.mit.edu/6.375March 9, 2011L11-7http://csg.csail.mit.edu/6.375Parallel executionreorders reads and writesIn the rule semantics, each rule sees (reads) the effects (writes) of previous rules In the HW, rules only see the effects from previous clocks, and only affect subsequent clocksRulesHWclocksrulestepsreads writes reads writes reads writesreads writesreads writesreads writes reads writeshttp://csg.csail.mit.edu/6.375March 9, 2011L11-8http://csg.csail.mit.edu/6.375CorrectnessRules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution Consequence: the HW can never reach a state unexpected in the rule semanticsRulesHWRi Rj RkclocksrulestepsRiRjRkhttp://csg.csail.mit.edu/6.375March 9, 2011L11-9http://csg.csail.mit.edu/6.375UpshotGiven the concurrency of method/rules in a system we can determine viable schedulesSome variation do to applicabilityBUT we know what schedule we want (mostly)We should be able to back propagate results to submodulesMarch 9, 2011L11-10http://csg.csail.mit.edu/6.375Determining Concurrency PropertiesMarch 9, 2011L11-11http://csg.csail.mit.edu/6.375Processor: ConcurrenciesIn-order: F < D < E < M < WPipelined W < M < E < D < Ffetchexecute iMemrfCPUdecodememorypcwrite-backdMemhttp://csg.csail.mit.edu/6.375March 9, 2011L11-12http://csg.csail.mit.edu/6.375Concurrency requirements for Full Pipelining – Reg FileIn-Order RF:(D calls sub) < (W calls upd)Pipelined RF:(W calls upd) < (D calls sub)fetchexecuteimemrfCPUdecodememorypcwrite-backdMemMarch 9, 2011L11-13http://csg.csail.mit.edu/6.375Concurrency requirements for Full Pipelining – FIFOsIn-Order FIFOs:1. m2wQ, e2mQ: find < enq < first < deq2. d2eQ: find < enq < first < deq, clearPipeline FIFOs:3. m2wQ, e2mQ : first < deq < enq < find4. d2eQ : first < deq < find < enqfetchexecuteimemrfCPUdecodememorypcwrite-backdMemMarch 9, 2011L11-14http://csg.csail.mit.edu/6.375Constructing Appropriately concurrent submodulesMarch 9, 2011L11-15http://csg.csail.mit.edu/6.375From Analysis to DesignWe need to create modules which behave as neededConstruct modules using “unsafe” primitives to have “safe” behaviorsThree major concepts:Use primitives which remove “false” concurrency orderings (e.g. ConfigRegs vs. Regs)Add RWires for forwarding values intra-cycleReason carefully to assure that execution appears “atomic”March 9, 2011L11-16http://csg.csail.mit.edu/6.375ConfigReg and RWiremkConfigReg is a Reg without this restrictionmkReg requires that read < writeAllows us to read stale values (dangerous)RWire is a “wire”wset :: a -> Action writeswget :: Maybe#(a) returns written value if read happened.wset happens before wget each cycleMarch 9, 2011L11-17http://csg.csail.mit.edu/6.375Let’s implement some modulesMarch 9, 2011L11-18http://csg.csail.mit.edu/6.375Processor ReduxIn-order: F < D < E < M < WPipelined W < M < E < D < Ffetchexecute iMemrfCPUdecodememorypcwrite-backdMemhttp://csg.csail.mit.edu/6.375March 9, 2011L11-19http://csg.csail.mit.edu/6.375Concurrency: RegFileThe standard library regfile is implemented using with concurrency (sub < upd)This handles the in-order caseWe need to build a RegisterFile for the pipelined caseMarch 9, 2011L11-20http://csg.csail.mit.edu/6.375BypassRegFilemodule mkBypassRegFile(RegFile#(a,d)) #(d l, d h) provisos#(Bits(a,asz), Bits#(d,dsz));
View Full Document