Slide 1Elastic pipeline Use FIFOs instead of pipeline registersSlide 3Guarded Atomic Actions (GAA): Execution modelsome insight into Concurrent rule firingParallel execution reorders reads and writesCorrectnessSlide 8Rule: As a State TransformerExecuting Multiple Rules Per Cycle: Conflict-free rulesMutually Exclusive RulesSlide 12Compiler determines if two rules can be executed in parallelConflicting rulesThe compiler issueConcurrency in Elastic pipelineSlide 17One-Element FIFOTwo-Element FIFOTwo-Element FIFO AnalysisTwo-Element FIFO Analysis cont.Two-Element FIFO a “more optimized” versionSlide 23RWire to the rescueOne-Element Pipeline FIFOOne-Element Pipeline FIFO AnalysisSolution- Config registers Lie a littleOne-Element Pipeline FIFO A correct solutionFIFOsAn aside Unsafe modulesTakeawaySlide 32Elastic Pipelines and Basics of Multi-rule Systems Arvind Computer Science & Artificial Intelligence LabMassachusetts Institute of TechnologyFebruary 16, 2011 L05-1http://csg.csail.mit.edu/6.375Elastic pipelineUse FIFOs instead of pipeline registersxfifo1inQf1 f2 f3fifo2 outQrule stage1 (True); fifo1.enq(f1(inQ.first()); inQ.deq(); endrulerule stage2 (True); fifo2.enq(f2(fifo1.first()); fifo1.deq(); endrulerule stage3 (True); outQ.enq(f3(fifo2.first()); fifo2.deq(); endruleCan all three rules fire concurrently?February 16, 2011L05-2http://csg.csail.mit.edu/6.375Concurrency analysis and rule schedulingFebruary 16, 2011 L05-3http://csg.csail.mit.edu/6.375Guarded Atomic Actions (GAA):Execution modelRepeatedly:Select a rule to execute Compute the state updates Make the state updatesHighly non-deterministicImplementation concern: Schedule multiple rules concurrently without violating one-rule-at-a-time semanticsUser annotations can help in rule selectionFebruary 16, 2011L05-4http://csg.csail.mit.edu/6.375some insight intoConcurrent rule firingThere are more intermediate states in the rule semantics (a state after each rule step) In the HW, states change only at clock edges RulesHWRi Rj RkclocksrulestepsRiRjRkFebruary 16, 2011L05-5http://csg.csail.mit.edu/6.375Parallel executionreorders reads and writesIn the rule semantics, each rule sees (reads) the effects (writes) of previous rules In the HW, rules only see the effects from previous clocks, and only affect subsequent clocksRulesHWclocksrulestepsreads writes reads writes reads writesreads writesreads writesreads writes reads writesFebruary 16, 2011L05-6http://csg.csail.mit.edu/6.375CorrectnessRules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution Consequence: the HW can never reach a state unexpected in the rule semanticsRulesHWRi Rj RkclocksrulestepsRiRjRkFebruary 16, 2011L05-7http://csg.csail.mit.edu/6.375A compiler can determine if two rules can be executed in parallel without violating the one-rule-at-a-time semanticsJames Hoe, Ph.D., 2000February 16, 2011 L05-8http://csg.csail.mit.edu/6.375Rule: As a State TransformerA rule may be decomposed into two parts p(s) and d(s) such thatsnext = if p(s) then d(s) else sp(s) is the condition (predicate) of the rule, a.k.a. the “CAN_FIRE” signal of the rule. p is a conjunction of explicit and implicit conditionsd(s) is the “state transformation” function, i.e., computes the next-state values from the current state valuesFebruary 16, 2011L05-9http://csg.csail.mit.edu/6.375Executing Multiple Rules Per Cycle: Conflict-free rulesParallel execution behaves like ra < rb or equivalently rb < rarule ra (z > 10); x <= x + 1; endrulerule rb (z > 20); y <= y + 2; endrulerule ra_rb; if (z>10) then x <= x+1; if (z>20) then y <= y+2; endruleParallel Execution can also be understood in terms of a composite ruleRulea and Ruleb are conflict-free ifs . pa(s) pb(s) 1. pa(db(s)) pb(da(s)) 2. da(db(s)) == db(da(s)) February 16, 2011L05-10http://csg.csail.mit.edu/6.375Mutually Exclusive RulesRulea and Ruleb are mutually exclusive if they can never be enabled simultaneouslys . pa(s) ~ pb(s) Mutually-exclusive rules are Conflict-free by definitionFebruary 16, 2011L05-11http://csg.csail.mit.edu/6.375Executing Multiple Rules Per Cycle: Sequentially Composable rulesrule ra (z > 10); x <= y + 1; endrulerule rb (z > 20); y <= y + 2; endruleParallel execution behaves like ra < rbParallel Execution can also be understood in terms of a composite ruleRulea and Ruleb are sequentially composable ifs . pa(s) pb(s) 1. pb(da(s)) 2. PrjR(rb)(db(s)) == PrjR(rb)(db(da(s)))rule ra_rb; if (z>10) then x <= y+1; if (z>20) then y <= y+2; endrule- R(rb) is the range of rule rb- Prjst is the projection selecting st from the total stateFebruary 16, 2011L05-12http://csg.csail.mit.edu/6.375Compiler determines if two rules can be executed in parallelRulea and Ruleb are sequentially composable ifs . pa(s) pb(s) 1. pb(da(s)) 2. PrjR(Rb)(db(s)) == PrjR(Rb)(db(da(s)))Rulea and Ruleb are conflict-free ifs . pa(s) pb(s) 1. pa(db(s)) pb(da(s))2. da(db(s)) == db(da(s)) These properties can be determined by examining the domains and ranges of the rules in a pairwise manner.Parallel execution of CF and SC rules does not increase the critical path delay D(Ra) R(Rb) = D(Rb) R(Ra) = R(Ra) R(Rb) = D(Rb) R(Ra) = These conditions are sufficient but not necessaryFebruary 16, 2011L05-13http://csg.csail.mit.edu/6.375Conflicting rulesConcurrent execution of these can produce x=1 and y=2 but these values cannot be produced by any sequential executionra followed by rb would produce x=1 and y=3rb followed by ra would produce x=3 and y=2Such rules must be executed one-by-one and not concurrentlyrule ra (True); x <= y + 1; endrulerule rb (True); y <= x + 2; endruleAssume x and y are initially zeroFebruary 16, 2011L05-14http://csg.csail.mit.edu/6.375The compiler issueCan the compiler detect all the conflicting conditions?Important for correctnessDoes the compiler detect conflicts that do not exist in reality?False positives lower the performanceThe main reason is that sometimes the compiler cannot detect under what conditions the two rules are mutually exclusive or conflict freeWhat can the user specify easily?Rule priorities to resolve nondeterministic choiceyesyesIn many situations the correctness of the design is not enough; the design is not done unless the performance goals are metFebruary 16, 2011L05-15http://csg.csail.mit.edu/6.375Concurrency in Elastic pipelinexfifo1inQf1
View Full Document