CMSC 611: Advanced Computer Architecture
Pipelining

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides
Some material adapted from Hennessy & Patterson / © 2003 Elsevier Science

Pipeline Hazards

• Cases that affect instruction execution semantics and thus need to be detected and corrected
• Hazard types
  – Structural hazard: attempt to use a resource two different ways at the same time
    • Single memory for instruction and data
  – Data hazard: attempt to use an item before it is ready
    • Instruction depends on result of prior instruction still in the pipeline
  – Control hazard: attempt to make a decision before the condition is evaluated
    • Branch instructions
• Hazards can always be resolved by waiting

Detecting and Resolving Structural Hazard

[Figure: pipeline diagram of instruction order (Load, Instr 1, Instr 2, Stall, Instr 3) against time (clock cycles 1 to 7). Each instruction passes through Ifetch, Reg, ALU, DMem, Reg; the Stall row is a row of bubbles.]

Slide: David Culler

Stalls & Pipeline Performance

\[
\text{Pipelining speedup}
  = \frac{\text{Average instruction time unpipelined}}{\text{Average instruction time pipelined}}
  = \frac{\text{CPI unpipelined}}{\text{CPI pipelined}} \times \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}}
\]

\[
\text{Ideal CPI pipelined} = 1
\]

\[
\text{CPI pipelined} = \text{Ideal CPI} + \text{Pipeline stall cycles per instruction}
  = 1 + \text{Pipeline stall cycles per instruction}
\]

\[
\text{Speedup}
  = \frac{\text{CPI unpipelined}}{1 + \text{Pipeline stall cycles per instruction}} \times \frac{\text{Clock cycle unpipelined}}{\text{Clock cycle pipelined}}
\]
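The speedup relation on the Stalls & Pipeline Performance slide can be checked numerically. The sketch below is illustrative only: the function name `pipeline_speedup` and the sample numbers are assumptions, not part of the course material.

```python
def pipeline_speedup(cpi_unpipelined: float,
                     stall_cycles_per_instr: float,
                     clock_unpipelined: float,
                     clock_pipelined: float) -> float:
    """Speedup of a pipelined machine over an unpipelined one.

    Assumes the ideal pipelined CPI is 1, as stated on the slide, so
    CPI pipelined = 1 + pipeline stall cycles per instruction.
    """
    cpi_pipelined = 1.0 + stall_cycles_per_instr
    return (cpi_unpipelined / cpi_pipelined) * (clock_unpipelined / clock_pipelined)


# Hypothetical numbers: the unpipelined machine has CPI 1 with a clock cycle
# five times longer than the pipelined one, and the pipelined machine loses
# 0.25 cycles per instruction to stalls.
example = pipeline_speedup(1.0, 0.25, 5.0, 1.0)
print(f"speedup = {example:.2f}")
```

With these assumed inputs the speedup works out to 5 / 1.25 = 4, so a quarter of a stall cycle per instruction costs one fifth of the ideal fivefold gain.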
\[
\text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall cycles per instruction}}
\]

Assuming all pipeline stages are balanced.

Data Hazards

[Figure: pipeline diagram of instruction order against time (clock cycles), with stages IF, ID/RF, EX, MEM, WB, for the sequence below.]

  add r1,r2,r3
  sub r4,r1,r3
  and r6,r1,r7
  or  r8,r1,r9
  xor r10,r1,r11

Slide: David Culler

Three Generic Data Hazards

• Read After Write (RAW)
  InstrJ tries to read operand before InstrI writes it

    I: add r1,r2,r3
    J: sub r4,r1,r3

• Caused by a “Data Dependence” (in compiler nomenclature). This hazard results from an actual need for communication.

Three Generic Data Hazards

• Write After Read (WAR)
  InstrJ writes operand before InstrI reads it

    I: sub r4,r1,r3
    J: add r1,r2,r3
    K: mul r6,r1,r7

• Called an “anti-dependence” in compilers.
  – This results from reuse of the name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
  – All instructions take 5 stages, and
  – Reads are always in stage 2, and
  – Writes are always in stage 5

Three Generic Data Hazards

• Write After Write (WAW)
  InstrJ writes operand before InstrI writes it.

    I: mul r1,r4,r3
    J: add r1,r2,r3
    K: sub r6,r1,r7

• Called an “output dependence” in compilers
  – This also results from the reuse of name “r1”.
• Can’t happen in MIPS 5 stage pipeline:
  – All instructions take 5 stages, and
  – Writes are always in stage 5
• Do see WAR and WAW in more complicated pipes

Forwarding to Avoid Data Hazard

[Figure: pipeline diagram of instruction order against time (clock cycles) for the sequence below.]

  add r1,r2,r3
  sub r4,r1,r3
  and r6,r1,r7
  or  r8,r1,r9
  xor r10,r1,r11

HW Change for Forwarding

[Figure: datapath with the ID/EX, EX/MEM, and MEM/WR pipeline registers, the register file, NextPC and Immediate inputs, muxes in front of the ALU, and Data Memory.]

Data Hazard Even with Forwarding

[Figure: pipeline diagram of instruction order against time (clock cycles) for the sequence below.]

  lw  r1, 0(r2)
  sub r4,r1,r6
  and r6,r1,r7
  or  r8,r1,r9

Resolving Load Hazards

• Adding hardware? How? Where?
• Detection?
• Compilation techniques?
• What is the cost of load delays?

Resolving the Load Data Hazard

[Figure: pipeline diagram of instruction order against time (clock cycles) for the sequence below, with a bubble in each of the three instructions that follow the load.]

  lw  r1, 0(r2)
  sub r4,r1,r6
  and r6,r1,r7
  or  r8,r1,r9

How is this different from the instruction issue stall?

Software Scheduling to Avoid Load Hazards

Try producing fast code for

  a = b + c;
  d = e - f;

assuming a, b, c, d, e, and f in memory.

  Slow code:            Fast code:
  LW  Rb,b              LW  Rb,b
  LW  Rc,c              LW  Rc,c
  ADD Ra,Rb,Rc          LW  Re,e
  SW  a,Ra              ADD Ra,Rb,Rc
  LW  Re,e              LW  Rf,f
  LW  Rf,f              SW  a,Ra
  SUB Rd,Re,Rf          SUB Rd,Re,Rf
  SW  d,Rd              SW  d,Rd

Instruction Set Connection

• What is exposed about this organizational hazard in the instruction set?
• k cycle delay?
  – bad, CPI is not part of ISA
• k instruction slot delay
  – load should not be followed by use of the value in the next k instructions
• Nothing, but code can reduce run-time delays
• MIPS did the transformation in the assembler
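The hazard definitions and the slow/fast scheduling example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the course's own tooling: the `Instr` type and the functions `classify` and `load_use_stalls` are hypothetical names, and the stall count assumes a 5-stage pipeline with full forwarding in which a load followed immediately by a use of its result costs exactly one bubble.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    op: str                   # mnemonic, e.g. "LW", "ADD", "SW"
    dest: Optional[str]       # register written, or None (e.g. a store)
    srcs: Tuple[str, ...]     # registers read


def classify(i: Instr, j: Instr) -> Set[str]:
    """Name the dependences of later instruction j on earlier instruction i."""
    kinds: Set[str] = set()
    if i.dest is not None and i.dest in j.srcs:
        kinds.add("RAW")      # j reads what i writes
    if j.dest is not None and j.dest in i.srcs:
        kinds.add("WAR")      # j writes what i reads
    if i.dest is not None and i.dest == j.dest:
        kinds.add("WAW")      # both write the same register
    return kinds


def load_use_stalls(program: List[Instr]) -> int:
    """Count bubbles, assuming one bubble per load immediately followed by a use."""
    return sum(
        1
        for prev, cur in zip(program, program[1:])
        if prev.op == "LW" and prev.dest in cur.srcs
    )


# The slow and fast sequences from the scheduling slide.
slow = [
    Instr("LW", "Rb", ()), Instr("LW", "Rc", ()),
    Instr("ADD", "Ra", ("Rb", "Rc")), Instr("SW", None, ("Ra",)),
    Instr("LW", "Re", ()), Instr("LW", "Rf", ()),
    Instr("SUB", "Rd", ("Re", "Rf")), Instr("SW", None, ("Rd",)),
]
fast = [
    Instr("LW", "Rb", ()), Instr("LW", "Rc", ()), Instr("LW", "Re", ()),
    Instr("ADD", "Ra", ("Rb", "Rc")), Instr("LW", "Rf", ()),
    Instr("SW", None, ("Ra",)), Instr("SUB", "Rd", ("Re", "Rf")),
    Instr("SW", None, ("Rd",)),
]

print("slow stalls:", load_use_stalls(slow))
print("fast stalls:", load_use_stalls(fast))
```

Under this model the slow sequence has two load-use pairs (`LW Rc` then `ADD`, and `LW Rf` then `SUB`), while the fast sequence has none, because each load is separated from its first use by an independent instruction. The model treats a store of a just-loaded value as a stall too, which a real design might avoid with extra forwarding.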



UMBC CMSC 611 - Pipelining
