Slide titles:
  Review: Datapath with Data Hazard Control
  Control Hazards
  Slide 4
  Jumps Incur One Stall
  Supporting ID Stage Jumps
  Two "Types" of Stalls
  Review: Branches Incur Three Stalls
  Moving Branch Decisions Earlier in Pipe
  ID Branch Forwarding Issues
  ID Branch Forwarding Issues, cont'd
  Supporting ID Stage Branches
  Delayed Decision
  Scheduling Branch Delay Slots
  Static Branch Prediction
  Slide 17
  Branching Structures
  Static Branch Prediction, cont'd
  Dynamic Branch Prediction
  Branch Target Buffer
  1-bit Prediction Accuracy
  Slide 24
  Dealing with Exceptions
  Two Types of Exceptions
  Slide 28
  Slide 30
  Additions to MIPS to Handle Exceptions (Fig 6.42)
  Summary

Control Hazards.1
Review: Datapath with Data Hazard Control

[Figure: pipelined datapath with PC, Instruction Memory, Register File, Sign Extend, ALU, and Data Memory; pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB; Control, ALU cntrl, Branch/PCSrc, Forward Unit, and Hazard Unit. Hazard Unit signals shown: ID/EX.RegisterRt, ID/EX.MemRead, PC.Write, IF/ID.Write.]

Control Hazards.2
Control Hazards

- Occur when the flow of instruction addresses is not sequential (i.e., not PC = PC + 4); incurred by change-of-flow instructions
  - Conditional branches (beq, bne)
  - Unconditional branches (j, jal, jr)
  - Exceptions
- Possible approaches
  - Stall (impacts CPI)
  - Move the decision point as early in the pipeline as possible, thereby reducing the number of stall cycles
  - Delay the decision (requires compiler support)
  - Predict and hope for the best!
- Control hazards occur less frequently than data hazards, but there is nothing as effective against control hazards as forwarding is for data hazards

Control Hazards.4
Datapath Branch and Jump Hardware

[Figure: the pipelined datapath with branch and jump hardware added: Branch/PCSrc, Shift left 2 and Add for the branch target, and Shift left 2 combined with PC+4[31-28] for the Jump target.]

Control Hazards.5
Jumps Incur One Stall

[Figure: pipeline diagram in instruction order showing j, one flushed instruction, and j target.]

- Jumps are not decoded until ID, so one flush is needed
- Fix the jump hazard by waiting (stalling), but this affects CPI
- Fortunately, jumps are very infrequent: only 3% of the SPECint instruction mix

Control Hazards.6
Supporting ID Stage Jumps

[Figure: the datapath with the Jump target (Shift left 2, PC+4[31-28]) selected in the ID stage and a 0 input used to flush the instruction that follows the jump.]

Control Hazards.7
Two "Types" of Stalls

- Noop instruction (or bubble) inserted between two instructions in the pipeline (as done for load-use situations)
  - Keep the instructions earlier in the pipeline (later in the code) from progressing down the pipeline for a cycle ("bounce" them in place with write control signals)
  - Insert the noop by zeroing control bits in the pipeline register at the appropriate stage
  - Let the instructions later in the pipeline (earlier in the code) progress normally down the pipeline
- Flushes (or instruction squashing), where an instruction in the pipeline is replaced with a noop instruction (as done for instructions located sequentially after j instructions)
  - Zero the control bits for the instruction to be flushed

Control Hazards.8
Review: Branches Incur Three Stalls

[Figure: pipeline diagram in instruction order showing beq, three flushed instructions, and beq target.]

- Fix the branch hazard by waiting (stalling), but this affects CPI

Control Hazards.9
Moving Branch Decisions Earlier in Pipe

- Move the branch decision hardware back to the EX stage
  - Reduces the number of stall (flush) cycles to two
  - Adds an AND gate and a 2x1 mux to the EX timing path
- Add hardware to compute the branch target address and evaluate the branch decision in the ID stage
  - Reduces the number of stall (flush) cycles to one (as with jumps)
    - But now forwarding hardware must be added in the ID stage
  - Computing the branch target address can be done in parallel with the RegFile read (done for all instructions, only used when needed)
  - Comparing the registers can't be done until after the RegFile read, so comparing and updating the PC adds a mux, a comparator, and an AND gate to the ID timing path
- For deeper pipelines, branch decision points can be even later in the pipeline, incurring more stalls

Control Hazards.10
ID Branch Forwarding Issues

- MEM/WB "forwarding" is taken care of by the normal RegFile write-before-read operation

    WB   add3 $1,
    MEM  add2 $3,
    EX   add1 $4,
    ID   beq $1,$2,Loop
    IF   next_seq_instr

- Need to forward from the EX/MEM pipeline stage to the ID comparison hardware for cases like

    WB   add3 $3,
    MEM  add2 $1,
    EX   add1 $4,
    ID   beq $1,$2,Loop
    IF   next_seq_instr

    if (IDcontrol.Branch
        and (EX/MEM.RegisterRd != 0)
        and (EX/MEM.RegisterRd = IF/ID.RegisterRs))
            ForwardC = 1
    if (IDcontrol.Branch
        and (EX/MEM.RegisterRd != 0)
        and (EX/MEM.RegisterRd = IF/ID.RegisterRt))
            ForwardD = 1

- This forwards the result from the second previous instruction to either input of the compare

Control Hazards.11
ID Branch Forwarding Issues, cont'd

- If the instruction immediately before the branch produces one of the branch source operands, then a stall needs to be inserted (between the beq and add1), since the EX stage ALU operation is occurring at the same time as the ID stage branch compare operation

    WB   add3 $3,
    MEM  add2 $4,
    EX   add1 $1,
    ID   beq $1,$2,Loop
    IF   next_seq_instr

  - "Bounce" the beq (in ID) and next_seq_instr (in IF) in place (the ID Hazard Unit deasserts PC.Write and IF/ID.Write)
  - Insert a stall between the add in the EX stage and the beq in the ID stage by zeroing the control bits going into the ID/EX pipeline register (done by the ID Hazard Unit)
- If the branch is found to be taken, then flush the instruction currently in IF (IF.Flush)

Control Hazards.12
Supporting ID Stage Branches

[Figure: the datapath with branch support in the ID stage: a Compare unit on the RegFile outputs, an Add for the branch target, a second Forward Unit feeding the compare, the Hazard Unit, and the IF.Flush signal.]

Control Hazards.13
Delayed Decision

- If the branch hardware has been moved to the ID stage, then we can eliminate all branch stalls with delayed branches, which are defined as always executing the next sequential instruction after the branch instruction; the branch takes effect after that next instruction
- MIPS compiler moves an instruction to immediately
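
The ID-stage branch forwarding and stall conditions from the "ID Branch Forwarding Issues" slides can be sketched in Python as a quick sanity check. This is only an illustrative model, not the slides' hardware: the `IdBranchInputs` container and the function names are hypothetical, and the stall condition (including the `id_ex_rd` and `id_ex_reg_write` fields) is an assumption derived from the prose, since the slides give an explicit formula only for ForwardC and ForwardD.

```python
from dataclasses import dataclass


@dataclass
class IdBranchInputs:
    """Hypothetical bundle of the signals the ID-stage logic looks at."""
    branch: bool           # IDcontrol.Branch
    if_id_rs: int          # IF/ID.RegisterRs
    if_id_rt: int          # IF/ID.RegisterRt
    ex_mem_rd: int         # EX/MEM.RegisterRd
    id_ex_rd: int          # assumed: destination register of the instr in EX
    id_ex_reg_write: bool  # assumed: the instr in EX writes a register


def forward_signals(s: IdBranchInputs) -> tuple[int, int]:
    """Return (ForwardC, ForwardD) using the two conditions on the slide."""
    hit = s.branch and s.ex_mem_rd != 0
    forward_c = int(hit and s.ex_mem_rd == s.if_id_rs)
    forward_d = int(hit and s.ex_mem_rd == s.if_id_rt)
    return forward_c, forward_d


def must_stall(s: IdBranchInputs) -> bool:
    """Assumed stall rule: the instruction in EX produces a branch operand."""
    return (
        s.branch
        and s.id_ex_reg_write
        and s.id_ex_rd != 0
        and s.id_ex_rd in (s.if_id_rs, s.if_id_rt)
    )


if __name__ == "__main__":
    # beq $1,$2,Loop with add2 $1 in MEM and add1 $4 in EX: forward, no stall.
    case_forward = IdBranchInputs(True, 1, 2, ex_mem_rd=1, id_ex_rd=4,
                                  id_ex_reg_write=True)
    # beq $1,$2,Loop with add1 $1 in EX: the beq must be bounced for a cycle.
    case_stall = IdBranchInputs(True, 1, 2, ex_mem_rd=4, id_ex_rd=1,
                                id_ex_reg_write=True)
    print(forward_signals(case_forward), must_stall(case_forward))
    print(forward_signals(case_stall), must_stall(case_stall))
```

In the stall case a real Hazard Unit would deassert PC.Write and IF/ID.Write and zero the control bits entering ID/EX, as the slide describes; the sketch only reports whether that cycle is needed.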