DOC PREVIEW
Berkeley COMPSCI 152 - Lecture Notes

This preview shows page 1-2-3-4-5 out of 15 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 152 Computer Architecture andEngineering Lecture 14 - Branch PredictionKrste AsanovicElectrical Engineering and Computer SciencesUniversity of California at Berkeleyhttp://www.eecs.berkeley.edu/~krstehttp://inst.eecs.berkeley.e du/~cs1523/20/2008 CS152-Spring!082Last time in Lecture 13• Register renaming removes WAR, WAW hazards bygiving a new internal destination register for everynew result• Pipeline is structured with in-orderfetch/decode/rename, followed by out-of-orderexecution/complete, followed by in-order commit• At commit time, can detect exceptions and roll backbuffer to provide precise interrupts3/20/2008 CS152-Spring!083Recap: Overall Pipeline Structure• Instructions fetched and decoded into instruction reorder buffer in-order• Execution is out-of-order ( ! out-of-order completion)• Commit (write-back to architectural state, i.e., regfile & memory) is in-orderTemporary storage needed to hold results before commit(shadow registers and store buffers)Fetch DecodeExecuteCommitReorder BufferIn-order In-orderOut-of-orderKillKillKillException?Inject handler PC3/20/2008 CS152-Spring!084I-cacheFetchBufferIssueBufferFunc.UnitsArch.StateExecuteDecodeResultBufferCommitPCFetchBranchexecutedNext fetchstartedModern processors may have> 10 pipeline stages betweennext PC calculation and branchresolution !Control Flow PenaltyHow much work is lost ifpipeline doesn’t followcorrect instruction flow?~ Loop length x pipeline width3/20/2008 CS152-Spring!085Instruction Taken known? Target known?JJRBEQZ/BNEZMIPS Branches and JumpsEach instruction fetch depends on one or two piecesof information from the preceding instruction:1) Is the preceding instruction a taken branch?2) If so, what is the target address?After Inst. DecodeAfter Inst. Decode After Inst. DecodeAfter Inst. Decode After Reg. FetchAfter Reg. Fetch**Assuming zero detect on register read3/20/2008 CS152-Spring!086Branch Penalties in Modern PipelinesA PC Generation/MuxP Instruction Fetch Stage 1F Instruction Fetch Stage 2B Branch Address Calc/Begin DecodeI Complete DecodeJ Steer Instructions to Functional unitsR Register File ReadE Integer ExecuteRemainder of execute pipeline (+ another 6 stages)UltraSPARC-III instruction fetch pipeline stages(in-order issue, 4-way superscalar, 750MHz, 2000)BranchTargetAddressKnownBranchDirection &JumpRegisterTargetKnown3/20/2008 CS152-Spring!087Reducing Control Flow PenaltySoftware solutions• Eliminate branches - loop unrollingIncreases the run length• Reduce resolution time - instruction schedulingCompute the branch condition as earlyas possible (of limited value)Hardware solutions• Find something else to do - delay slotsReplaces pipeline bubbles with useful work(requires software cooperation)• Speculate - branch predictionSpeculative execution of instructions beyondthe branch3/20/2008 CS152-Spring!088Branch PredictionMotivation:Branch penalties limit performance of deeply pipelinedprocessorsModern branch predictors have high accuracy(>95%) and can reduce branch penalties significantlyRequired hardware support:Prediction structures:• Branch history tables, branch target buffers, etc.Mispredict recovery mechanisms:• Keep result computation separate from commit• Kill instructions following branch in pipeline• Restore state to state following branch3/20/2008 CS152-Spring!089Static Branch PredictionOverall probability a branch is taken is ~60-70% but:ISA can attach preferred direction semantics to branches,e.g., Motorola MC88110bne0 (preferred taken) beq0 (not taken)ISA can allow arbitrary choice of statically predicteddirection, e.g., HP PA-RISC, Intel IA-64 typically reported as ~80% accurateJZJZbackward90%forward50%3/20/2008 CS152-Spring!0810Dynamic Branch Predictionlearning based on past behaviorTemporal correlationThe way a branch resolves may be a goodpredictor of the way it will resolve at the nextexecutionSpatial correlationSeveral branches may resolve in a highlycorrelated manner (a preferred path ofexecution)3/20/2008 CS152-Spring!0811• Assume 2 BP bits per instruction• Change the prediction after two consecutive mistakes!¬takewrongtaken¬ takentakentakentaken¬takerighttakerighttakewrong¬ taken¬ taken¬ takenBP state:(predict take/¬take) x (last prediction right/wrong)Branch Prediction Bits3/20/2008 CS152-Spring!0812Branch History Table4K-entry BHT, 2 bits/entry, ~80-90% correct predictions0 0Fetch PCBranch?Target PC+I-CacheOpcode offsetInstructionkBHT Index2k-entryBHT,2 bits/entryTaken/¬Taken?3/20/2008 CS152-Spring!0813Exploiting Spatial CorrelationYeh and Patt, 1992History register, H, records the direction of the lastN branches executed by the processorif (x[i] < 7) theny += 1;if (x[i] < 5) thenc -= 4;If first condition false, second condition also false3/20/2008 CS152-Spring!0814Two-Level Branch PredictorPentium Pro uses the result from the last two branchesto select one of the four sets of BHT bits (~95% correct)0 0kFetch PCShift inTaken/¬Takenresults of eachbranch2-bit global branchhistory shift registerTaken/¬Taken?3/20/2008 CS152-Spring!0815Limitations of BHTsOnly predicts branch direction. Therefore, cannot redirectfetch stream until after branch target is determined.UltraSPARC-III fetch pipelineCorrectlypredictedtaken branchpenaltyJump RegisterpenaltyA PC Generation/MuxP Instruction Fetch Stage 1F Instruction Fetch Stage 2B Branch Address Calc/Begin DecodeI Complete DecodeJ Steer Instructions to Functional unitsR Register File ReadE Integer ExecuteRemainder of execute pipeline(+ another 6 stages)3/20/2008 CS152-Spring!0816Branch Target BufferBP bits are stored with the predicted target address.IF stage: If (BP=taken) then nPC=target else nPC=PC+4later: check prediction, if wrong then kill the instruction and update BTB & BPb else update BPbIMEMPCBranch Target Buffer (2k entries)kBPbpredictedtarget BP target3/20/2008 CS152-Spring!0817Address CollisionsWhat will be fetched after the instruction at 1028?BTB prediction = Correct target =! Assume a 128-entry BTBBPbtargettake2361028 Add .....132 Jump 100InstructionMemory2361032kill PC=236 and fetch PC=1032Is this a common occurrence?Can we avoid these bubbles?3/20/2008 CS152-Spring!0818BTB is only for Control InstructionsBTB contains useful information for branch andjump instructions only! Do not update it for other instructionsFor all other instructions the next PC is PC+4 !How to achieve this effect without decoding theinstruction?3/20/2008


View Full Document

Berkeley COMPSCI 152 - Lecture Notes

Documents in this Course
Quiz 5

Quiz 5

9 pages

Memory

Memory

29 pages

Quiz 5

Quiz 5

15 pages

Memory

Memory

29 pages

Memory

Memory

35 pages

Memory

Memory

15 pages

Quiz

Quiz

6 pages

Midterm 1

Midterm 1

20 pages

Quiz

Quiz

12 pages

Memory

Memory

33 pages

Quiz

Quiz

6 pages

Homework

Homework

19 pages

Quiz

Quiz

5 pages

Memory

Memory

15 pages

Load more
Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?