DOC PREVIEW
UT CS 429H - Pipelining V

This preview shows page 1-2-3-4 out of 12 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Pipelining VTopicsTopics Branch prediction State machine designSystems I2Branch PredictionUntil now - we have assumed a Until now - we have assumed a ““predict takenpredict taken”” strategy strategyfor conditional branchesfor conditional branches Compute new branch target and begin fetching from there If prediction is incorrect, flush pipeline and begin refetchingHowever, there are other strategiesHowever, there are other strategies Predict not-taken Combination (quasi-static) Predict taken if branch backward (like a loop) Predict not taken if branch forward3Branching StructuresPredict not taken works well for Predict not taken works well for ““top of the looptop of the loop””branching structuresbranching structuresLoop: cmpl %eax, %edx je Out 1nd loop instr . . last loop instr jmp LoopOut: fall out instr But such loops have jumps atthe bottom of the loop to returnto the top of the loop – andincur the jump stall overheadPredict not taken doesnPredict not taken doesnʼʼt work well for t work well for ““bottom of thebottom of thelooploop”” branching structures branching structuresLoop: 1st loop instr 2nd loop instr . . last loop instr cmpl %eax, %edx jne Loop fall out instr4Branch Prediction AlgorithmsStatic Branch PredictionStatic Branch Prediction Prediction (taken/not-taken) either assumed or encoded intoprogramDynamic Branch PredictionDynamic Branch Prediction Uses forms of machine learning (in hardware) to predictbranches Track branch behavior Past history of individual branches Learn branch biases Learn patterns and correlations between different branches Can be very accurate (95% plus) as compared to less than90% for static5IMPCBHTIRPredictionupdateSimple Dynamic PredictorPredict branch based on pastPredict branch based on pasthistory of branchhistory of branchBranch history tableBranch history table Indexed by PC (or fraction ofit) Each entry stores lastdirection that indexedbranch went (1 bit to encodetaken/not-taken) Table is a cache of recentbranches Buffer size of 4096 entriesare common (track 4Kdifferent branches)6Multi-bit predictorsA A ʻʻpredict same as lastpredict same as lastʼʼ strategy strategygets two mispredicts on each loopgets two mispredicts on each loop Predict NTTT…TTT Actual TTTT…TTNCan do much better by addingCan do much better by addinginertiainertia to the predictor to the predictor e.g., two-bit saturating counter Predict TTTT…TTT Use two bits to encode: Strongly taken (T2) Weakly taken (T1) Weakly not-taken (N1) Strongly not-taken (N2)for(j=0;j<30;j++) {for(j=0;j<30;j++) {……}}N2 N1 T1 T2T T TTNN N NState diagram to representing states and transitions7How do we build this in Hardware?This is a sequential logic circuit that can be formulated as a stateThis is a sequential logic circuit that can be formulated as a statemachinemachine 4 states (N2, N1, T1, T2) Transitions between the states based on action “b”General form of state machine:General form of state machine:N2 N1 T1 T2T T TTNN N NState Variables(Flip-flops)Comb.Logicinputs outputs8State Machine for Branch Predictor4 states - can encode in two state bits <S1, S0>4 states - can encode in two state bits <S1, S0> N2 = 00, N1 = 01, T1 = 10, T2 = 11 Thus we only need 2 storage bits (flip-flops in last slide)Input: b = 1 if last branch was taken, 0 if not takenInput: b = 1 if last branch was taken, 0 if not takenOutput:Output: pp = 1 if predict taken, 0 if predict not taken= 1 if predict taken, 0 if predict not takenNow - we just need combinational logic equations for:Now - we just need combinational logic equations for: p, S1new, S0new, based on b, S1, S09Combinational logic for statemachinep =1 if statep =1 if state isis T2 or T1T2 or T1thus p =thus p = S1 (according toS1 (according toencodings)encodings)The state variables S1, S0The state variables S1, S0are governed by the truthare governed by the truthtable that implements thetable that implements thestate diagramstate diagram S1new = S1*S0 + S1*b + S0*b S0new = S1*S0ʼ + S0ʼ*S1ʼ*b +S0*S1*b1111111100000000pp1100111100001100S0S0newnew1111111100000000S1S1111111110011111100000000S1S1newnewbbS0S011111100001100110000000010IMPCBHTIRPredictionupdateEnhanced Dynamic PredictorReplace simple table of 1 bitReplace simple table of 1 bithistories with table of 2 bit statehistories with table of 2 bit statebitsbitsState transition logic can beState transition logic can beshared across all entries inshared across all entries intabletable Read entry out Apply combinational logic Write updated state bitsback into table11YMSBPYet more sophisticated branch predictorsYet more sophisticated branch predictorsPredictors that recognize patternsPredictors that recognize patterns eg. if last three instances of a given branches were NTN, thenpredict takenPredictors that correlate between multiple branchesPredictors that correlate between multiple branches eg. if the last three instances of any branch were NTN, then predicttakenPredictors that correlatePredictors that correlate weight differentweight different past branches differentlypast branches differently e.g. if the branches 1, 4, and 8 ago were NTN, then predict takenHybrid predictors that are composed of multiple differentHybrid predictors that are composed of multiple differentpredictorspredictors e.g. two different predictors run in parallel and a third predictorpredicts which one to useMore sophisticated learning algorithmsMore sophisticated learning algorithms12SummaryTodayToday Branch mispredictions cost a lot in performance CPU Designers willing to go to great lengths to improveprediction accuracy Predictors are just state machines that can be designedusing combinational logic and


View Full Document

UT CS 429H - Pipelining V

Download Pipelining V
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Pipelining V and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Pipelining V 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?