1Branch PredictionAlex Nizhner, Yang Gu2003-10-12Branch Prediction Overview 2-bit saturating counter (state machine) Correlated Branch Prediction– Global history Index table of state machines by low-order bits of the PC Many variations and approaches2003-10-13Papers Cliff Young, Nicolas Gloy, and Michael D. Smith. A Comparative Analysis of Schemes for Correlated Branch Prediction, Timothy Heil, Zak Smith, and James E Smith. Improving Branch Predictors by Correlating on Data Values, Joel Emer and Nikolas Gloy. A Language for Describing Predictors and its Application to Automatic SynthesisA Comparative Analysis of Schemes for Correlated Branch PredictionCliff Young, Nicolas Gloy, and Michael D.Smith22003-10-15Questions Why does branch prediction scheme A perform better than B? How well does a branch prediction scheme work?2003-10-16A Framework for Brach Prediction}1,0{var}1,0{),,(∈Ζ∈×Ζ∈=diabledirectionandbidentifieredbeexecutionbranchExecution Stream: a sequence of branch executions.Predictor: a simple mechanism that predicts the next direction of a stream. Prediction scheme: a comprehensive mechanism that takes a program execution stream, dividesit into substreams, and directs each substream to a unique predictor.2003-10-17The Keys The selections of the “right” divider; The selections of the “good” predictor.2003-10-18Existing divider mechanisms and their prediction schemes Identity function Per-branch substreamiiiibdbe →=),(32003-10-19Existing divider mechanisms and their prediction schemes (cont) Per-branch global-pattern streams– e.g. GA, gshare Per-branch branch-pattern streams Per-branch global-path streams2003-10-110The benefit of path history2003-10-111Why Static Correlated Prediction Works?2003-10-112Where can we improve:Paths versus Patterns42003-10-113Where can we improve:AliasingAliasing in a 4096 counter table for awk.aWhite squares represent unused counters; black squares represent counters with seven or more aliased streams. Gray scale indicates the degree of aliasing.2003-10-114Where can we improve:Cross-Procedure Correlation2003-10-115Conclusion Path history is slightly better than pattern history in exploiting branch correlation. Correlated dynamic branch prediction schemes utilize more 2-bit counters in their tables, but simultaneously increase the amount of aliasing. Cross procedure correlation limits the accuracy of static branch prediction schemes.A Language for Describing Predictors and its Application to Automatic SynthesisJoel Emer (DEC)Nikolas Gloy (Harvard)52003-10-117DifferencePrevious work: High-level construct from existing blocks Manually specify the predictorHere: Start from Low-level primitives Provide automatic help in the process of specification of the predictor2003-10-118Predictor NotationOnebit[d](PC;T) = P[1,d](PC;T) Where,PC = current program counterT = branch resolution (0->Not Taken, 1->Taken)Counter[n,d](I;T) = P[n,d](I:if T then P+1 else P-1);Examples:2003-10-119Branch Prediction (BP) Language The parser understands– Predictor primitives– Composition of predictors– Logic expressions (XOR, IF)– Functions (SATURATE, MASK_HI) Translation and Simulation Can link to a trace reader2003-10-120Genetic Programming Search Represent expressions and describes predictors in BP language Parse algebraic expressions into tree-like structure Adapt genetic operations on tree node individuals¾Replication¾Crossover¾Mutation¾Encapsulation62003-10-121Adaptation Process1. Create initial population of randomly generated individuals2. Rank fitness of individuals in the population by simulation3. Apply genetic operations to create new generation4. Repeat steps 2 and 32003-10-122Example: Crossover Operation Randomly choose crossover nodes, exchange the subtree2003-10-123Example: Mutation OperationPredictor NodeWidth HeightPredictor Node’Width’Height’XORIFIf it is a predictor node, we change the width and/or height.If it is a function node, we replace it with a different function.2003-10-124Results of Branch Predictors72003-10-125Questions gshare still works better! Results: very deep tree structures, probably not directly implementable!2003-10-126Conclusion Present a new language for describing predictors BP language allows for automatic manipulation, including generating simulators and automated synthesis. Use GP to search the design space for branch prediction.Improving Branch Predictors by Correlating on Data ValuesTimothy H. Heil, Zak Smith, J.E. Smith2003-10-128Why Are Branches Mispredicted ? History register not wide enough!– Can’t see the outcome for the last iteration of a loop of even moderate length Is branch history a good heuristic?– Programs structured to manipulate DATA82003-10-129Value History Data: values of registers used to compute a branch condition code Typically use “ra-rb”– Aside: how would this be done on Alpha? Correlate on data history the same way previous schemes correlate on branch history E.g., accurately predict the last branch of a loop—counter should approach 0.2003-10-130Example gcc front-end Giant switch statement—branch history useless 2003-10-131Condition-setting instructions e.g, cmple ra, rb, rc followed by a beq Maintain a table indexed by branch PC containing the values used for condition code computation “valid” bit set by a subset of instructions2003-10-132The Branch Difference Predictor92003-10-133The Branch Difference Predictor Backing Predictor– Conventional global branch history predictor often does well enough– Fails in “exceptional” cases– Use Rare Event Predictor for all branches previously mis-predicted by BP Rare Event Predictor (REP)– Cache of difficult-to-predict branches– Separate prediction logic (e.g., 2-bit counters) Value history table– Per-branch difference history– (one entry typically enough)– Used for validating REP entries2003-10-134Results 1: history depth? Diminishing returns on increasing history buffer depth2003-10-135Results 2: Performance2003-10-136Discussion Conclusion:– Clearly outperforms branch-history-correlated schemes Improvements?– Branch register values natural choice– What other program state can be captured to aid with
View Full Document