New version page

LOOP TERMINATION PREDICTION

Upgrade to remove ads

This preview shows page 1-2-3-4-5 out of 14 pages.

Save
View Full Document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience

Upgrade to remove ads
Unformatted text preview:

In Proceedings of the 3rd International Symposium on High Performance Computing (ISHPC2K), October 2000, (c)Springer-Verlag.Loop Termination PredictionTimothy Sherwood Brad CalderDepartment of Computer Science and EngineeringUniversity of California, San Diegofsherwood,[email protected] pipelined high performance processors require highly accurate branch prediction to drivetheir instruction fetch. However there remains a class of events which are not easily predictable by stan-dard two level predictors. One such event is loop termination. In deeply nested loops, loop terminationscan account for a significant amount of the mispredictions. We propose two techniques for dealing withloop terminations. A simple hardware extension to existing prediction architectures called Loop Termi-nation Prediction is presented, which captures the long regular repeating patterns of loops. In addition, asoftware technique called Branch Splitting is examined, which breaks loops with iteration counts abovethe detection of current predictors into smaller loops that may be effectively captured. Our results showthat for many programs adding a small loop termination buffer can reduce the missprediction rate by upto a difference of 2%.1 IntroductionBranch prediction is the architectural feature which allows the front-end of the processor to continue fetchinginstructions in the presence of control flow changes. Branch prediction predicts the directions of branchesduring fetch, so the fetch engine knows which cache block to fetch from in the next cycle. When the wrongdirection is predicted, the whole pipeline following the branch must be flushed, significantly impactingperformance.Accurate branch prediction uses the past history of branches to predict the future behavior of a branch.Early branch prediction architectures used an N-bit saturating counter to predict the direction of eachbranch [11]. Using only an N-bit counter accurately predicts branches which are biased in either a taken ornot-taken direction, but looses prediction accuracy if the branch exhibits a more complex pattern. To capturethese patterns local and global branch history prediction were proposed [16, 14].Local branch history can be used to accurately predict a branch by storing the last L directions for agiven branch, and using this to index into a 2nd level Pattern History Table (PHT) [16, 14]. The PHT is atable of N-bit saturating counters used to predict the direction of the branch. Local branch history increasesprediction accuracy by capturing arbitrary patterns that are repeated within the last L times the branch wasexecuted.Global branch history uses correlation information between branches to accurately predict a branch’soutcome. A history of the last G executed branch directions are kept track of in a global history register.12 LOOP TERMINATION PREDICTION2The global history register is then used to index into a pattern history table of 2-bit counters to predict thebranch direction.It is hard for local and global histories to capture loop terminations of the pattern((1)N0)m, where 1represents a taken branch and a 0 represents a fall-through branch. These patterns cannot be captured by alocal history predictor if N is larger than L (the local history size). Loop termination can only be capturedusing global history if N is smaller than G1or there is a unique branching sequence right before the looptermination. For the programs examined in this study, 43% of the executed branches are loop branches, and7% of the executed loop branches are mispredicted on average.In this paper we propose to use a Loop Termination Buffer (LTB) to predict branch patterns of type((1)N0)m. The LTB keeps track of branches with this behavior, and predicts when the pattern changes(terminates). This allows us to achieve up to 100% prediction accuracy for loop branches after a short warmup period.In addition, we examine the potential of using a software approach called Branch Splitting to correctlypredict loop branches. Local branch history can only accurately predict loop terminations for branches thatexecute less than L times, where L is the number of bits used for the local history. For loops that have aniteration count larger than L, the loop guarding branch can be split into two or more branches all whichhave an interaction count less than L, as long as the product of new iteration counts equals the old iterationcount. The new branches’ patterns will then fit into local history and will have all of their backwards andtermination behavior accurately predicted.The rest of the paper is organized as follows. Section 2 describes Loop Termination Prediction and ourLoop Termination Buffer implementation. Section 3 describes Branch Splitting. Simulation methodologyis described in section 4. Section 5 evaluates the performance of Loop Termination Prediction and BranchSplitting. Section 6 describes prior research for loop termination prediction. Finally, we summarize ourcontributions in section 7.2 Loop Termination PredictionTraditionally loops are thought of as the steady state operation of execution, and at first glance loop ter-mination seems to account for only a small portion of the total amount of branches seen. However, this isan important part of a branch prediction architecture, because a regular loop has the miss rate which is theinverse of it’s iteration count. Since the branch prediction accuracy of most processors is already in the highnineties, loop branch mispredictions can account for a large fraction of the remaining branch mispredictions.In this section, we propose using a very simple architecture extension which can predict how many timesa loop branch will iterate to provide loop termination prediction.2.1 Predicting Loop TerminationTo allow instruction fetch to continue without stalling, each cycle a traditional branch prediction architecturepredicts (1) if the current instruction fetch contains a branch, (2) the direction of the branch, and (3) providesthe branch target address to be used in the I-cache fetch in the next cycle. This prediction is typicallyperformed given only the current instruction cache fetch address each cycle.The goal of our research is to predict when a loop branch terminates its looping behavior. Therefore, inaddition to the above prediction information, during branch prediction we must (4) determine that the branchis a loop branch, and (5) predict if it has terminated or not. In order to provide


Download LOOP TERMINATION PREDICTION
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view LOOP TERMINATION PREDICTION and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view LOOP TERMINATION PREDICTION 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?