DOC PREVIEW
CMU CS 15213 - mpr-branchpredict

This preview shows page 1-2 out of 5 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

MICROPROCESSOR REPORT New Algorithm Improves Branch Prediction Better Accuracy Required for Highly Superscalar Designs by Linley Gwennap Intel s P6 processor see 090202 PDF is the first to use a two level branch prediction algorithm to improve accuracy This algorithm first published by Tse Yu Yeh and Yale Patt has the potential to push accuracy well beyond the 90 level achieved by the best processors today As future processors look to improve performance by increasing the issue rate and or extending the pipeline depth the two level algorithm is likely to become more common Branch prediction has been a problem for CPU designers since the advent of pipelining A pipelined processor must fetch the next instruction before the current one has executed If the current instruction is a conditional branch the processor must decide whether to fetch from the target address assuming the branch will be taken or from the next sequential address assuming the branch will not be taken An incorrect guess causes the pipeline to stall until it is refilled with valid instructions this delay is called the mispredicted branch penalty Processors with a simple five stage pipeline typically have a two cycle branch penalty For a four way superscalar design however this could mean a loss of eight instructions If the pipeline is extended the branch penalty usually increases resulting in the loss of even more instructions Since programs typically encounter branches every 4 6 instructions inaccurate branch prediction causes a severe performance degradation in highly superscalar or deeply pipelined designs Initial efforts at branch prediction used simple algorithms based on the direction of the branch Among commercial microprocessors the MIPS R6000 pioneered the use of compiler hints to direct branch prediction Digital s 21064 was the first microprocessor to store branch history information with the P6 leading the way to two level prediction This article reviews these earlier algorithms before explaining the new two level method in more detail Simple Hardware Can Achieve 65 For scalar processors with relatively short pipelines branch prediction is less of a concern In fact for processors with a branch delay slot the branch penalty can be as little as one cycle The default prediction method for simple pipelined designs is to assume that branches are not taken always fetching sequential instructions The 486 and most embedded processors use this scheme because of its simplicity and low cost New Algorithm Improves Branch Prediction It turns out however that conditional branches are taken more often than not Most programs make heavy use of loops which repeatedly branch to the same address Simulations show that conditional branches are taken about 60 of the time in the SPECint89 suite and more often in scientific code such as the SPECfp89 benchmarks 1 Thus a simple optimization is to always predict branches to be taken A better algorithm takes into account the direction of the branch Backward branches typically complete loop iterations and thus are taken as much as 80 of the time or more Forward branches are more difficult to predict but tend to be not taken more often than taken Thus by simply looking at the direction of the branch usually available as the sign bit of the offset a processor can predict backward branches taken and forward branches not taken This BTFN algorithm succeeds about 65 of the time for SPECint89 MicroSparc 2 and most PA RISC processors use BTFN With appropriate instruction set hooks the compiler can improve branch prediction accuracy Because it has access to the source code a good compiler can recognize code sequences that are likely to branch such as loops and those that are unlikely to branch such as exception checking Current MIPS and PowerPC chips among others implement special branch instructions that encode the compiler s prediction in a single bit Compilers can take further advantage of these predicted branch instructions by using a technique called profiling or feedback directed compilation After the program is initially compiled it is run using test data to determine the typical direction of each branch the program is then recompiled to adjust the branch prediction bits According to IBM its compilers achieve 75 accuracy on SPECint92 using this technique Dynamic Prediction Uses History The previous algorithms are classified as static schemes because any particular branch is always predicted in the same way whenever it is encountered To achieve greater accuracy dynamic algorithms take into account run time information The processor learns from its mistakes and changes its predictions to match the behavior of each particular branch A dynamic algorithm keeps a record of previous branch behavior allowing it to improve its predictions over time A simple scheme published by James Smith in 1981 2 maintains a single history bit for each branch When a branch is encountered it is predicted to go the Vol 9 No 4 March 27 1995 1995 MicroDesign Resources MICROPROCESSOR REPORT ST 3 Predict Taken Not Taken Not Taken WT WNT 2 1 Taken Taken Predict Not Taken SNT 0 Not Taken Taken Not Taken Taken Figure 1 In the two bit Smith algorithm the two history bits implement a state machine with four possible states strongly taken ST weakly taken WT weakly not taken WNT and strongly not taken SNT In ST and WT future branches are predicted taken in WNT and SNT branches are predicted not taken same way it did the previous time as indicated by the bit This technique can push accuracy to 80 As a practical matter there are two ways to implement this scheme The history bits can be kept in the instruction cache for example one per every four instructions When instructions are fetched from the cache the history bit comes along If the bit is set that group of instructions contains a predicted taken branch and the fetch stream is redirected In this example the storage overhead would be less than 1 of the cache area Although this method used by Digital s Alpha AMD s K5 and other processors provides dynamic prediction with minimal cost it has some drawbacks Some groups of instructions will not contain a branch wasting the history bit Groups with multiple branches create interference as the history of one branch overwrites that of another in the same group Processors such as Pentium store the history bits in a separate branch history table BHT assigning one entry per branch By avoiding the interference and unused bits of


View Full Document

CMU CS 15213 - mpr-branchpredict

Documents in this Course
lecture

lecture

14 pages

lecture

lecture

46 pages

Caches

Caches

9 pages

lecture

lecture

39 pages

Lecture

Lecture

36 pages

Lecture

Lecture

45 pages

Lecture

Lecture

56 pages

lecture

lecture

11 pages

lecture

lecture

9 pages

Lecture

Lecture

36 pages

Lecture

Lecture

37 pages

Exam

Exam

16 pages

Lecture

Lecture

10 pages

Lecture

Lecture

43 pages

Lecture

Lecture

8 pages

Lecture

Lecture

8 pages

Lecture

Lecture

36 pages

Lecture

Lecture

43 pages

Lecture

Lecture

12 pages

Lecture

Lecture

37 pages

Lecture

Lecture

6 pages

Lecture

Lecture

40 pages

coding

coding

2 pages

Exam

Exam

17 pages

Exam

Exam

14 pages

Lecture

Lecture

29 pages

Lecture

Lecture

34 pages

Exam

Exam

11 pages

Lecture

Lecture

9 pages

Lecture

Lecture

37 pages

Lecture

Lecture

36 pages

lecture

lecture

46 pages

Lecture

Lecture

33 pages

Lecture

Lecture

57 pages

Lecture

Lecture

32 pages

Lecture

Lecture

46 pages

Lecture

Lecture

40 pages

Lecture

Lecture

11 pages

Lecture

Lecture

6 pages

Lecture

Lecture

43 pages

Lecture

Lecture

12 pages

Lecture

Lecture

18 pages

Exam

Exam

10 pages

Lecture

Lecture

45 pages

Lecture

Lecture

37 pages

Exam

Exam

24 pages

class09

class09

21 pages

class22

class22

37 pages

class20

class20

30 pages

class27

class27

33 pages

class25

class25

21 pages

class04

class04

31 pages

Lecture

Lecture

59 pages

class01a

class01a

14 pages

class12

class12

45 pages

class29

class29

33 pages

Lecture

Lecture

39 pages

Lecture

Lecture

6 pages

class03

class03

34 pages

lecture

lecture

42 pages

Lecture

Lecture

40 pages

Lecture

Lecture

47 pages

Exam

Exam

19 pages

R06-B

R06-B

25 pages

class17

class17

37 pages

class25

class25

31 pages

Lecture

Lecture

15 pages

final-f06

final-f06

17 pages

Lecture

Lecture

9 pages

lecture

lecture

9 pages

Exam

Exam

15 pages

Lecture

Lecture

22 pages

class11

class11

45 pages

lecture

lecture

50 pages

Linking

Linking

37 pages

Lecture

Lecture

64 pages

Integers

Integers

40 pages

Exam

Exam

11 pages

Lecture

Lecture

37 pages

Lecture

Lecture

44 pages

Lecture

Lecture

37 pages

Lecture

Lecture

9 pages

Lecture

Lecture

37 pages

Lecture

Lecture

45 pages

Final

Final

25 pages

lecture

lecture

9 pages

Lecture

Lecture

30 pages

Lecture

Lecture

16 pages

Final

Final

17 pages

Lecture

Lecture

8 pages

Exam

Exam

11 pages

Lecture

Lecture

47 pages

Lecture

Lecture

9 pages

lecture

lecture

39 pages

Exam

Exam

11 pages

lecture

lecture

41 pages

lecture

lecture

37 pages

Lecture

Lecture

59 pages

Lecture

Lecture

45 pages

Exam 1

Exam 1

18 pages

Lecture

Lecture

41 pages

Lecture

Lecture

32 pages

Lecture

Lecture

30 pages

Lecture

Lecture

9 pages

Lecture

Lecture

9 pages

Lecture

Lecture

15 pages

Lecture

Lecture

11 pages

Lecture

Lecture

9 pages

Lecture

Lecture

34 pages

Lecture

Lecture

40 pages

Lecture

Lecture

4 pages

Lecture

Lecture

46 pages

Lecture

Lecture

8 pages

Lecture

Lecture

65 pages

Lecture

Lecture

38 pages

Lecture

Lecture

35 pages

Lecture

Lecture

8 pages

Lecture

Lecture

34 pages

Lecture

Lecture

8 pages

Exam

Exam

13 pages

Lecture

Lecture

43 pages

Lecture

Lecture

9 pages

Lecture

Lecture

12 pages

Lecture

Lecture

9 pages

Lecture

Lecture

34 pages

Lecture

Lecture

43 pages

Lecture

Lecture

7 pages

Lecture

Lecture

45 pages

Lecture

Lecture

24 pages

Lecture

Lecture

47 pages

Lecture

Lecture

12 pages

Lecture

Lecture

20 pages

Lecture

Lecture

9 pages

Exam

Exam

11 pages

Lecture

Lecture

52 pages

Lecture

Lecture

20 pages

Exam

Exam

11 pages

Lecture

Lecture

35 pages

Lecture

Lecture

47 pages

Lecture

Lecture

18 pages

Lecture

Lecture

30 pages

Lecture

Lecture

59 pages

Lecture

Lecture

37 pages

Lecture

Lecture

22 pages

Lecture

Lecture

35 pages

Exam

Exam

23 pages

Lecture

Lecture

9 pages

Lecture

Lecture

22 pages

class12

class12

32 pages

Lecture

Lecture

8 pages

Lecture

Lecture

39 pages

Lecture

Lecture

44 pages

Lecture

Lecture

38 pages

Lecture

Lecture

69 pages

Lecture

Lecture

41 pages

Lecture

Lecture

12 pages

Lecture

Lecture

52 pages

Lecture

Lecture

59 pages

Lecture

Lecture

39 pages

Lecture

Lecture

83 pages

Lecture

Lecture

59 pages

class01b

class01b

17 pages

Exam

Exam

21 pages

class07

class07

47 pages

Lecture

Lecture

11 pages

Odyssey

Odyssey

18 pages

multicore

multicore

66 pages

Lecture

Lecture

6 pages

lecture

lecture

41 pages

lecture

lecture

55 pages

lecture

lecture

52 pages

lecture

lecture

33 pages

lecture

lecture

46 pages

lecture

lecture

55 pages

lecture

lecture

17 pages

lecture

lecture

49 pages

Exam

Exam

17 pages

lecture

lecture

56 pages

Exam 2

Exam 2

16 pages

Exam 2

Exam 2

16 pages

Notes

Notes

37 pages

Lecture

Lecture

40 pages

Lecture

Lecture

36 pages

Lecture

Lecture

43 pages

Lecture

Lecture

25 pages

Exam

Exam

13 pages

Lecture

Lecture

32 pages

Lecture

Lecture

12 pages

Lecture

Lecture

58 pages

Lecture

Lecture

29 pages

Lecture

Lecture

59 pages

Lecture

Lecture

41 pages

Lecture

Lecture

50 pages

Exam

Exam

17 pages

Lecture

Lecture

29 pages

Lecture

Lecture

44 pages

Lecture

Lecture

41 pages

Lecture

Lecture

52 pages

Lecture

Lecture

40 pages

Lecture

Lecture

33 pages

lecture

lecture

10 pages

Lecture

Lecture

27 pages

Lecture

Lecture

29 pages

Lecture

Lecture

39 pages

Lecture

Lecture

9 pages

Lecture

Lecture

29 pages

Lecture

Lecture

8 pages

Lecture

Lecture

43 pages

Lecture

Lecture

43 pages

Lecture

Lecture

75 pages

Lecture

Lecture

55 pages

Exam

Exam

12 pages

Lecture

Lecture

43 pages

Lecture

Lecture

35 pages

lecture

lecture

36 pages

Exam

Exam

33 pages

lecture

lecture

56 pages

lecture

lecture

64 pages

lecture

lecture

8 pages

Exam

Exam

14 pages

Lecture

Lecture

43 pages

Lecture

Lecture

36 pages

lecture

lecture

56 pages

lecture

lecture

75 pages

lecture

lecture

36 pages

Lecture

Lecture

50 pages

Lecture

Lecture

45 pages

Lecture

Lecture

13 pages

Exam

Exam

23 pages

Lecture

Lecture

10 pages

Lecture

Lecture

48 pages

Lecture

Lecture

83 pages

lecture

lecture

57 pages

Lecture

Lecture

33 pages

Lecture

Lecture

39 pages

Lecture

Lecture

33 pages

lecture

lecture

54 pages

Lecture

Lecture

30 pages

Exam

Exam

13 pages

Lecture

Lecture

36 pages

Lecture

Lecture

40 pages

Exam

Exam

17 pages

Lecture

Lecture

9 pages

Exam

Exam

15 pages

Lecture

Lecture

44 pages

Lecture

Lecture

34 pages

Lecture

Lecture

24 pages

Lecture

Lecture

29 pages

class12

class12

43 pages

lecture

lecture

43 pages

class22

class22

22 pages

R06-B

R06-B

25 pages

class01b

class01b

19 pages

lecture

lecture

29 pages

lab1

lab1

8 pages

Caches

Caches

36 pages

lecture

lecture

55 pages

Lecture,

Lecture,

37 pages

Integers

Integers

40 pages

Linking

Linking

38 pages

lecture

lecture

45 pages

Lecture

Lecture

61 pages

Linking

Linking

33 pages

lecture

lecture

40 pages

lecture

lecture

40 pages

Lecture

Lecture

32 pages

lecture

lecture

48 pages

lecture

lecture

44 pages

Exam

Exam

11 pages

Lecture

Lecture

31 pages

Lecture

Lecture

46 pages

Lecture

Lecture

40 pages

Lecture

Lecture

40 pages

Exam

Exam

12 pages

Lecture

Lecture

42 pages

Lecture

Lecture

36 pages

Lecture

Lecture

45 pages

Lecture

Lecture

41 pages

Lecture

Lecture

13 pages

Lecture

Lecture

35 pages

Lecture

Lecture

20 pages

Final

Final

19 pages

Lecture

Lecture

33 pages

Lecture

Lecture

50 pages

Lecture

Lecture

33 pages

Lecture

Lecture

27 pages

Lecture

Lecture

6 pages

Exam

Exam

15 pages

Lecture

Lecture

24 pages

Lecture

Lecture

23 pages

Lecture

Lecture

43 pages

Lecture

Lecture

32 pages

Lecture

Lecture

52 pages

Lecture

Lecture

37 pages

Lecture

Lecture

36 pages

Lecture

Lecture

34 pages

Lecture

Lecture

40 pages

Lecture

Lecture

15 pages

lecture

lecture

21 pages

Lecture

Lecture

58 pages

Lecture

Lecture

49 pages

Lecture

Lecture

36 pages

Lecture

Lecture

11 pages

Lecture

Lecture

12 pages

Lecture

Lecture

58 pages

Lecture

Lecture

33 pages

Exam

Exam

15 pages

Lecture

Lecture

35 pages

Lecture

Lecture

10 pages

Lecture

Lecture

25 pages

Lecture

Lecture

31 pages

Lecture

Lecture

24 pages

Lecture

Lecture

34 pages

Lecture

Lecture

50 pages

lecture

lecture

35 pages

Lecture

Lecture

11 pages

Lecture

Lecture

39 pages

Lecture

Lecture

45 pages

Lecture

Lecture

41 pages

exam1-f05

exam1-f05

11 pages

Lecture

Lecture

4 pages

Lecture

Lecture

17 pages

Exam

Exam

17 pages

malloc()

malloc()

12 pages

Lecture

Lecture

57 pages

Lecture

Lecture

30 pages

Lecture

Lecture

30 pages

Lecture

Lecture

47 pages

Lecture

Lecture

33 pages

Exam

Exam

12 pages

Lecture

Lecture

43 pages

Lectures

Lectures

33 pages

Lecture

Lecture

36 pages

lecture

lecture

33 pages

Exam

Exam

14 pages

Lecture

Lecture

43 pages

Lecture

Lecture

25 pages

Load more
Download mpr-branchpredict
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view mpr-branchpredict and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view mpr-branchpredict and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?