
Understanding the Differences Between Value Prediction and Instruction Reuse

Avinash Sodani and Gurindar S. Sohi
Computer Sciences Department, University of Wisconsin-Madison
1210 West Dayton Street, Madison, WI 53706, USA
{sodani, sohi}@cs.wisc.edu

Abstract

Recently, two hardware techniques, Value Prediction (VP) and Instruction Reuse (IR), have been proposed for exploiting the redundancy in programs to collapse data dependences. In this paper, we attempt to understand the different ways in which VP and IR interact with other microarchitectural features, and the impact of such interactions on net performance. More specifically, we perform the following tasks: (i) we identify the various differences between the two techniques and qualitatively discuss their microarchitectural interactions, (ii) we evaluate the impact on performance of these interactions, and (iii) since IR is the more restrictive of the two techniques, we also estimate the amount of total redundancy present in programs that can be captured by IR. Our results show that the performance obtained by VP is sensitive to the way branches with value-speculative operands are handled. We also see that although IR captures less redundancy, it may perform equally well because it validates results early, it is non-speculative, and it reduces the branch misprediction penalty. Finally, we show that 84-97% of redundancy in programs can be reused, implying that the approach of detecting redundant instructions non-speculatively, based on their operands, does not significantly restrict IR's ability to capture redundancy present in programs.

1 Introduction

Several recent studies [2, 5, 8, 10] have shown that there is significant result redundancy in programs, i.e., many instructions perform the same computation and hence produce the same result over and over again. These studies have found that for several benchmarks more than 75% of the dynamic instructions produce the same result as before. Also, recently two hardware techniques have been proposed to exploit this redundancy: (i) Value Prediction (VP) [3, 4, 5] and (ii) Instruction Reuse (IR) [9]. Both techniques attempt to reduce the execution time of programs by alleviating the dataflow constraint. They use the redundancy in programs to determine speculatively (Value Prediction) or non-speculatively (Instruction Reuse) the results of instructions without actually executing them. The advantage of doing so is that instructions do not have to wait for their source instructions to execute first; they can execute sooner using the results obtained by the above two techniques, thus relaxing the dataflow constraint.

Although both VP and IR attempt to shorten the critical path through a computation, they follow very different approaches. VP predicts the results of instructions (or, alternatively, the inputs of other instructions) based on the previously seen results, performs computation using the predicted values, and confirms the speculation at a later point. The critical path is shortened since the instructions that would normally be executed sequentially could be executed speculatively in parallel. On the other hand, IR recognizes that a certain computation chain has been performed before and therefore need not be performed again, i.e., it splices out a chain of computation from the critical path.

The effectiveness of any microarchitectural technique in improving the net performance of a processor depends not only on how well it performs by itself, but also on how it interacts with other microarchitectural features (e.g., branch prediction, availability of resources) when it is integrated in a pipeline. Since VP and IR are different techniques, they not only perform differently by themselves (i.e., capture different amounts of the redundancy present in programs), but also interact with other microarchitectural features in different ways, thereby impacting the net performance differently. The purpose of this work is to identify and evaluate the different microarchitectural interactions of these techniques. The intent is not to argue which technique is better, but to gain a better understanding of the working of each technique. We feel that this will help in designing other techniques (possibly hybrids of VP and IR) that exploit the redundancy in programs more profitably.

More specifically, in this paper we achieve the following three tasks: (i) We identify the various differences between the two techniques and qualitatively discuss their microarchitectural interactions. (ii) We evaluate the impact on performance of these interactions. And finally, (iii) since IR is the more restrictive of the two techniques (we discuss this later), we also estimate how much of the total redundancy present in programs can be captured by IR.

The layout for the rest of the paper is as follows. In Section 2, we describe VP and IR in more detail. In Section 3, we identify the various differences between them and qualitatively discuss various interactions and their impacts on performance. In Section 4, we evaluate these interactions quantitatively. Finally, in Section 5, we summarize and provide conclusions.

[Figure residue: pipeline diagram with stages Fetch, Decode, Rename, Issue, and Execute; surviving labels are "VPT Access (PC)", "prediction", and "RB Access (PC)".]

When an instruction is first executed, its results are stored in a hardware structure called a Reuse Buffer (RB), indexed by its PC. When the instruction is encountered again, its previous results are read from the RB (in parallel with fetching the instruction) and their validity is established by a reuse test (in parallel with decoding the instruction). The reuse test validates results by establishing that the current operand values are the same as those used to calculate the results. There are different ways of doing so, one of which is described later in Section 4.1.2 of this paper. Since the correct results are known, a reused instruction is not executed; instead, it is queued for retirement. IR collapses true dependences by reusing, in the same cycle, a dependent chain of instructions that would normally execute sequentially.

In Figure 2, we illustrate how VP and IR improve performance by collapsing data dependences. In the figure, we show the flow of a dependent chain of instructions (I, J, and K) through three different pipelines: (i) a base pipeline without VP or IR, (ii) a pipeline with VP, and (iii) a pipeline with IR. In all three cases, we assume the instructions I, J, and K are fetched, decoded, and renamed together. In the base pipeline, the instructions execute sequentially since they are data dependent, requiring three cycles to execute them; the chain is committed by cycle 6. In the pipeline
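
The Reuse Buffer lookup and reuse test described above can be illustrated with a small software model. The following is only a minimal sketch under stated assumptions, not the hardware scheme evaluated in the paper: it assumes an unbounded table with one entry per PC, and it assumes the reuse test compares the current operand values against the stored ones (the text notes that there are different ways to perform the test). The names `ReuseEntry`, `ReuseBuffer`, and `run` are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class ReuseEntry:
    operands: Tuple[int, ...]  # operand values seen when the result was computed
    result: int                # result produced by that execution


class ReuseBuffer:
    """Toy Reuse Buffer (RB) indexed by instruction PC.

    Assumption: unbounded capacity and a single entry per PC; a real RB
    is a finite hardware table.
    """

    def __init__(self) -> None:
        self.entries: Dict[int, ReuseEntry] = {}

    def lookup(self, pc: int, operands: Tuple[int, ...]) -> Optional[int]:
        """Return the stored result if the reuse test passes, else None.

        Assumed reuse test: the current operand values must equal the
        operand values recorded with the stored result.
        """
        entry = self.entries.get(pc)
        if entry is not None and entry.operands == tuple(operands):
            return entry.result
        return None

    def update(self, pc: int, operands: Tuple[int, ...], result: int) -> None:
        """Record the operands and result of an executed instruction."""
        self.entries[pc] = ReuseEntry(tuple(operands), result)


def run(rb: ReuseBuffer, pc: int, operands: Tuple[int, ...],
        compute: Callable[..., int]) -> Tuple[int, bool]:
    """Process one dynamic instruction; return (result, was_reused)."""
    reused = rb.lookup(pc, operands)
    if reused is not None:
        # Reuse test passed: the result is known to be correct, so the
        # instruction is not executed.
        return reused, True
    # Reuse test failed or no entry: execute normally and record the outcome.
    result = compute(*operands)
    rb.update(pc, operands, result)
    return result, False
```

For example, calling `run(rb, 0x40, (1, 2), lambda a, b: a + b)` twice on a fresh buffer executes the addition the first time and reuses the stored result the second time; changing the operands to `(2, 2)` fails the reuse test and forces execution. Because the operands are checked before the stored result is used, a hit in this model is non-speculative and needs no later verification, which is the property the text attributes to IR.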



CMU CS 15740 - Understanding the Differences Between Value Prediction
