DOC PREVIEW
Berkeley COMPSCI 252 - Instruction Level Parallelism and Dynamic Execution

This preview shows page 1-2-3-4 out of 13 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Page 1CS252/CullerLec 15. 13/12/02CS252Graduate Computer ArchitectureLecture 15: Instruction Level Parallelism and Dynamic ExecutionMarch 11, 2002Prof. David E. CullerComputer Science 252Spring 2002CS252/CullerLec 15. 23/12/02Recall from Pipelining Review• Pipeline CPI = Ideal pipeline CPI + Structural Stalls + Data Hazard Stalls + Control Stalls– Ideal pipeline CPI: measure of the maximum performance attainable by the implementation– Structural hazards: HW cannot support this combination of instructions– Data hazards: Instruction depends on result of prior instruction still in the pipeline– Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)CS252/CullerLec 15. 33/12/02Recall Data Hazard Resolution: In-order issue, in-order completionTime (clock cycles)or r8, r2,r9Instr.Orderlw r1, 0(r2)sub r4,r1,r6and r6,r2,r7RegALUDMemIfetchRegRegIfetchALUDMem RegBubbleIfetchALUDMem RegBubbleRegIfetchALUDMemBubbleRegExtend to Multiple instruction issue?What if load had longer delay? Can and issue?CS252/CullerLec 15. 43/12/02In-Order Issue, Out-of-order Completion• Which hazards are present? RAW? WAR? WAW?• load r3 <- r1, r2• add r1 <- r5, r2• sub r3 <- r3, r1 or r3 <- r2, r1• Register Reservations– when issue mark destination register busy till complete– check all register reservations before issueRegALUIfetch RegAddDMem RegDMem’CS252/CullerLec 15. 53/12/02Ideas to Reduce StallsTechnique ReducesDynamic scheduling Data h azard stallsDynamic branchpredictionControl stallsIssuing multipleinstructions per cycleIdeal CPISpeculat ion Data and control stallsDynamic memorydisambiguationData h azard stalls involvingmemoryLoop unrolling Control hazard stallsBasic compiler p ipel inesche dulingData h azard stallsCompiler dependenceanalysisIdeal CPI and data hazard stallsSoftware pi pelining andtrace schedulingIdeal CPI and data hazard stallsCompiler speculation Ideal CPI, data and control stallsChapter 3Chapter 4CS252/CullerLec 15. 63/12/02Instruction-Level Parallelism (ILP)• Basic Block (BB) ILP is quite small– BB: a straight-line code sequence with no branches in except to the entry and no branches out except at the exit– average dynamic branch frequency 15% to 25% => 4 to 7 instructions execute between a pair of branches– Plus instructions in BB likely to depend on each other• To obtain substantial performance enhancements, we must exploit ILP across multiple basic blocks• Simplest: loop-level parallelism to exploit parallelism among iterations of a loop– Vector is one way– If not vector, then either dynamic via branch prediction or static via loop unrolling by compilerPage 2CS252/CullerLec 15. 73/12/02• InstrJis data dependent on InstrIInstrJtries to read operand before InstrIwrites it• or InstrJis data dependent on InstrKwhich is dependent on InstrI• Caused by a “True Dependence” (compiler term) • If true dependence caused a hazard in the pipeline, called a Read After Write (RAW) hazard Data Dependence and HazardsI: add r1,r2,r3J: sub r4,r1,r3CS252/CullerLec 15. 83/12/02• Dependences are a property of programs• Presence of dependence indicates potential for a hazard, but actual hazard and length of any stall is a property of the pipeline• Importance of the data dependencies1) indicates the possibility of a hazard2) determines order in which results must be calculated3) sets an upper bound on how much parallelism can possibly be exploited• Today looking at HW schemes to avoid hazardData Dependence and HazardsCS252/CullerLec 15. 93/12/02• Name dependence: when 2 instructions use same register or memory location, called a name, but no flow of data between the instructions associated with that name; 2 versions of name dependence• InstrJwrites operand before InstrIreads itCalled an “anti-dependence” in compiler work.This results from reuse of the name “r1”• If anti-dependence caused a hazard in the pipeline, called a Write After Read (WAR) hazardI: sub r4,r1,r3 J: add r1,r2,r3K: mul r6,r1,r7Name Dependence #1: Anti-dependenceCS252/CullerLec 15. 103/12/02Name Dependence #2: Output dependence• InstrJwrites operand before InstrIwrites it.• Called an “output dependence” by compiler writersThis also results from the reuse of name “r1”• If anti-dependence caused a hazard in the pipeline, called a Write After Write (WAW) hazardI: sub r1,r4,r3 J: add r1,r2,r3K: mul r6,r1,r7CS252/CullerLec 15. 113/12/02ILP and Data Hazards• program order: order instructions would execute in if executed sequentially 1 at a time as determined by original source program• HW/SW goal: exploit parallelism by preserving appearance of program order – modify order in manner than cannot be observed by program– must not affect the outcome of the program• Ex: Instructions involved in a name dependence can execute simultaneously if name used in instructions is changed so instructions do not conflict– Register renaming resolves name dependence for regs– Either by compiler or by HW– add r1, r2, r3– sub r2, r4,r5– and r3, r2, 1CS252/CullerLec 15. 123/12/02Control Dependencies• Every instruction is control dependent on some set of branches, and, in general, these control dependencies must be preserved to preserve program orderif p1 {S1;};if p2 {S2;}• S1 is control dependent on p1, and S2 is control dependent on p2 but not on p1.Page 3CS252/CullerLec 15. 133/12/02Control Dependence Ignored• Control dependence need not always be preserved– willing to execute instructions that should not have been executed, thereby violating the control dependences, if can do so without affecting correctness of the program • Instead, 2 properties critical to program correctness are exception behavior and data flowCS252/CullerLec 15. 143/12/02Exception Behavior• Preserving exception behavior => any changes in instruction execution order must not change how exceptions are raised in program (=> no new exceptions)• Example:DADDU R2,R3,R4BEQZ R2,L1LW R1,0(R2)L1:• Problem with moving LW before BEQZ?CS252/CullerLec 15. 153/12/02Data Flow• Data flow: actual flow of data values among instructions that produce results and those that consume them– branches make flow dynamic, determine which instruction is supplier of data• Example:DADDU R1,R2,R3BEQZ R4,LDSUBU R1,R5,R6L: …OR R7,R1,R8• OR depends on DADDU or DSUBU? Must preserve data flow on


View Full Document

Berkeley COMPSCI 252 - Instruction Level Parallelism and Dynamic Execution

Documents in this Course
Quiz

Quiz

9 pages

Caches I

Caches I

46 pages

Lecture 6

Lecture 6

36 pages

Lecture 9

Lecture 9

52 pages

Figures

Figures

26 pages

Midterm

Midterm

15 pages

Midterm

Midterm

14 pages

Midterm I

Midterm I

15 pages

ECHO

ECHO

25 pages

Quiz  1

Quiz 1

12 pages

Load more
Download Instruction Level Parallelism and Dynamic Execution
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Instruction Level Parallelism and Dynamic Execution and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Instruction Level Parallelism and Dynamic Execution 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?