MIT OpenCourseWare
http://ocw.mit.edu

6.189 Multicore Programming Primer, January (IAP) 2007

Please use the following citation format:

Saman Amarasinghe, 6.189 Multicore Programming Primer, January (IAP) 2007. (Massachusetts Institute of Technology: MIT OpenCourseWare). http://ocw.mit.edu (accessed MM DD, YYYY). License: Creative Commons Attribution-Noncommercial-Share Alike.

Note: Please use the actual date you accessed this material in your citation.

For more information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms

6.189 IAP 2007
Lecture 3
Introduction to Parallel Architectures
Prof. Saman Amarasinghe, MIT.

Implicit vs. Explicit Parallelism

Implicit    Hardware    Superscalar Processors
Explicit    Compiler    Explicitly Parallel Architectures

Outline
● Implicit Parallelism: Superscalar Processors
● Explicit Parallelism
● Shared Instruction Processors
● Shared Sequencer Processors
● Shared Network Processors
● Shared Memory Processors
● Multicore Processors

Implicit Parallelism: Superscalar Processors
● Issue varying numbers of instructions per clock
  ○ statically scheduled
    – using compiler techniques
    – in-order execution
  ○ dynamically scheduled
    – Extracting ILP by examining 100’s of instructions
    – Scheduling them in parallel as operands become available
    – Rename registers to eliminate anti dependences
    – out-of-order execution
    – Speculative execution

Pipelining Execution
IF: Instruction fetch    ID: Instruction decode    EX: Execution    WB: Write back

                  Cycles
Instruction #     1    2    3    4    5    6    7    8
Instruction i     IF   ID   EX   WB
Instruction i+1        IF   ID   EX   WB
Instruction i+2             IF   ID   EX   WB
Instruction i+3                  IF   ID   EX   WB
Instruction i+4                       IF   ID   EX   WB
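
The pipelining diagram can be reproduced with a short sketch. This is an idealized model, not anything taken from the lecture: it assumes four single-cycle stages (IF, ID, EX, WB), no stalls, and no hazards. The names `pipeline_schedule`, `total_cycles`, `render`, and the `issue_width` parameter are hypothetical and introduced only for illustration.

```python
# Idealized 4-stage pipeline model: every stage takes one cycle, no stalls.
STAGES = ["IF", "ID", "EX", "WB"]


def pipeline_schedule(num_instructions, issue_width=1):
    """Return one dict per instruction mapping stage name -> 1-based cycle.

    issue_width is an assumption for illustration: it is the number of
    instructions fetched together in the same cycle.
    """
    schedule = []
    for i in range(num_instructions):
        start = i // issue_width + 1  # cycle in which instruction i is fetched
        schedule.append({stage: start + k for k, stage in enumerate(STAGES)})
    return schedule


def total_cycles(num_instructions, issue_width=1):
    """Cycle in which the last instruction writes back."""
    return max(row["WB"] for row in pipeline_schedule(num_instructions, issue_width))


def render(schedule):
    """Format the schedule as a text diagram, one row per instruction."""
    last = max(row["WB"] for row in schedule)
    lines = []
    for i, row in enumerate(schedule):
        by_cycle = {cycle: stage for stage, cycle in row.items()}
        cells = [by_cycle.get(c, "").ljust(3) for c in range(1, last + 1)]
        lines.append(f"i+{i}  " + " ".join(cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(pipeline_schedule(5)))
    print("cycles for 5 instructions:", total_cycles(5))
```

Under these assumptions, five instructions finish in 8 cycles rather than the 20 an unpipelined machine would need with the same stage latency. Setting `issue_width=2` models an idealized machine that fetches two instructions per cycle.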

Super-Scalar Execution

                    Cycles
Instruction type    1    2    3    4    5    6    7
Integer             IF   ID   EX   WB
Floating point      IF   ID   EX   WB
Integer                  IF   ID   EX   WB
Floating point           IF   ID   EX   WB
Integer                       IF   ID   EX   WB
Floating point                IF   ID   EX   WB
Integer                            IF   ID   EX   WB
Floating point                     IF   ID   EX   WB

2-issue super-scalar machine

Data Dependence and Hazards
● InstrJ is data dependent (aka true dependence) on InstrI:
    I: add r1,r2,r3
    J: sub r4,r1,r3
● If two instructions are data dependent, they cannot execute simultaneously, be completely overlapped, or execute out of order
● If a data dependence causes a hazard in the pipeline, it is called a Read After Write (RAW) hazard

ILP and Data Dependencies, Hazards
● HW/SW must preserve program order: the order in which instructions would execute if executed sequentially, as determined by the original source program
  ○ Dependences are a property of programs
● Importance of the data dependencies
  ○ 1) indicates the possibility of a hazard
  ○ 2) determines order in which results must be calculated
  ○ 3) sets an upper bound on how much parallelism can possibly be exploited
● Goal: exploit parallelism by preserving program order only where it affects the outcome of the program

Name Dependence #1: Anti-dependence
● Name dependence: when 2 instructions use the same register or memory location, called a name, but there is no flow of data between the instructions associated with that name; there are 2 versions of name dependence
● InstrJ writes operand before InstrI reads it
    I: sub r4,r1,r3
    J: add r1,r2,r3
    K: mul r6,r1,r7
  Called an “anti-dependence” by compiler writers. This results from reuse of the name “r1”
● If an anti-dependence causes a hazard in the pipeline, it is called a Write After Read (WAR) hazard

Name Dependence #2: Output dependence
● InstrJ writes operand before InstrI writes it.
    I: sub r1,r4,r3
    J: add r1,r2,r3
    K: mul r6,r1,r7
● Called an “output dependence” by compiler writers. This also results from the reuse of name “r1”
● If an output dependence causes a hazard in the pipeline, it is called a Write After Write (WAW) hazard
● Instructions involved in a name dependence can execute simultaneously if the name used in the instructions is changed so the instructions do not conflict
  ○ Register renaming resolves name dependence for registers
  ○ Renaming can be done either by compiler or by HW

Control Dependencies
● Every instruction is control dependent on some set of branches, and, in general, these control dependencies must be preserved to preserve program order
    if p1 { S1; };
    if p2 { S2; }
● S1 is control dependent on p1, and S2 is control dependent on p2 but not on p1.
● Control dependence need not be preserved
  ○ willing to execute instructions that should not have been executed, thereby violating the control dependences, if this can be done without affecting correctness of the program
● Speculative Execution

Speculation
● Greater ILP: Overcome control dependence by hardware speculating on outcome of branches and executing program as if guesses were correct
  ○ Speculation ⇒ fetch, issue, and execute instructions as if branch predictions were always correct
  ○ Dynamic scheduling ⇒ only fetches and issues instructions
● Essentially a data flow execution model: Operations execute as soon as their operands are available

Speculation is Rampant in Modern Superscalars
● Different predictors
  ○ Branch Prediction
  ○ Value Prediction
  ○ Prefetching (memory access pattern prediction)
● Inefficient
  ○ Predictions can go wrong
  ○ Has to flush out wrongly predicted data
  ○ While not impacting performance, it consumes power
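
The RAW, WAR, and WAW hazards and the effect of register renaming described in the dependence slides can be illustrated with a small sketch. It is a simplification under stated assumptions: instructions have one destination and two source registers, only registers are tracked (no memory), and the supply of fresh registers is unlimited. The `Instr` tuple, `classify`, `rename`, and the `p0, p1, ...` register names are hypothetical and are not taken from the lecture.

```python
from collections import namedtuple

# Hypothetical three-operand instruction: dest = op(src1, src2)
Instr = namedtuple("Instr", "label op dest src1 src2")


def classify(first, second):
    """Return the hazards 'second' has with respect to the earlier 'first'."""
    hazards = set()
    if first.dest in (second.src1, second.src2):
        hazards.add("RAW")  # true (data) dependence: second reads first's result
    if second.dest in (first.src1, first.src2):
        hazards.add("WAR")  # anti-dependence: second overwrites a source of first
    if second.dest == first.dest:
        hazards.add("WAW")  # output dependence: both write the same name
    return hazards


def rename(program):
    """Give every write a fresh register name (p0, p1, ...).

    Assumes an unlimited pool of fresh names; real hardware has a finite
    physical register file.
    """
    mapping = {}  # architectural name -> most recent fresh name
    renamed = []
    for count, ins in enumerate(program):
        # Sources read the latest mapping; unmapped names are live-in values.
        src1 = mapping.get(ins.src1, ins.src1)
        src2 = mapping.get(ins.src2, ins.src2)
        fresh = f"p{count}"
        mapping[ins.dest] = fresh
        renamed.append(Instr(ins.label, ins.op, fresh, src1, src2))
    return renamed


# The two three-instruction sequences from the name-dependence slides.
ANTI = [
    Instr("I", "sub", "r4", "r1", "r3"),
    Instr("J", "add", "r1", "r2", "r3"),
    Instr("K", "mul", "r6", "r1", "r7"),
]
OUTPUT = [
    Instr("I", "sub", "r1", "r4", "r3"),
    Instr("J", "add", "r1", "r2", "r3"),
    Instr("K", "mul", "r6", "r1", "r7"),
]

if __name__ == "__main__":
    for name, prog in (("anti", ANTI), ("output", OUTPUT)):
        before = classify(prog[0], prog[1])
        after = classify(*rename(prog)[:2])
        print(name, "I->J before:", before, "after renaming:", after)
```

In this sketch, renaming removes the WAR and WAW hazards between I and J, while the true dependence from J to K survives because K's source is rewritten to J's fresh destination.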

Today’s CPU Architecture: Heat becoming an unmanageable problem
Intel Developer Forum, Spring 2004 - Pat Gelsinger
Cube relationship between the cycle time and power (Pentium at 90 W)

[Figure: Power density (W/cm²), log scale from 1 to 10,000, versus year ('70 to '10) for the 4004, 8008, 8080, 8085, 8086, 286, 386, and 486 processors, with reference levels marked Hot Plate, Nuclear Reactor, Rocket Nozzle, and Sun's Surface. Image by MIT OpenCourseWare.]

Pentium-IV
● Pipelined
  ○ minimum of 11 stages for any instruction
● Instruction-Level

