TAMU CSCE 350 - slide11

Introduction to Pipelining
Adapted from the lecture notes of Dr. John Kubiatowicz (UC Berkeley)

Pipelining is Natural!
• Laundry example
• Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, and fold
• Washer takes 30 minutes
• Dryer takes 40 minutes
• “Folder” takes 20 minutes
[Figure: four loads of laundry labeled A, B, C, D]

Sequential Laundry
• Sequential laundry takes 6 hours for 4 loads
[Figure: timeline from 6 PM to midnight; vertical axis: task order (A, B, C, D); horizontal axis: time; each load takes 30, 40, and 20 minutes, one load after another]

Pipelined Laundry: Start work ASAP
• Pipelined laundry takes 3.5 hours for 4 loads
[Figure: timeline from 6 PM to midnight; vertical axis: task order (A, B, C, D); horizontal axis: time; intervals of 30, 40, 40, 40, 40, and 20 minutes]

Pipelining Lessons
• Latency vs. throughput
• Question
  – What is the latency in both cases?
  – What is the throughput in both cases?
Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload.

Pipelining Lessons [contd…]
• Question
  – What is the fastest operation in the example?
  – What is the slowest operation in the example?
Pipeline rate is limited by the slowest pipeline stage.

Pipelining Lessons [contd…]
Multiple tasks operate simultaneously using different resources.

Pipelining Lessons [contd…]
• Question
  – Would the speedup increase if we had more steps?
Potential speedup = number of pipe stages

Pipelining Lessons [contd…]
• Washer takes 30 minutes
• Dryer takes 40 minutes
• “Folder” takes 20 minutes
• Question
  – Would it matter if the “Folder” also took 40 minutes?
Unbalanced lengths of pipe stages reduce speedup.

Pipelining Lessons [contd…]
Time to “fill” the pipeline and time to “drain” it reduce speedup.

Five Stages of an Instruction
• Ifetch: Instruction Fetch
  – Fetch the instruction from the Instruction Memory
• Reg/Dec: Register Fetch and Instruction Decode
• Exec: Calculate the memory address
• Mem: Read the data from the Data Memory
• Wr: Write the data back to the register file

Load:   Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
        Ifetch    Reg/Dec   Exec      Mem       Wr

Conventional Pipelined Execution Representation
[Figure: six instructions, each passing through IFetch, Dcd, Exec, Mem, WB; vertical axis: program flow; horizontal axis: time]

Example

Example [contd…]
• Time(pipelined) = Time(non-pipelined) / Number of pipe stages
  – Assumptions
    • Stages are perfectly balanced
    • Ideal conditions

Definitions
• Performance is in units of things per second
  – bigger is better
• If we are primarily concerned with response time
  – performance(X) = 1 / execution_time(X)
• "X is n times faster than Y" means
  – n = Performance(X) / Performance(Y) = Execution_time(Y) / Execution_time(X)

Example [contd…]
• Speedup in this case = 24/14 = 1.7
• Let's add 1000 more instructions
  – Time (non-pipelined) = 1000 x 8 + 24 ns = 8024 ns
  – Time (pipelined) = 1000 x 2 + 14 ns = 2014 ns
  – Speedup = 8024 / 2014 = 3.98 = 4 (approx) = 8/2
Instruction throughput is the important metric (as opposed to the latency of an individual instruction), because real programs execute billions of instructions.

Pipeline Hazards
• Structural Hazard
[Figure: six instructions, each passing through IFetch, Dcd, Exec, Mem, WB; vertical axis: program flow]

Pipeline Hazard [contd…]
• Control Hazard
• Example
  – add $4, $5, $6
  – beq $1, $2, 40
  – lw $3, 300($0)

Pipeline Hazard [contd…]
• Data Hazards
• Example
  – add $s0, $t0, $t1
  – sub $t2, $s0, $t3

Summary Pipelining Lessons
• Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Multiple tasks operate simultaneously using different resources
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” the pipeline and time to “drain” it reduce speedup
• Stall for dependences
[Figure: pipelined laundry timeline from 6 PM to 9 PM; vertical axis: task order (A, B, C, D); horizontal axis: time; intervals of 30, 40, 40, 40, 40, and 20 minutes]

Summary of Pipeline Hazards
• Structural Hazards
  – Hardware design
• Control Hazard
  – Decision based on results
• Data Hazard
  – Data
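
Worked check of the timing arithmetic

The laundry and instruction-timing figures in the slides can be checked with a few lines of Python. This is a minimal sketch, not part of the original lecture. The function names are hypothetical. It assumes identical tasks, one unit of hardware per stage, and no hazards or stalls. For the instruction example it further assumes that the slide's 24 ns and 14 ns correspond to three instructions on the five-stage pipeline, with 8 ns per non-pipelined instruction and a 2 ns pipelined clock (the 8 and 2 come from the slide's "1000 x 8" and "1000 x 2" terms).

```python
def sequential_time(stage_times, n_tasks):
    """Total time when each task finishes all stages before the next starts."""
    return n_tasks * sum(stage_times)


def pipelined_time(stage_times, n_tasks):
    """Total time when tasks overlap and stages may have different lengths.

    The first task takes the sum of all stages; every later task finishes
    one slowest-stage interval after the one before it.
    """
    return sum(stage_times) + (n_tasks - 1) * max(stage_times)


def clocked_pipeline_time(n_instructions, n_stages, cycle):
    """Total time for a clocked pipeline where every stage takes one cycle."""
    return (n_stages + n_instructions - 1) * cycle


def speedup(slow_time, fast_time):
    """How many times faster the second configuration is than the first."""
    return slow_time / fast_time


# Laundry: washer 30 min, dryer 40 min, folder 20 min, four loads.
LAUNDRY_STAGES = [30, 40, 20]
laundry_sequential = sequential_time(LAUNDRY_STAGES, 4)  # 360 min = 6 hours
laundry_pipelined = pipelined_time(LAUNDRY_STAGES, 4)    # 210 min = 3.5 hours

# Instructions (assumed values, see the note above).
INSTR_NS = 8   # non-pipelined time per instruction
CYCLE_NS = 2   # pipelined clock period
N_STAGES = 5   # Ifetch, Reg/Dec, Exec, Mem, Wr

small_nonpipelined = 3 * INSTR_NS                                  # 24 ns
small_pipelined = clocked_pipeline_time(3, N_STAGES, CYCLE_NS)     # 14 ns
large_nonpipelined = 1003 * INSTR_NS                               # 8024 ns
large_pipelined = clocked_pipeline_time(1003, N_STAGES, CYCLE_NS)  # 2014 ns

if __name__ == "__main__":
    print("laundry:", laundry_sequential, "min vs", laundry_pipelined, "min")
    print("3 instructions: speedup",
          round(speedup(small_nonpipelined, small_pipelined), 2))
    print("1003 instructions: speedup",
          round(speedup(large_nonpipelined, large_pipelined), 2))
```

The instruction speedup stays below the 8/2 = 4 bound because the fill and drain time is a fixed cost. It approaches 4 as the instruction count grows, which is why throughput is the metric that matters for long programs.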

