DOC PREVIEW
U of I CS 232 - Lecture notes

This preview shows page 1-2-3-24-25-26 out of 26 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

A relevant questionThe slow wayLaundry PipeliningPipelining LessonsPipelining is not just MultiprocessingPipeliningInstruction execution reviewSingle-cycle datapath diagramExample: Instruction Fetch (IF)Instruction Decode (ID)Execute (EX)Memory (MEM)Writeback (WB)A bunch of lazy functional unitsPutting those slackers to workDecoding and fetching togetherExecuting, decoding and fetchingMaking Pipelining WorkBreak datapath into 5 stagesPipelining LoadsPipelining PerformancePipeline Datapath: Resource RequirementsPipelining other instruction typesImportant ObservationA solution: Insert NOP stagesSummaryJanuary 14, 2019 ©2003 Craig Zilles (derived from slides by Howard Huang and David Patterson)1A relevant questionAssuming you’ve got:—One washer (takes 30 minutes)—One drier (takes 40 minutes)—One “folder” (takes 20 minutes)It takes 90 minutes to wash, dry, and fold 1 load of laundry.—How long does 4 loads take?January 14, 2019 Pipelining 2The slow wayIf each load is done sequentially it takes 6 hours30 40 20 30 40 20 30 40 20 30 40 206 PM7 8 91011MidnightTimeJanuary 14, 2019 Pipelining 3Laundry PipeliningStart each load as soon as possible—Overlap loadsPipelined laundry takes 3.5 hours6 PM7 8 91011MidnightTime30 40 40 40 40 20January 14, 2019 Pipelining 4Pipelining LessonsPipelining doesn’t help latency of single load, it helps throughput of entire workloadPipeline rate limited by slowest pipeline stageMultiple tasks operating simultaneously using different resourcesPotential speedup = Number pipe stagesUnbalanced lengths of pipe stages reduces speedupTime to “fill” pipeline and time to “drain” it reduces speedup6 PM7 8 9Time30 40 40 40 40 20March 17, 2003 Pipelining 5Pipelining is not just MultiprocessingPipelining does involve parallel processing, but in a specific way.Both multiprocessing and pipelining relate to the processing of multiple “things” using multiple “functional units” —Multiprocessing implies each thing is processed entirely by a single functional unit•e.g., multiple lanes at the supermarket—In pipelining, each thing is broken into a sequence of pieces, where each piece is handled by a different (specialized) functional unit.•Supermarket analogy?Pipelining and multiprocessing are not mutually exclusive—Modern processors do both, with multiple pipelines (e.g., superscalar)January 14, 2019 Pipelining 6PipeliningPipelining is a general-purpose efficiency technique—It is not specific to processorsPipelining is used in:—Assembly lines—Bucket brigades—Fast food restaurantsPipelining is used in other CS disciplines:—Networking—Server software architectureUseful to increase throughput in the presence of long latency—More on that later…January 14, 2019 Pipelining 7Instruction execution reviewExecuting a MIPS instruction can take up to five steps.However, as we saw, not all instructions need all five steps.Step NameDescriptionInstruction FetchIF Read an instruction from memory.Instruction DecodeID Read source registers and generate control signals.Execute EX Compute an R-type result or a branch outcome.Memory MEM Read or write the data memory.Writeback WB Store a result in the destination register.InstructionSteps requiredbeq IF ID EXR-type IF ID EX WBsw IF ID EX MEMlw IF ID EX MEM WBJanuary 14, 2019 Pipelining 8Single-cycle datapath diagram4Shiftleft 2PCAddAdd0Mux1PCSrcReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegReadaddressInstructionmemoryInstruction[31-0]I [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteSignextend0Mux1ALUSrcResultZeroALUALUOp2ns2ns2ns1nsHow long does it take to execute each instruction?January 14, 2019 Pipelining 10Example: Instruction Fetch (IF)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteLet’s quickly review how lw is executed in the single-cycle datapath.We’ll ignore PC incrementing and branching for now.In the Instruction Fetch (IF) step, we read the instruction memory.January 14, 2019 Pipelining 11Instruction Decode (ID)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteThe Instruction Decode (ID) step reads the source register from the register file.January 14, 2019 Pipelining 12Execute (EX)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteThe third step, Execute (EX), computes the effective memory address from the source register and the instruction’s constant field.January 14, 2019 Pipelining 13Memory (MEM)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteThe Memory (MEM) step involves reading the data memory, from the address computed by the ALU.January 14, 2019 Pipelining 14Writeback (WB)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteFinally, in the Writeback (WB) step, the memory value is stored into the destination register.January 14, 2019 Pipelining 15A bunch of lazy functional unitsNotice that each execution step uses a different functional unit. In other words, the main units are idle for most of the 8ns cycle!—The instruction RAM is used for just 2ns at the start of the cycle.—Registers are read once in ID


View Full Document

U of I CS 232 - Lecture notes

Documents in this Course
Goal

Goal

2 pages

Exam 1

Exam 1

5 pages

Exam 1

Exam 1

6 pages

Exam 2

Exam 2

6 pages

Exam 1

Exam 1

5 pages

Load more
Download Lecture notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?