DOC PREVIEW
U of I CS 232 - Lecture notes

This preview shows page 1-2-3-27-28-29 out of 29 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

October 8, 2007 ©2003 Craig Zilles (derived from slidesby Howard Huang and David Patterson)1A relevant question Assuming you’ve got:— One washer (takes 30 minutes)— One drier (takes 40 minutes)— One “folder” (takes 20 minutes) It takes 90 minutes to wash, dry, and fold 1 load of laundry.— How long does 4 loads take?October 8, 2007 Pipelining 2The slow way If each load is done sequentially it takes 6 hours30 40 20 30 40 20 30 40 20 30 40 206 PM7 8 91011MidnightTimeOctober 8, 2007 Pipelining 3Laundry Pipelining Start each load as soon as possible— Overlap loads Pipelined laundry takes 3.5 hours6 PM7 8 91011MidnightTime30 40 40 40 40 20October 8, 2007 Pipelining 4Pipelining Lessons Pipelining doesn’t help latency ofsingle load, it helps throughput ofentire workload Pipeline rate limited by slowestpipeline stage Multiple tasks operatingsimultaneously using differentresources Potential speedup = Number pipestages Unbalanced lengths of pipe stagesreduces speedup Time to “fill” pipeline and time to“drain” it reduces speedup6 PM7 8 9Time30 40 40 40 40 20March 17, 2003 Pipelining 5Pipelining is not just Multiprocessing Pipelining does involve parallel processing, but in a specific way. Both multiprocessing and pipelining relate to the processing of multiple“things” using multiple “functional units”— Multiprocessing implies each thing is processed entirely by a singlefunctional unit• e.g., multiple lanes at the supermarket— In pipelining, each thing is broken into a sequence of pieces, whereeach piece is handled by a different (specialized) functional unit.• Supermarket analogy? Pipelining and multiprocessing are not mutually exclusive— Modern processors do both, with multiple pipelines (e.g., superscalar)October 8, 2007 Pipelining 6Pipelining Pipelining is a general-purpose efficiency technique— It is not specific to processors Pipelining is used in:— Assembly lines— Bucket brigades— Fast food restaurants Pipelining is used in other CS disciplines:— Networking— Server software architecture Useful to increase throughput in the presence of long latency— More on that later…October 8, 2007 Pipelining 7Instruction execution review Executing a MIPS instruction can take up to five steps. However, as we saw, not all instructions need all five steps.Store a result in the destination register.WBWritebackRead or write the data memory.MEMMemoryCompute an R-type result or a branch outcome.EXExecuteRead source registers and generate control signals.IDInstruction DecodeRead an instruction from memory.IFInstruction FetchDescriptionNameStepWBMEMEXIDIFlwMEMEXIDIFswWBEXIDIFR-typeEXIDIFbeqSteps requiredInstructionOctober 8, 2007 Pipelining 8Single-cycle datapath diagram4Shiftleft 2PCAddAdd0Mux1PCSrcReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegReadaddressInstructionmemoryInstruction[31-0]I [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWriteSignextend0Mux1ALUSrcResultZeroALUALUOp2ns2ns2ns1ns How long does it take to execute each instruction?October 8, 2007 Pipelining 9Single-cycle review All five execution steps occur in one clock cycle. This means the cycle time must be long enough to accommodate all thesteps of the most complex instruction—a “lw” in our instruction set.— If the register file has a 1ns latency and the memories and ALU have a2ns latency, “lw” will require 8ns.— Thus all instructions will take 8ns to execute. Each hardware element can only be used once per clock cycle.— A “lw” or “sw” must access memory twice (in the IF and MEM stages),so there are separate instruction and data memories.— There are multiple adders, since each instruction increments the PC(IF) and performs another computation (EX). On top of that, branchesalso need to compute a target address.October 8, 2007 Pipelining 10Example: Instruction Fetch (IF)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWrite Let’s quickly review how lw is executed in the single-cycle datapath. We’ll ignore PC incrementing and branching for now. In the Instruction Fetch (IF) step, we read the instruction memory.October 8, 2007 Pipelining 11Instruction Decode (ID)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWrite The Instruction Decode (ID) step reads the source register from theregister file.October 8, 2007 Pipelining 12Execute (EX)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWrite The third step, Execute (EX), computes the effective memory addressfrom the source register and the instruction’s constant field.October 8, 2007 Pipelining 13Memory (MEM)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWrite The Memory (MEM) step involves reading the data memory, from theaddress computed by the ALU.October 8, 2007 Pipelining 14Writeback (WB)ReadaddressInstructionmemoryInstruction[31-0]ReadaddressWriteaddressWritedataDatamemoryReaddataMemWriteMemRead1Mux0MemToRegSignextend0Mux1ALUSrcResultZeroALUALUOpI [15 - 0]I [25 - 21]I [20 - 16]I [15 - 11]0Mux1RegDstReadregister 1Readregister 2WriteregisterWritedataReaddata 2Readdata 1RegistersRegWrite Finally, in the Writeback (WB) step, the memory value is stored into thedestination register.October 8, 2007 Pipelining 15A bunch of lazy functional units Notice that each execution step uses a different functional unit. In


View Full Document

U of I CS 232 - Lecture notes

Documents in this Course
Goal

Goal

2 pages

Exam 1

Exam 1

5 pages

Exam 1

Exam 1

6 pages

Exam 2

Exam 2

6 pages

Exam 1

Exam 1

5 pages

Load more
Download Lecture notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?