COMP 411: Computer Organization
Pipelining: An Overview
Don Porter
Lecture 16
(Covered superficially on the final exam)

Pipelining

"Between 411 problem sets, I haven't had a minute to do laundry."
"Now that's what I call dirty laundry!"

Laundry Example

INPUT: dirty laundry
OUTPUT: 2 more weeks

Device: Washer
Function: Fill, Agitate, Spin
WasherPD = 30 mins

Device: Dryer
Function: Heat, Spin
DryerPD = 60 mins

Laundry: One Load at a Time

Everyone knows that the real reason one puts off doing laundry so long is not because we procrastinate, are lazy, or even have better things to do. The fact is, doing laundry one load at a time is not smart.

[Timeline figure: Step 1, Step 2]
Total = WasherPD + DryerPD = 90 mins

Laundry: Doing N Loads

Here's how one would do laundry the unpipelined way.

[Timeline figure: Step 1 through Step 4]
Total = N * (WasherPD + DryerPD) = N * 90 mins

Laundry: Doing N Loads

Here's how to pipeline the laundry process. Much more efficient!

[Timeline figure: Step 1 through Step 3]
Total = N * Max(WasherPD, DryerPD) = N * 60 mins

Actually, it's more like N * 60 + 30 if we account for the startup time (i.e., filling up the pipeline) correctly. When doing pipeline analysis, we're mostly interested in the "steady state," where we assume we have an infinite supply of inputs.

Recall Our Performance Measures

Latency: delay from input to corresponding output.
- Unpipelined laundry = 90 mins
- Pipelined laundry = 120 mins (assuming that the wash is started as soon as possible and waits, wet, in the washer until the dryer is available)

Throughput: rate at which inputs or outputs are processed.
- Unpipelined laundry = 1/90 outputs/min
- Pipelined laundry = 1/60 outputs/min

Even though we increase latency, it takes less time per load.

Pipelining Summary

Advantages:
- Higher throughput than combinational system
- Different parts of the logic work on different parts of the problem

Disadvantages:
- Generally increases latency
- Only as good as the weakest link, often called the pipeline's BOTTLENECK (recall Amdahl's Law)

Review of CPU Performance

MIPS = Freq / CPI

MIPS = Millions of Instr/Second
Freq = Clock Frequency, MHz
CPI = Clocks per Instruction

To increase MIPS:
1. DECREASE CPI.
   - RISC simplicity reduces CPI to 1.0.
   - CPI below 1.0? State of the art: multiple instruction issue.
2. INCREASE Freq.
   - Freq is limited by delay along the longest combinational path; hence PIPELINING is the key to improving performance.

Where Are the Bottlenecks?

Pipelining goal: break LONG combinational paths by putting the memories and the ALU in separate stages.

[Figure: unpipelined miniMIPS datapath (PC, Instruction Memory, Control Logic, Register File, ALU, Data Memory) with the stage boundaries IF, ID/RF, ALU, MEM, WB marked.]

Goal: 5-Stage Pipeline

GOAL: Maintain (nearly) 1.0 CPI, but increase clock speed to barely include slowest components (mems, regfile, ALU).

APPROACH: structure processor as 5-stage pipeline:
- IF, Instruction Fetch stage: maintains PC, fetches one instruction per cycle and passes it to
- ID/RF, Instruction Decode/Register File stage: decodes control lines and selects source operands
- ALU, ALU stage: performs specified operation, passes result to
- MEM, Memory stage: if it's a lw, use ALU result as an address; pass mem data (or ALU result if not lw) to
- WB, Write-Back stage: writes result back into register file.

5-Stage miniMIPS (omits some details)
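The laundry numbers and the "slowest component sets the clock" goal above follow the same arithmetic, which a short calculation can check. This is a minimal sketch, not part of the lecture: the function names `pipeline_metrics` and `total_time` are hypothetical, and the model assumes identical items flowing through the stages in order, with each stage handling one item at a time.

```python
from fractions import Fraction


def pipeline_metrics(stage_delays):
    """Latency and steady-state throughput for a chain of stages.

    stage_delays: integer delay of each stage, in the same time unit.
    Pipelined figures assume every stage is paced by the slowest one.
    """
    total = sum(stage_delays)
    slowest = max(stage_delays)
    return {
        "unpipelined_latency": total,
        "unpipelined_throughput": Fraction(1, total),
        "pipelined_latency": len(stage_delays) * slowest,
        "pipelined_throughput": Fraction(1, slowest),
    }


def total_time(n_items, stage_delays, pipelined):
    """Time to finish n_items, including the pipeline fill time."""
    if n_items == 0:
        return 0
    total = sum(stage_delays)
    if not pipelined:
        return n_items * total
    # First item takes the full path; each later item finishes one
    # bottleneck-delay after the previous one.
    return total + (n_items - 1) * max(stage_delays)


LAUNDRY = [30, 60]  # WasherPD, DryerPD in minutes (from the slides)
metrics = pipeline_metrics(LAUNDRY)
print(metrics["unpipelined_latency"], metrics["pipelined_latency"])        # 90 120
print(metrics["unpipelined_throughput"], metrics["pipelined_throughput"])  # 1/90 1/60
print(total_time(10, LAUNDRY, pipelined=False))  # 900, i.e. N * 90
print(total_time(10, LAUNDRY, pipelined=True))   # 630, i.e. N * 60 + 30
```

The pipelined total reproduces the N * 60 + 30 figure from the laundry slide; for large N the + 30 startup term becomes negligible, which is why the steady-state analysis uses N * 60.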
[Figure: 5-stage pipelined miniMIPS datapath. Pipeline registers (PCREG, IRREG; PCALU, IRALU, A, B, WDALU; PCMEM, IRMEM, YMEM, WDMEM; PCWB, IRWB, YWB) separate the stages Instruction Fetch, Register File, ALU, Memory, and Write Back (IF, ID/RF, ALU, MEM, WB).]

Pipelining

Improve performance by increasing instruction throughput.

[Figure: execution timelines for lw $1, 100($0); lw $2, 200($0); lw $3, 300($0), each passing through Instruction fetch, Reg, ALU, Data access, Reg. Unpipelined: each instruction takes 8 ns and the next one starts only after it finishes. Pipelined: a new instruction starts every 2 ns.]

Ideal speedup is number of stages in pipeline. Do we achieve this?

Pipelining

What makes it easy?
- all instructions are the same length
- just a few instruction formats
- memory operands appear only in loads and stores

What makes it hard?
- structural hazards: suppose we had only one memory
- control hazards: need to worry about branch instructions
- data hazards: an instruction depends on a previous instruction

Net effect: individual instructions still take the same number of cycles, but throughput improves because more instructions execute simultaneously.

Data Hazards

Problem with starting next instruction before first is finished: dependencies that "go backward in time" are data hazards.

[Figure: pipeline diagram over clock cycles CC 1 to CC 9; each instruction passes through IM, Reg, ALU, DM, Reg. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9.]

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

Software Solution

Have compiler guarantee no hazards.
Where do we insert the "nops"?
Between producing …
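As a rough illustration of the software approach, the sketch below pads the instruction sequence from the data-hazard slide with nops. It is not the lecture's method: the preview is cut off before the rule is stated, so the required separation (`min_gap=2` intervening instructions) is an assumption, chosen to match a register file that is written early in a cycle and read later in the same cycle, as the 10/-20 entry for CC 5 suggests. The function `insert_nops` and the tuple layout are hypothetical.

```python
NOP = ("nop", None, ())


def insert_nops(program, min_gap=2):
    """Pad a program with nops to separate register producers and consumers.

    program: list of (text, dest_register_or_None, source_registers).
    min_gap: ASSUMED number of instructions that must sit between a
             write to a register and the next read of it.
    """
    out = []
    last_write = {}  # register -> index in `out` of its most recent writer
    for text, dest, sources in program:
        for reg in sources:
            if reg in last_write:
                earliest = last_write[reg] + min_gap + 1
                while len(out) < earliest:
                    out.append(NOP)
        if dest is not None:
            last_write[dest] = len(out)
        out.append((text, dest, sources))
    return out


program = [
    ("sub $2, $1, $3", "$2", ("$1", "$3")),
    ("and $12, $2, $5", "$12", ("$2", "$5")),
    ("or $13, $6, $2", "$13", ("$6", "$2")),
    ("add $14, $2, $2", "$14", ("$2", "$2")),
    ("sw $15, 100($2)", None, ("$15", "$2")),
]

for text, _, _ in insert_nops(program):
    print(text)
```

Under that assumption, only `and` and `or` read $2 too soon after `sub`, so two nops directly after `sub` are enough; `add` and `sw` are already far enough away. A different register-file timing would need a different `min_gap`.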


UNC-Chapel Hill COMP 411 - Pipelining: An Overview
