CS 152 Computer Architecture and Engineering
Lecture 4 - Pipelining

Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152

2/3/2009, CS152 Spring 09


Last time in Lecture 3

- Microcoding became less attractive as the gap between RAM and ROM speeds reduced
- Complex instruction sets are difficult to pipeline, so it was difficult to increase performance as gate count grew
- The Iron Law explains the architecture design space
  - Trade instructions/program, cycles/instruction, and time/cycle
- Load-Store RISC ISAs were designed for efficient pipelined implementations
  - Very similar to vertical microcode
  - Inspired by earlier Cray machines


Iron Law of Processor Performance

\[
\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}
\]

- Instructions per program depends on source code, compiler technology, and ISA
- Cycles per instruction (CPI) depends upon the ISA and the microarchitecture
- Time per cycle depends upon the microarchitecture and the base technology

  Microarchitecture             CPI   cycle time
  Microcoded                    >1    short
  Single-cycle unpipelined      1     long
  Pipelined (this lecture)      1     short


An Ideal Pipeline

[Figure: stage 1 -> stage 2 -> stage 3 -> stage 4]

- All objects go through the same stages
- No sharing of resources between any two stages
- Propagation delay through all pipeline stages is equal
- The scheduling of an object entering the pipeline is not affected by the objects in other stages

These conditions generally hold for industrial assembly lines. But can an instruction pipeline satisfy the last condition?


Pipelined MIPS

To pipeline MIPS:
- First build MIPS without pipelining, with CPI = 1
- Next, add pipeline registers to reduce cycle time while maintaining CPI = 1


Pipelined Datapath

[Figure: datapath with PC and 0x4 adder, instruction memory, IR, GPRs, immediate extender, ALU and data memory, divided into fetch phase, decode & register-fetch phase, execute phase, memory phase and write-back phase]

Clock period can be reduced by dividing the execution of an instruction into multiple cycles:

\[
t_C > \max\{t_{IM},\ t_{RF},\ t_{ALU},\ t_{DM},\ t_{RW}\} \quad (= t_{DM} \text{ probably})
\]

However, CPI will increase unless instructions are pipelined.


Technology Assumptions

- A small amount of very fast memory (caches) backed up by a large, slower memory
- Fast ALU (at least for integers)
- Multiported register files (slower!)

Thus, the following timing assumption is reasonable:

\[
t_{IM} \approx t_{RF} \approx t_{ALU} \approx t_{DM} \approx t_{RW}
\]

A 5-stage pipelined Harvard architecture will be the focus of our detailed design.


5-Stage Pipelined Execution

[Figure: datapath divided into I-Fetch (IF), Decode & Reg. Fetch (ID), Execute (EX), Memory (MA) and Write-Back (WB)]

  time           t0   t1   t2   t3   t4   t5   t6   t7   ...
  instruction1   IF1  ID1  EX1  MA1  WB1
  instruction2        IF2  ID2  EX2  MA2  WB2
  instruction3             IF3  ID3  EX3  MA3  WB3
  instruction4                  IF4  ID4  EX4  MA4  WB4
  instruction5                       IF5  ID5  EX5  MA5  WB5


5-Stage Pipelined Execution: Resource Usage Diagram

[Figure: same 5-stage datapath as above]

  time   t0   t1   t2   t3   t4   t5   t6   t7   ...
  IF     I1   I2   I3   I4   I5
  ID          I1   I2   I3   I4   I5
  EX               I1   I2   I3   I4   I5
  MA                    I1   I2   I3   I4   I5
  WB                         I1   I2   I3   I4   I5


Pipelined Execution: ALU Instructions

[Figure: datapath with pipeline registers A, B, Y, R, MD1 and MD2 between stages, and a single IR]

Not quite correct! We need an Instruction Register (IR) for each stage.


Pipelined MIPS Datapath (without jumps)

[Figure: datapath with stages F, D, E, M, W and an IR per stage; control points RegDst, RegWrite, OpSel, MemWrite, ExtSel, WBSrc and BSrc]

Control points need to be connected.


How Instructions Can Interact with Each Other in a Pipeline

- An instruction in the pipeline may need a resource being used by another instruction in the pipeline: structural hazard
- An instruction may depend on something produced by an earlier instruction
  - Dependence may be for a data value: data hazard
  - Dependence may be for the next instruction's address: control hazard (branches, exceptions)


Data Hazards

[Figure: pipelined datapath with "r1 ← ..." in the execute stage and "r4 ← r1 ..." in the decode stage]

  ...
  r1 ← r0 + 10
  r4 ← r1 + 17
  ...

r1 is stale. Oops!


CS152 Administrivia

- PS 1 due Tuesday Feb 10 in class
- Section covering PS 1 on Wednesday Feb 11, room/time TBD
- First quiz on Thursday Feb 12
  - In class, closed book, no computers or calculators
  - Covers lectures 1-5 (this week's lectures)
- Lecture 7, Tuesday Feb 17, in 320 Soda
- Lecture 8, Thursday Feb 19, back in 306 Soda
- See website for full schedule


Resolving Data Hazards (1)

Strategy 1: Wait for the result to be available by freezing earlier pipeline stages (interlocks).


Feedback to Resolve Hazards

[Figure: stage 1 -> stage 2 -> stage 3 -> stage 4, with feedback paths FB1 to FB4 from later stages to earlier stages]

- Later stages provide dependence information to earlier stages, which can stall (or kill) instructions
- Controlling a pipeline in this manner works provided the instruction at stage i+1 can complete without any interference from instructions in stages 1 to i (otherwise deadlocks may occur)


Interlocks to Resolve Data Hazards

[Figure: pipelined datapath with a stall-condition block that holds PC and the decode-stage IR and inserts a nop into the execute stage]

  ...
  r1 ← r0 + 10
  r4 ← r1 + 17
  ...


Stalled Stages and Pipeline Bubbles

  time                  t0   t1   t2   t3   t4   t5   t6   t7   ...
  (I1) r1 ← r0 + 10     IF1  ID1  EX1  MA1  WB1
  (I2) r4 ← r1 + 17          IF2  ID2  ID2  ID2  ID2  EX2  MA2  WB2
  (I3)                            IF3  IF3  IF3  IF3  ID3  EX3  MA3  WB3
  (I4)                                                IF4  ID4  EX4  MA4  WB4
  (I5)                                                     IF5  ID5  EX5  MA5  WB5

The repeated ID2 and IF3 entries (t3 to t5) are the stalled stages.

Resource usage:

  time   t0   t1   t2   t3   t4   t5   t6   t7   ...
  IF     I1   I2   I3   I3   I3   I3   I4   I5
  ID          I1   I2   I2   I2   I2   I3   I4   I5
  EX               I1   nop  nop  nop  I2   I3   I4   I5
  MA                    I1   nop  nop  nop  I2   I3   I4   I5
  WB                         I1   nop  nop  nop  I2   I3   I4   I5

nop = pipeline bubble

…
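The stall behaviour in the "Stalled Stages and Pipeline Bubbles" diagram can be reproduced with a small simulation. This is a minimal sketch, not the course's reference model. It assumes the interlock holds an instruction in ID while any of its source registers is the destination of an instruction still in EX, MA or WB, which is the condition that yields the four ID2 cycles on the slide. The `Instr` type, the `simulate` function and the instructions I3 to I5 (taken to be independent of r1 and r4) are hypothetical names and values chosen for illustration.

```python
from typing import NamedTuple, Optional, Tuple

NOP = "nop"  # marks a pipeline bubble


class Instr(NamedTuple):
    name: str
    dest: Optional[str]     # register written, or None
    srcs: Tuple[str, ...]   # registers read


def simulate(program):
    """Return one row per cycle: occupants of [IF, ID, EX, MA, WB].

    Each entry is an instruction name, 'nop' (bubble) or None (empty slot).
    """
    pending = list(program)
    slots = [pending.pop(0) if pending else None, None, None, None, None]
    history = []
    while any(isinstance(s, Instr) for s in slots):
        history.append([s.name if isinstance(s, Instr) else s for s in slots])
        fetch, decode, execute, memory, writeback = slots
        # Destinations of instructions that have not finished write-back.
        busy = {s.dest for s in (execute, memory, writeback)
                if isinstance(s, Instr)}
        # Assumed stall condition: a source of the ID instruction is busy.
        # r0 is the hard-wired zero register in MIPS, so it never stalls.
        stall = isinstance(decode, Instr) and any(
            r != "r0" and r in busy for r in decode.srcs)
        if stall:
            # Freeze IF and ID, inject a bubble into EX, drain later stages.
            slots = [fetch, decode, NOP, execute, memory]
        else:
            nxt = pending.pop(0) if pending else None
            slots = [nxt, fetch, decode, execute, memory]
    return history


# I1 and I2 come from the slide; I3..I5 are hypothetical independent ops.
PROGRAM = [
    Instr("I1", "r1", ("r0",)),   # r1 <- r0 + 10
    Instr("I2", "r4", ("r1",)),   # r4 <- r1 + 17
    Instr("I3", "r5", ("r0",)),
    Instr("I4", "r6", ("r0",)),
    Instr("I5", "r7", ("r0",)),
]

if __name__ == "__main__":
    for t, row in enumerate(simulate(PROGRAM)):
        print(f"t{t}", ["--" if s is None else s for s in row])
```

Each printed row is one column of the resource usage table. Under the stated stall assumption the three bubbles appear in EX at t3 to t5, and the five instructions need three more cycles than the unstalled pipeline.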