CS 152 Computer Architecture and Engineering
Lecture 4 - Pipelining

Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152

January 28, 2010, CS152 Spring 2010


Last time in Lecture 3

- Microcoding became less attractive as the gap between RAM and ROM speeds reduced.
- Complex instruction sets are difficult to pipeline, so it was difficult to increase performance as gate count grew.
- The Iron Law explains the architecture design space:
  trade instructions/program, cycles/instruction, and time/cycle.
- Load-store RISC ISAs were designed for efficient pipelined implementations:
  - very similar to vertical microcode
  - inspired by earlier Cray machines (more on these later)


An Ideal Pipeline

stage 1 -> stage 2 -> stage 3 -> stage 4

- All objects go through the same stages.
- No sharing of resources between any two stages.
- Propagation delay through all pipeline stages is equal.
- The scheduling of an object entering the pipeline is not affected by the objects in other stages.

These conditions generally hold for industrial assembly lines, but instructions depend on each other!


Pipelined MIPS

To pipeline MIPS:
- First build MIPS without pipelining, with CPI = 1.
- Next, add pipeline registers to reduce cycle time while maintaining CPI = 1.


Lecture 3: Unpipelined Datapath for MIPS

[Figure: unpipelined MIPS datapath. Blocks: PC, two adders (one adding 0x4), Inst. Memory, GPRs (ports rs1, rs2, rd1, rd2, ws, wd, we), Imm Ext, ALU (output z), ALU Control, Data Memory (addr, rdata, wdata, we). Control signals: PCSrc (br / rind / jabs / pc+4), RegWrite, MemWrite, WBSrc, RegDst, ExtSel, OpSel, BSrc. Inputs to control: OpCode, zero.]


Lecture 3: Hardwired Control Table

Opcode     ExtSel   BSrc  OpSel  MemW  RegW  WBSrc  RegDst  PCSrc
ALU        *        Reg   Func   no    yes   ALU    rd      pc+4
ALUi       sExt16   Imm   Op     no    yes   ALU    rt      pc+4
ALUiu      uExt16   Imm   Op     no    yes   ALU    rt      pc+4
LW         sExt16   Imm   +      no    yes   Mem    rt      pc+4
SW         sExt16   Imm   +      yes   no    *      *       pc+4
BEQZ z=0   sExt16   *     0?     no    no    *      *       br
BEQZ z=1   sExt16   *     0?     no    no    *      *       pc+4
J          *        *     *      no    no    *      *       jabs
JAL        *        *     *      no    yes   PC     R31     jabs
JR         *        *     *      no    no    *      *       rind
JALR       *        *     *      no    yes   PC     R31     rind

(* = don't care)

BSrc   = Reg / Imm
WBSrc  = ALU / Mem / PC
RegDst = rt / rd / R31
PCSrc  = pc+4 / br / rind / jabs


Pipelined Datapath

[Figure: datapath divided into five phases: fetch phase, decode & Reg-fetch phase, execute phase, memory phase, write-back phase. Blocks: PC, adder (0x4), Inst. Memory, IR, GPRs, Imm Ext, ALU, Data Memory.]

Clock period can be reduced by dividing the execution of an instruction into multiple cycles:

    t_C > max{t_IM, t_RF, t_ALU, t_DM, t_RW}   (= t_DM, probably)

However, CPI will increase unless instructions are pipelined.


"Iron Law" of Processor Performance

    Time / Program = (Instructions / Program) * (Cycles / Instruction) * (Time / Cycle)

- Instructions per program depends on source code, compiler technology, and ISA.
- Cycles per instruction (CPI) depends upon the ISA and the microarchitecture.
- Time per cycle depends upon the microarchitecture and the base technology.

             Microarchitecture           CPI   cycle time
Lecture 2    Microcoded                  >1    short
Lecture 3    Single-cycle unpipelined    1     long
Lecture 4    Pipelined                   1     short


CPI Examples

Microcoded machine:
    Inst 1: 7 cycles, Inst 2: 5 cycles, Inst 3: 10 cycles
    3 instructions, 22 cycles, CPI = 7.33

Unpipelined machine:
    Inst 1, Inst 2, Inst 3
    3 instructions, 3 cycles, CPI = 1

Pipelined machine:
    Inst 1, Inst 2, Inst 3 (overlapped in time)
    3 instructions, 3 cycles, CPI = 1


Technology Assumptions

- A small amount of very fast memory (caches) backed up by a large, slower memory
- Fast ALU (at least for integers)
- Multiported register files (slower!)

Thus, the following timing assumption is reasonable:

    t_IM ≈ t_RF ≈ t_ALU ≈ t_DM ≈ t_RW

A 5-stage pipeline will be the focus of our detailed design. Some commercial designs have over 30 pipeline stages to do an integer add!


5-Stage Pipelined Execution

[Figure: pipelined datapath with stages I-Fetch (IF), Decode / Reg. Fetch (ID), Execute (EX), Memory (MA), Write-Back (WB).]

time           t0   t1   t2   t3   t4   t5   t6   t7   ...
instruction1   IF1  ID1  EX1  MA1  WB1
instruction2        IF2  ID2  EX2  MA2  WB2
instruction3             IF3  ID3  EX3  MA3  WB3
instruction4                  IF4  ID4  EX4  MA4  WB4
instruction5                       IF5  ID5  EX5  MA5  WB5


5-Stage Pipelined Execution: Resource Usage Diagram

[Figure: same pipelined datapath, stages IF, ID, EX, MA, WB.]

Resources
time   t0   t1   t2   t3   t4   t5   t6   t7   ...
IF     I1   I2   I3   I4   I5
ID          I1   I2   I3   I4   I5
EX               I1   I2   I3   I4   I5
MA                    I1   I2   I3   I4   I5
WB                         I1   I2   I3   I4   I5


Pipelined Execution: ALU Instructions

[Figure: pipelined datapath with pipeline registers IR, A, B, Y, R, MD1, MD2.]

Not quite correct! We need an Instruction Reg (IR) for each stage.


Pipelined MIPS Datapath (without jumps)

[Figure: pipelined datapath with stages F, D, E, M, W and an IR per stage. Control points shown: RegDst, RegWrite, OpSel, MemWrite, ExtSel, WBSrc, BSrc.]

Control points need to be connected.


Instructions interact with each other in pipeline

- An instruction in the pipeline may need a resource being used by another instruction in the pipeline: structural hazard.
- An instruction may depend on something produced by an earlier instruction:
  - Dependence may be for a data value: data hazard.
  - Dependence may be for the next instruction's address: control hazard (branches, exceptions).


Resolving Structural Hazards

- Structural hazards occur when two instructions need the same hardware resource at the same time.
  - Can be resolved in hardware by stalling the newer instruction until the older instruction is finished with the resource.
- A structural hazard can always be avoided by adding more hardware to the design.
  - E.g., if two instructions both need a port to memory at the same time, the hazard could be avoided by adding a second port to memory.
- Our 5-stage pipe has no structural hazards by design.
  - Thanks to the MIPS ISA, which was designed for pipelining.


Data Hazards

[Figure: pipelined datapath with two instructions in flight.]

    r1 <- r0 + 10
    r4 <- r1 + 17

r1 is stale. Oops!

…
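
The timing diagram from the 5-Stage Pipelined Execution slide and the stale-r1 problem from the Data Hazards slide can be tied together with a small sketch. It is not part of the lecture: the function names (stage_cycle, schedule, format_schedule, reads_stale_value) are hypothetical, and it assumes an ideal pipeline with no stalls and no bypassing, in which a value written in WB only becomes visible to an ID stage in a later cycle.

```python
# Sketch of an ideal 5-stage pipeline schedule (no stalls, no bypassing).
# All names here are illustrative assumptions, not taken from the lecture.

STAGES = ["IF", "ID", "EX", "MA", "WB"]


def stage_cycle(instr_index, stage):
    """Cycle in which 0-based instruction `instr_index` occupies `stage`."""
    return instr_index + STAGES.index(stage)


def schedule(num_instructions):
    """Return one row per instruction; each row has one cell per cycle."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for stage in STAGES:
            row[stage_cycle(i, stage)] = f"{stage}{i + 1}"
        rows.append(row)
    return rows


def format_schedule(rows):
    """Render the rows as a fixed-width text diagram with a t0, t1, ... header."""
    width = 5
    header = "".join(f"t{c}".ljust(width) for c in range(len(rows[0])))
    lines = [header.rstrip()]
    for row in rows:
        lines.append("".join(cell.ljust(width) for cell in row).rstrip())
    return "\n".join(lines)


def reads_stale_value(producer_index, consumer_index):
    """True if the consumer's ID stage runs no later than the producer's WB.

    Assumption: a register written in WB is visible only to later cycles.
    """
    return stage_cycle(consumer_index, "ID") <= stage_cycle(producer_index, "WB")


if __name__ == "__main__":
    print(format_schedule(schedule(5)))
    # Instruction 1 writes r1 in WB; instruction 2 reads r1 in ID two cycles earlier.
    print("instruction 2 reads stale r1:", reads_stale_value(0, 1))
```

Under these assumptions, the second instruction decodes at t2 while the first does not write back until t4, which is why the r1 it reads is stale.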

