CS61C: Machine Structures
Lecture 24: Review Pipelined Execution
November 29, 2000
David Patterson
http://www-inst.eecs.berkeley.edu/~cs61c/
CS61C L24

Steps in Executing MIPS
  1) IFetch: Fetch Instruction, Increment PC
  2) Decode Instruction, Read Registers
  3) Execute:
       Mem-ref: Calculate Address
       Arith-log: Perform Operation
  4) Memory:
       Load: Read Data from Memory
       Store: Write Data to Memory
  5) Write Back: Write Data to Register

Pipelined Execution Representation
  Time ->
  IFtch Dcd   Exec  Mem   WB
        IFtch Dcd   Exec  Mem   WB
              IFtch Dcd   Exec  Mem   WB
                    IFtch Dcd   Exec  Mem   WB
                          IFtch Dcd   Exec  Mem   WB
                                IFtch Dcd   Exec  Mem   WB
  * Every instruction must take same number of steps, also called pipeline "stages", so some will go idle sometimes

Review: Datapath for MIPS
  [Figure: datapath with PC, instruction memory, registers (rd, rs, rt), imm, +4, ALU, and data memory]
  Stage 1: Instruction Fetch
  Stage 2: Decode / Register Read
  Stage 3: Execute
  Stage 4: Memory
  Stage 5: Write Back
  * Use datapath figure to represent pipeline:
      IFtch  Dcd  Exec  Mem  WB
      I      Reg  ALU   D    Reg

Problems for Computers
  * Limits to pipelining: Hazards prevent next instruction from executing during its designated clock cycle
  * Structural hazards: HW cannot support this combination of instructions (e.g., read instruction and data from memory)
  * Control hazards: Pipelining of branches and other instructions stall the pipeline until the hazard; "bubbles" in the pipeline
  * Data hazards: Instruction depends on result of prior instruction still in the pipeline (read and write same data)

Structural Hazard #1: Single Memory (1/2)
  [Figure: pipeline diagram, instruction order (Load, Instr 1, Instr 2, Instr 3, Instr 4) against time in clock cycles; each row passes through I, Reg, ALU, D, Reg]
  * Read same memory twice in same clock cycle

Structural Hazard #1: Single Memory (2/2)
  * Solution:
      - infeasible and inefficient to create second main memory
      - so simulate this by having two Level 1 Caches
      - have both an L1 Instruction Cache and an L1 Data Cache
      - need more complex hardware to control when both caches miss
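The pipelined execution representation and the single-memory conflict above can be sketched in a few lines of Python. This is only an illustrative model, not anything from the lecture: the function names `pipeline_schedule` and `single_memory_conflicts` are hypothetical, and it assumes one instruction issues per clock cycle, nothing stalls, and instruction labels are unique.

```python
# Stage names as used on the slides.
STAGES = ["IFtch", "Dcd", "Exec", "Mem", "WB"]

def pipeline_schedule(instrs):
    """Return {cycle: {instruction: stage}} for an ideal pipeline.

    Assumption: instruction i enters IFtch in cycle i + 1 and advances
    one stage per cycle with no stalls.
    """
    schedule = {}
    for i, name in enumerate(instrs):
        for s, stage in enumerate(STAGES):
            schedule.setdefault(i + s + 1, {})[name] = stage
    return schedule

def single_memory_conflicts(instrs, memory_instrs):
    """List (cycle, memory user, fetcher) where a load/store is in Mem
    while another instruction is in IFtch, i.e. a single shared memory
    would be accessed twice in the same clock cycle."""
    conflicts = []
    for cycle, active in sorted(pipeline_schedule(instrs).items()):
        fetching = [n for n, st in active.items() if st == "IFtch"]
        mem_users = [n for n, st in active.items()
                     if st == "Mem" and n in memory_instrs]
        if fetching and mem_users:
            conflicts.append((cycle, mem_users[0], fetching[0]))
    return conflicts

# Same instruction order as the structural hazard slide.
program = ["Load", "Instr 1", "Instr 2", "Instr 3", "Instr 4"]
conflicts = single_memory_conflicts(program, {"Load"})
for cycle, mem_user, fetcher in conflicts:
    print(f"cycle {cycle}: {mem_user} reads data while {fetcher} is fetched")
```

With separate instruction and data caches the two accesses in that cycle go to different memories, which is why the model's conflict disappears in the two-cache solution.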

Structural Hazard #2: Registers (1/2)
  [Figure: pipeline diagram, instruction order (Load, Instr 1, Instr 2, Instr 3, Instr 4) against time in clock cycles; each row passes through I, Reg, ALU, D, Reg]
  * Read and write registers simultaneously

Structural Hazard #2: Registers (2/2)
  * Solution:
      - Build registers with multiple ports, so can both read and write at the same time
  * What if read and write same register?
      - Design so that it writes in first half of clock cycle, reads in second half of clock cycle
      - Thus will read what is written, reading the new contents

Data Hazards (1/2)
  * Consider the following sequence of instructions
      add $t0, $t1, $t2
      sub $t4, $t0, $t3
      and $t5, $t0, $t6
      or  $t7, $t0, $t8
      xor $t9, $t0, $t10

Data Hazards (2/2)
  Dependencies backwards in time are hazards
  [Figure: pipeline diagram with stages IF, ID/RF, EX, MEM, WB against time in clock cycles, for the instructions add $t0,$t1,$t2; sub $t4,$t0,$t3; and $t5,$t0,$t6; or $t7,$t0,$t8; xor $t9,$t0,$t10]

Data Hazard Solution
  * "Forward" result from one stage to another
  [Figure: the same pipeline diagram (IF, ID/RF, EX, MEM, WB) for add, sub, and, or, xor, with the result forwarded between stages]
  * "or" hazard solved by register hardware

Data Hazard: Loads (1/2)
  Dependencies backwards in time are hazards
  [Figure: pipeline diagram (IF, ID/RF, EX, MEM, WB) for lw $t0,0($t1) followed by sub $t3,$t0,$t2]
  * Can't solve with forwarding
  * Must stall instruction dependent on load, then forward (more hardware)

Data Hazard: Loads (2/2)
  * Hardware must insert no-op in pipeline
  [Figure: pipeline diagram (IF, ID/RF, EX, MEM, WB) for lw $t0,0($t1); sub $t3,$t0,$t2; and $t5,$t0,$t4; or $t7,$t0,$t6, with a bubble inserted in each instruction after the load]

Administrivia: Rest of 61C
  * Rest of 61C slower pace
  F 12/1     Review: Caches/TLB/VM; Section 7.5
  M 12/4     Deadline to correct your grade record
  W 12/6     Review: Interrupts (A.7); Feedback lab
  F 12/8     61C Summary / Your Cal heritage / HKN Course Evaluation
  Sun 12/10
             Final Review, 2 PM, 155 Dwinelle
  Tues 12/12 Final, 5 PM, 1 Pimentel
  * Final: Just bring pencils; leave backpacks, cell phones, and calculators at home
  * Will check that notes are handwritten

Control Hazard: Branching (1/6)
  * Suppose we put branch decision-making hardware in ALU stage
      - then two more instructions after the branch will always be fetched, whether or not the branch is taken
  * Desired functionality of a branch
      - if we do not take the branch, don't waste any time and continue executing normally
      - if we take the branch, don't execute any instructions after the branch, just go to the desired label

Control Hazard: Branching (2/6)
  * Initial Solution: Stall until decision is made
      - insert "no-op" instructions: those that accomplish nothing, just take time
      - Drawback: branches take 3 clock cycles each (assuming comparator is put in ALU stage)

Control Hazard: Branching (3/6)
  * Optimization #1:
      - move comparator up to Stage 2
      - as soon as instruction is decoded (Opcode identifies it as a branch), immediately make a decision and set the value of the PC (if necessary)
      - Benefit: since branch is complete in Stage 2, only one unnecessary instruction is fetched, so only one no-op is needed
      - Side Note: This means that branches are idle in Stages 3, 4 and 5

Control Hazard: Branching (4/6)
  * Insert a single no-op (bubble)
  [Figure: pipeline diagram, instruction order (Beq, Add, Load) against time in clock cycles, with one bubble inserted after the branch]
  * Impact: 2 clock cycles per branch instruction => slow

Forwarding and Moving Branch Decision
  * Forwarding/bypassing currently affects Execution stage:
      - Instead of using value from register read in Decode Stage, use value from ALU output or Memory output
  * Moving branch decision from Execution Stage to Decode Stage means forwarding/bypassing must be replicated in Decode Stage for branches. I.e., code below must still work:
      addiu $s1, $s1, 4
      beq   $s1, $s2, Exit

Control Hazard: Branching (5/6)
  * Optimization #2: Redefine branches
      - Old definition: if we take the branch, none of …