Berkeley COMPSCI 152 - Lecture 18

CS152 Computer Architecture and Engineering
Lecture 18: Dynamic Scheduling (Cont), Speculation, and ILP
November 2, 2001
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
Lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
11/02/01, UCB Fall 2001, CS152 / Kubiatowicz


The Big Picture: Where are We Now?

The Five Classic Components of a Computer:
  - Processor (Control, Datapath)
  - Memory
  - Input
  - Output

Today's Topics:
  - Recap last lecture
  - Hardware loop unrolling with Tomasulo algorithm
  - Administrivia
  - Speculation, branch prediction
  - Reorder buffers


Review: Scoreboard Architecture (CDC 6600)

[Diagram labels: Registers; Functional Units (FP Mult, FP Mult, FP Divide, FP Add, Integer); Memory; SCOREBOARD]


Review: Four Stages of Scoreboard Control

Issue: decode instructions and check for structural hazards
  - Instructions issued in program order (for hazard checking)
  - Don't issue if structural hazard
  - Don't issue if instruction is output dependent on any previously issued but uncompleted instruction (no WAW hazards)

Read operands: wait until no data hazards, then read operands
  - All real dependencies (RAW hazards) resolved in this stage
  - No forwarding of data in this model

Execution: operate on operands (EX)
  - The functional unit begins execution upon receiving operands
  - When the result is ready, it notifies the scoreboard that it has completed execution

Write result: finish execution (WB)
  - Stall until no WAR hazards with previous instructions
  - Example:
        DIVD  F0,  F2, F4
        ADDD  F10, F0, F8
        SUBD  F8,  F8, F14
    CDC 6600 scoreboard would stall SUBD until ADDD reads operands


Review: Tomasulo Organization

[Diagram labels: FP Op Queue; FP Registers; From Mem; Load Buffers (Load1 to Load6); Store Buffers; To Mem; Reservation Stations (Add1, Add2, Add3, Mult1, Mult2); FP adders; FP multipliers; Common Data Bus (CDB)]


Recall: Reservation Station Components

Op: Operation to perform in the unit (e.g., + or -)
Vj, Vk: Value of source operands
  - Store buffers have a V field: result to be stored
Qj, Qk: Reservation stations producing source registers (value to be written)
  - Note: no ready flags as in Scoreboard; Qj, Qk = 0 means ready
  - Store buffers only have Qi for the RS producing the result
Busy: Indicates reservation station or FU is busy
Register result status (or Rename Table): Mapping from user-visible registers to reservation stations or value


Recall: Three Stages of Tomasulo Algorithm

1. Issue: get instruction from FP Op Queue
   If reservation station free (no structural hazard), control issues instr and sends operands (renames registers).
2. Execution: operate on operands (EX)
   When both operands ready, then execute; if not ready, watch Common Data Bus for result.
3. Write result: finish execution (WB)
   Write on Common Data Bus to all awaiting units; mark reservation station available.

Normal data bus: data + destination ("go to" bus)
Common data bus: data + source ("come from" bus)
  - 64 bits of data + 4 bits of Functional Unit source address
  - Write if matches expected Functional Unit (produces result)
  - Does the broadcast


Recall: Comparison of two techniques

                          Scoreboard (four stages)        Tomasulo (three stages)
                                 Read   Exec   Write             Exec    Write
Instruction     j     k   Issue  Oper   Comp   Result     Issue  Compl   Result
LD     F6       34+   R2    1      2      3      4          1      3       4
LD     F2       45+   R3    5      6      7      8          2      4       5
MULTD  F0       F2    F4    6      9     19     20          3     15      16
SUBD   F8       F6    F2    7      9     11     12          4      7       8
DIVD   F10      F0    F6    8     21     61     62          5     56      57
ADDD   F6       F8    F2   13     14     16     22          6     10      11

In-order issue, out-of-order execution, out-of-order completion: problem with precise interrupts.


Tomasulo Loop Example

Loop:  LD     F0, 0(R1)
       MULTD  F4, F0, F2
       SD     F4, 0(R1)
       SUBI   R1, R1, #8
       BNEZ   R1, Loop

  - Assume Multiply takes 4 clocks
  - Assume first load takes 8 clocks (cache miss), second load takes 1 clock (hit)
  - To be clear, will show clocks for SUBI, BNEZ
  - Reality: integer instructions ahead


Loop Example (Clock 0)

Instruction status:
  ITER  Instruction   j    k    Issue  Exec Comp  Write Result
  1     LD     F0     0    R1
  1     MULTD  F4     F0   F2
  1     SD     F4     0    R1
  2     LD     F0     0    R1
  2     MULTD  F4     F0   F2
  2     SD     F4     0    R1

Load and store buffers:
  Name    Busy  Addr  Fu
  Load1   No
  Load2   No
  Load3   No
  Store1  No
  Store2  No
  Store3  No

Reservation stations:
  Time  Name   Busy  Op     Vj (S1)  Vk (S2)  Qj (RS)  Qk (RS)
        Add1   No
        Add2   No
        Add3   No
        Mult1  No
        Mult2  No

Register result status (Rename Table):
  Clock  R1   F0     F2     F4     F6     F8     F10    F12    ...  F30
  0      80
  Fu


Loop Example Cycle 1

Instruction status:
  ITER  Instruction   j    k    Issue  Exec Comp  Write Result
  1     LD     F0     0    R1   1
  1     MULTD  F4     F0   F2
  1     SD     F4     0    R1
  2     LD     F0     0    R1
  2     MULTD  F4     F0   F2
  2     SD     F4     0    R1

Load and store buffers:
  Name    Busy  Addr  Fu
  Load1   Yes   80
  Load2   No
  Load3   No
  Store1  No
  Store2  No
  Store3  No

Reservation stations:
  Time  Name   Busy  Op     Vj (S1)  Vk (S2)  Qj (RS)  Qk (RS)
        Add1   No
        Add2   No
        Add3   No
        Mult1  No
        Mult2  No

Register result status:
  Clock  R1   F0     F2     F4     F6     F8     F10    F12    ...  F30
  1      80
  Fu          Load1


Loop Example Cycle 2

Instruction status:
  ITER  Instruction   j    k    Issue  Exec Comp  Write Result
  1     LD     F0     0    R1   1
  1     MULTD  F4     F0   F2   2
  1     SD     F4     0    R1
  2     LD     F0     0    R1
  2     MULTD  F4     F0   F2
  2     SD     F4     0    R1

Load and store buffers:
  Name    Busy  Addr  Fu
  Load1   Yes   80
  Load2   No
  Load3   No
  Store1  No
  Store2  No
  Store3  No

Reservation stations:
  Time  Name   Busy  Op     Vj (S1)  Vk (S2)  Qj (RS)  Qk (RS)
        Add1   No
        Add2   No
        Add3   No
        Mult1  Yes   Multd           R(F2)    Load1
        Mult2  No

Register result status:
  Clock  R1   F0     F2     F4     F6     F8     F10    F12    ...  F30
  2      80
  Fu          Load1         Mult1


Loop Example Cycle 3

Instruction status:
  ITER  Instruction   j    k    Issue  Exec Comp  Write Result
  1     LD     F0     0    R1   1
  1     MULTD  F4     F0   F2   2
  1     SD     F4     0    R1   3
  2     LD     F0     0    R1
  2     MULTD  F4     F0   F2
  2     SD     F4     0    R1

Load and store buffers:
  Name    Busy  Addr  Fu
  Load1   Yes   80
  Load2   No
  Load3   No
  Store1  Yes   80    Mult1
  Store2  No
  Store3  No

Reservation stations:
  Time  Name   Busy  Op     Vj (S1)  Vk (S2)  Qj (RS)  Qk (RS)
        Add1   No
        Add2   No
        Add3   No
        Mult1  Yes   Multd           R(F2)    Load1
        Mult2  No

Register result status:
  Clock  R1   F0
  3      80
  Fu          Load1  …
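
The issue and write-result stages recalled above, and the state the loop example reaches by cycle 3, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's own code: the class names `Station` and `Tomasulo`, the method names `issue` and `broadcast`, and the initial register values are all hypothetical. Load and store buffers are modeled as ordinary stations, the store's Fu field is represented by `qj`, address calculation through R1 is ignored, and timing is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Station:
    """One reservation station (or load/store buffer, simplified)."""
    name: str
    busy: bool = False
    op: Optional[str] = None
    vj: Optional[float] = None   # value of source operand j, if known
    vk: Optional[float] = None   # value of source operand k, if known
    qj: Optional[str] = None     # tag of station producing operand j
    qk: Optional[str] = None     # tag of station producing operand k

    def ready(self) -> bool:
        # No pending tags means both operands are available.
        return self.busy and self.qj is None and self.qk is None


class Tomasulo:
    def __init__(self, regs: Dict[str, float], station_names):
        self.regs = dict(regs)                 # architectural values
        self.rename: Dict[str, str] = {}       # register -> producing station
        self.stations = {n: Station(n) for n in station_names}

    def _read(self, reg: str) -> Tuple[Optional[float], Optional[str]]:
        # Return (value, None) if the register is up to date,
        # otherwise (None, tag) naming the station that will produce it.
        tag = self.rename.get(reg)
        return (None, tag) if tag else (self.regs[reg], None)

    def issue(self, station, op, dest=None, src_j=None, src_k=None) -> bool:
        rs = self.stations[station]
        if rs.busy:
            return False                       # structural hazard: stall
        rs.busy, rs.op = True, op
        rs.vj, rs.qj = self._read(src_j) if src_j else (None, None)
        rs.vk, rs.qk = self._read(src_k) if src_k else (None, None)
        if dest is not None:
            self.rename[dest] = station        # rename destination register
        return True

    def broadcast(self, tag: str, value: float) -> None:
        # Common Data Bus: every waiting station compares the source tag.
        for rs in self.stations.values():
            if rs.qj == tag:
                rs.vj, rs.qj = value, None
            if rs.qk == tag:
                rs.vk, rs.qk = value, None
        # Update only registers still renamed to this tag.
        for reg, t in list(self.rename.items()):
            if t == tag:
                self.regs[reg] = value
                del self.rename[reg]
        self.stations[tag] = Station(tag)      # mark station available


# First three issues of the loop example (register values are made up).
m = Tomasulo({"F0": 0.0, "F2": 2.0, "F4": 0.0},
             ["Load1", "Load2", "Mult1", "Store1"])
m.issue("Load1", "LD", dest="F0")
m.issue("Mult1", "MULTD", dest="F4", src_j="F0", src_k="F2")
m.issue("Store1", "SD", src_j="F4")
```

After these three calls the rename table maps F0 to Load1 and F4 to Mult1, Mult1 holds the value of F2 and waits on Load1, and Store1 waits on Mult1, which corresponds to the cycle 3 tables. Issuing the second iteration's `LD F0` to `Load2` would remap F0 to Load2, so a later broadcast from Load1 would feed Mult1 without overwriting F0. That is the renaming effect behind hardware loop unrolling.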

