CMSC 411 Computer Systems Architecture
Lecture 10: Instruction Level Parallelism (cont.)
Alan Sussman, als@cs.umd.edu
CMSC 411 - 10 (from Patterson)

Administrivia
  Finish reading Chapter 2 of H&P
  First exam scheduled for this Thursday
    on Units 1-3
    Wanli will be giving it

Outline
  ILP
  Compiler techniques to increase ILP
  Loop Unrolling
  Static Branch Prediction
  Dynamic Branch Prediction
  Overcoming Data Hazards with Dynamic Scheduling
  Tomasulo Algorithm
  Conclusion

Tomasulo Organization (from H&P Figure 2.9)
  [Figure. Labeled components: FP Op Queue; FP Registers; Load Buffers (Load1 to Load6), From Mem; Store Buffers, To Mem; Reservation Stations Add1, Add2, Add3 (FP adders) and Mult1, Mult2 (FP multipliers); Common Data Bus (CDB)]

Reservation Station Components
  Op: Operation to perform in the unit (e.g., + or -)
  Vj, Vk: Value of source operands
    Store buffers have a V field: the result to be stored
  Qj, Qk: Reservation stations producing source registers (value to be written)
    Note: Qj, Qk = 0 means ready
    Store buffers only have Qi, for the RS producing the result
  Busy: Indicates reservation station or FU is busy
  Register result status: Indicates which functional unit will write each register, if one exists. Blank when no pending instructions will write that register.

Three Stages of Tomasulo Algorithm
  1. Issue: get instruction from FP Op Queue
     If reservation station free (no structural hazard), control issues instr and sends operands (renames registers)
  2. Execute: operate on operands (EX)
     When both operands ready, then execute; if not ready, watch Common Data Bus for result
  3. Write result: finish execution (WB)
     Write on Common Data Bus to all awaiting units; mark reservation station available

  Normal data bus: data + destination ("go to" bus)
  Common data bus: data + source ("come from" bus)
    64 bits of data + 4 bits of Functional Unit source address
    Write if matches expected Functional Unit (produces result)
    Does the broadcast
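The reservation station fields and the Issue and Write result stages can be sketched in a few lines of Python. This is only an illustrative model, not code from the lecture or from H&P: the names `ReservationStation`, `issue`, and `write_result` are hypothetical, values are plain floats, and `None` stands in for the slides' "Qj, Qk = 0 means ready" convention.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReservationStation:
    """One reservation station, with the fields named on the slide."""
    name: str
    busy: bool = False
    op: Optional[str] = None
    vj: Optional[float] = None  # value of first source operand, once known
    vk: Optional[float] = None  # value of second source operand, once known
    qj: Optional[str] = None    # station that will produce Vj (None = ready)
    qk: Optional[str] = None    # station that will produce Vk (None = ready)

    def ready(self) -> bool:
        # Execute may start only when both operands have arrived.
        return self.busy and self.qj is None and self.qk is None


def issue(rs: ReservationStation, op: str, dest: str, src1: str, src2: str,
          regs: Dict[str, float], reg_status: Dict[str, str]) -> bool:
    """Issue stage: claim a free station and rename the registers."""
    if rs.busy:
        return False  # structural hazard: the instruction must wait
    rs.busy, rs.op = True, op
    # For each source, record the producing station if a write is pending,
    # otherwise copy the value out of the register file.
    rs.qj = reg_status.get(src1)
    rs.vj = regs[src1] if rs.qj is None else None
    rs.qk = reg_status.get(src2)
    rs.vk = regs[src2] if rs.qk is None else None
    # The destination register is now owned by this station.
    reg_status[dest] = rs.name
    return True


def write_result(tag: str, value: float, stations: List[ReservationStation],
                 regs: Dict[str, float], reg_status: Dict[str, str]) -> None:
    """Write result stage: broadcast (source tag, value) on the CDB."""
    for rs in stations:
        if rs.qj == tag:
            rs.vj, rs.qj = value, None
        if rs.qk == tag:
            rs.vk, rs.qk = value, None
        if rs.name == tag:
            # Mark the producing station available again.
            rs.busy, rs.op, rs.vj, rs.vk = False, None, None, None
    # Registers still waiting on this tag take the value.
    for reg in [r for r, t in reg_status.items() if t == tag]:
        regs[reg] = value
        del reg_status[reg]
```

Note that the broadcast identifies the producer ("come from") rather than a destination register, which is what lets every waiting station and register pick up the same value in a single write.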
Tomasulo Example

Instruction stream
  Instruction        j     k
  LD     F6          34+   R2
  LD     F2          45+   R3
  MULTD  F0          F2    F4
  SUBD   F8          F6    F2
  DIVD   F10         F0    F6
  ADDD   F6          F8    F2

Hardware: 3 Load Buffers, 3 FP Adder R.S., 2 FP Mult R.S.
Each cycle's slide shows the instruction status (Issue, Exec Comp, Write Result), the load buffers (Busy, Address), the reservation stations (Time, Name, Busy, Op, Vj, Vk, Qj, Qk), the register result status (F0, F2, ..., F30), and the Clock.
  Time: FU count down
  Clock: clock cycle counter
  M(A1), M(A2): values returned by the first and second loads; R(F4): value read from register F4

Tomasulo Example Cycle 1
  Load1 busy, address 34+R2
  Register result status: F6 = Load1

Tomasulo Example Cycle 2
  Load1 busy (34+R2), Load2 busy (45+R3)
  Register result status: F2 = Load2, F6 = Load1
  Note: Can have multiple loads outstanding

Tomasulo Example Cycle 3
  Mult1: MULTD, Vk = R(F4), Qj = Load2
  Register result status: F0 = Mult1, F2 = Load2, F6 = Load1
  Note: register names are removed ("renamed") in Reservation Stations; MULT issued
  Load1 completing; what is waiting for Load1?

Tomasulo Example Cycle 4
  Load2 busy (45+R3)
  Add1: SUBD, Vj = M(A1), Qk = Load2
  Mult1: MULTD, Vk = R(F4), Qj = Load2
  Register result status: F0 = Mult1, F2 = Load2, F6 = M(A1), F8 = Add1
  Load2 completing; what is waiting for Load2?

Tomasulo Example Cycle 5
  Time  Name   Busy  Op     Vj     Vk     Qj     Qk
  2     Add1   Yes   SUBD   M(A1)  M(A2)
        Add2   No
        Add3   No
  10    Mult1  Yes   MULTD  M(A2)  R(F4)
        Mult2  Yes   DIVD          M(A1)  Mult1
  Register result status: F0 = Mult1, F2 = M(A2), F6 = M(A1), F8 = Add1, F10 = Mult2
  Timer starts down for Add1, Mult1

Tomasulo Example Cycle 6
  Instruction status
  Instruction        j     k     Issue  Exec Comp  Write Result
  LD     F6          34+   R2    1      3          4
  LD     F2          45+   R3    2      4          5
  MULTD  F0          F2    F4    3
  SUBD   F8          F6    F2    4
  DIVD   F10         F0    F6    5
  ADDD   F6          F8    F2    6

  Time  Name   Busy  Op     Vj     Vk     Qj     Qk
  1     Add1   Yes   SUBD   M(A1)  M(A2)
        Add2   Yes   ADDD          M(A2)  Add1
        Add3   No
  9     Mult1  Yes   MULTD  M(A2)  R(F4)
        Mult2  Yes   DIVD          M(A1)  Mult1
  Register result status: F0 = Mult1, F2 = M(A2), F6 = Add2, F8 = Add1, F10 = Mult2

Tomasulo Example Cycle 7
  …
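The renaming in cycles 1 through 6 of the example can be replayed with a small script. This is a simplified sketch under stated assumptions, not a full Tomasulo simulator: one instruction issues per cycle, the two loads write their results in cycles 4 and 5 (taken from the instruction status above), a result written in a cycle is visible to the instruction issuing in that same cycle, and no arithmetic instruction finishes, so it is only meaningful through cycle 6. The names `PROGRAM`, `POOLS`, `LOAD_WRITE_CYCLE`, and `simulate` are hypothetical.

```python
# Instruction stream from the slides: (op, dest, source j, source k).
PROGRAM = [
    ("LD",    "F6",  "34", "R2"),
    ("LD",    "F2",  "45", "R3"),
    ("MULTD", "F0",  "F2", "F4"),
    ("SUBD",  "F8",  "F6", "F2"),
    ("DIVD",  "F10", "F0", "F6"),
    ("ADDD",  "F6",  "F8", "F2"),
]

ADD_RS = ["Add1", "Add2", "Add3"]
MULT_RS = ["Mult1", "Mult2"]
POOLS = {
    "LD": ["Load1", "Load2", "Load3"],
    "SUBD": ADD_RS, "ADDD": ADD_RS,
    "MULTD": MULT_RS, "DIVD": MULT_RS,
}

# Assumption: cycles in which the loads broadcast on the CDB (from the slides).
LOAD_WRITE_CYCLE = {"Load1": 4, "Load2": 5}


def simulate(last_cycle):
    """Return (stations, register result status) at the end of last_cycle."""
    stations = {}    # station name -> {"op", "qj", "qk"}
    reg_status = {}  # register -> station that will write it
    for cycle in range(1, last_cycle + 1):
        # Write result: a finishing load broadcasts its tag on the CDB.
        for tag, when in LOAD_WRITE_CYCLE.items():
            if when == cycle and tag in stations:
                del stations[tag]  # load buffer becomes available
                for rs in stations.values():
                    for q in ("qj", "qk"):
                        if rs[q] == tag:
                            rs[q] = None  # operand has arrived
                for reg in [r for r, t in reg_status.items() if t == tag]:
                    del reg_status[reg]
        # Issue: one instruction per cycle, in program order.
        if cycle <= len(PROGRAM):
            op, dest, src_j, src_k = PROGRAM[cycle - 1]
            # First free station of the right kind (always one in this example).
            name = next(n for n in POOLS[op] if n not in stations)
            if op == "LD":
                qj = qk = None  # address operands are assumed ready
            else:
                qj, qk = reg_status.get(src_j), reg_status.get(src_k)
            stations[name] = {"op": op, "qj": qj, "qk": qk}
            reg_status[dest] = name  # rename the destination register
    return stations, reg_status


if __name__ == "__main__":
    for c in (3, 6):
        print(c, simulate(c)[1])
```

In cycle 6, ADDD takes over F6 in the register result status while DIVD has already captured the older F6 value in its own station, so the reuse of the name F6 does not force ADDD to wait.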