U of U CS 6810 - Lecture 8 - Dynamic ILP

Lecture 8: Dynamic ILP
• Topics: out-of-order processors (See class notes)
• HW3 is posted, due on Tuesday

An Out-of-Order Processor Implementation
[Figure: out-of-order pipeline]
  Structures: Branch prediction and instr fetch; Instr Fetch Queue; Decode & Rename; Reorder Buffer (ROB); Issue Queue (IQ); ALU, ALU, ALU; Register File R1-R32
  Fetched code: R1 ← R1+R2; R2 ← R1+R3; BEQZ R2; R3 ← R1+R2; R1 ← R3+R2
  ROB entries: Instr 1 to Instr 6, with temporary registers T1 to T6
  Renamed code: T1 ← R1+R2; T2 ← T1+R3; BEQZ T2; T4 ← T1+T2; T5 ← T4+T2
  Results written to ROB and tags broadcast to IQ

Design Details - I
• Instructions enter the pipeline in order
• No need for branch delay slots if prediction happens in time
• Instructions leave the pipeline in order
  – all instructions that enter also get placed in the ROB
  – the process of an instruction leaving the ROB (in order) is called commit
  – an instruction commits only if it and all instructions before it have completed successfully (without an exception)
• To preserve precise exceptions, a result is written into the register file only when the instruction commits
  – until then, the result is saved in a temporary register in the ROB

Design Details - II
• Instructions get renamed and placed in the issue queue
  – some operands are available (T1-T6; R1-R32), while others are being produced by instructions in flight (T1-T6)
• As instructions finish, they write results into the ROB (T1-T6) and broadcast the operand tag (T1-T6) to the issue queue
  – instructions now know if their operands are ready
• When a ready instruction issues, it reads its operands from T1-T6 and R1-R32 and executes (out-of-order execution)
• Can you have WAW or WAR hazards? By using more names (T1-T6), name dependences can be avoided

Design Details - III
• If instr-3 raises an exception, wait until it reaches the top of the ROB
  – at this point, R1-R32 contain results for all instructions up to instr-3
  – save registers, save PC of instr-3, and service the exception
• If branch is a mispredict, flush all instructions after the branch and start on the correct path
  – mispredicted instrs will not have updated registers (the branch cannot commit until it has completed and the flush happens as soon as the branch completes)
• Potential problems: ?

Managing Register Names
Logical Registers R1-R32      Physical Registers P1-P64
R1 ← R1+R2                    P33 ← P1+P2
R2 ← R1+R3                    P34 ← P33+P3
BEQZ R2                       BEQZ P34
R3 ← R1+R2                    P35 ← P33+P34
• At the start, R1-R32 can be found in P1-P32
• Instructions stop entering the pipeline when P64 is assigned
• What happens on commit?
• Temporary values are stored in the register file and not the ROB

The Commit Process
• On commit, no copy is required
• The register map table is updated
  – the "committed" value of R1 is now in P33 and not P1
  – on an exception, P33 is copied to memory and not P1
• An instruction in the issue queue need not modify its input operand when the producer commits
• When instruction-1 commits, we no longer have any use for P1
  – it is put in a free pool and a new instruction can now enter the pipeline
  → for every instr that commits, a new instr can enter the pipeline
  → number of in-flight instrs is a constant = number of extra (rename) registers

The Alpha 21264 Out-of-Order Implementation
[Figure: out-of-order pipeline with register renaming]
  Structures: Branch prediction and instr fetch; Instr Fetch Queue; Decode & Rename; Reorder Buffer (ROB); Issue Queue (IQ); ALU, ALU, ALU; Register File P1-P64
  Fetched code: R1 ← R1+R2; R2 ← R1+R3; BEQZ R2; R3 ← R1+R2; R1 ← R3+R2
  ROB entries: Instr 1 to Instr 6
  Renamed code: P33 ← P1+P2; P34 ← P33+P3; BEQZ P34; P35 ← P33+P34; P36 ← P35+P34
  Speculative Reg Map: R1→P36, R2→P34
  Committed Reg Map: R1→P1, R2→P2
  Results written to regfile and tags broadcast to IQ

Out-of-Order Loads/Stores
  Ld  R1 ← [R2]
  Ld  R3 ← [R4]
  St  R5 → [R6]
  Ld  R7 ← [R8]
  Ld  R9 ← [R10]
• What if the issue queue also had load/store instructions?
• Can we continue executing instructions out-of-order?

Memory Dependence Checking
[Figure: load/store entries in program order: Ld 0x abcdef; Ld; St; Ld; Ld 0x abcdef; St 0x abcd00; Ld 0x abc000; Ld 0x abcd00]
• The issue queue checks for register dependences and executes instructions as soon as registers are ready
• Loads/stores access memory as well
  – must check for RAW, WAW, and WAR hazards for memory as well
• Hence, first check for register dependences to compute effective addresses; then check for memory dependences

Memory Dependence Checking
[Figure: load/store entries in program order: Ld 0x abcdef; Ld; St; Ld; Ld 0x abcdef; St 0x abcd00; Ld 0x abc000; Ld 0x abcd00]
• Load and store addresses are maintained in program order in the Load/Store Queue (LSQ)
• Loads can issue if they are guaranteed to not have true dependences with earlier stores
• Stores can issue only if we are ready to modify memory (can not recover if an earlier instr raises an exception)

The Alpha 21264 Out-of-Order Implementation
[Figure: out-of-order pipeline with register renaming and LSQ]
  Structures: Branch prediction and instr fetch; Instr Fetch Queue; Decode & Rename; Reorder Buffer (ROB); Issue Queue (IQ); ALU, ALU, ALU; Register File P1-P64; LSQ; ALU; D-Cache
  Fetched code: R1 ← R1+R2; R2 ← R1+R3; BEQZ R2; R3 ← R1+R2; R1 ← R3+R2; LD R4 ← 8[R3]; ST R4 → 8[R1]
  ROB entries: Instr 1 to Instr 7
  Renamed code: P33 ← P1+P2; P34 ← P33+P3; BEQZ P34; P35 ← P33+P34; P36 ← P35+P34; P37 ← 8[P35]; P37 → 8[P36]
  LSQ entries: P37 ← [P35 + 8]; P37 → [P36 + 8]
  Committed Reg Map: R1→P1, R2→P2
  Speculative Reg Map: R1→P36, R2→P34
  Results written to regfile and tags broadcast to IQ
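The renaming and commit rules from "Managing Register Names" and "The Commit Process" can be sketched in a few lines of Python. This is a minimal illustrative model, not the Alpha 21264 design: the class name Renamer, its method names, and the use of a deque as the ROB are all assumptions made for the example. It shows the two map tables, the free pool, and the point that commit only updates a map entry and frees the previous physical register.

```python
from collections import deque


class Renamer:
    """Toy rename stage (hypothetical model): logical R1-R32, physical P1-P64."""

    def __init__(self, n_logical=32, n_physical=64):
        # At the start, Ri is found in Pi.
        self.spec_map = {f"R{i}": f"P{i}" for i in range(1, n_logical + 1)}
        self.committed_map = dict(self.spec_map)
        # The extra (rename) registers form the free pool.
        self.free = deque(f"P{i}" for i in range(n_logical + 1, n_physical + 1))
        # In-order ROB entries: (logical dest, new physical, previous physical).
        self.rob = deque()

    def rename(self, dest, srcs):
        """Rename one instruction. Returns (new_dest, phys_srcs), or None on a stall."""
        if dest is not None and not self.free:
            return None  # free pool is empty: no new instruction may enter
        # Sources are looked up before the destination mapping is changed.
        phys_srcs = [self.spec_map[s] for s in srcs]
        new = old = None
        if dest is not None:
            old = self.spec_map[dest]
            new = self.free.popleft()
            self.spec_map[dest] = new
        self.rob.append((dest, new, old))
        return new, phys_srcs

    def commit(self):
        """Commit the oldest instruction: no copy, only a map update."""
        dest, new, old = self.rob.popleft()
        if dest is not None:
            self.committed_map[dest] = new
            self.free.append(old)  # the previous mapping is no longer needed


r = Renamer()
program = [("R1", ["R1", "R2"]), ("R2", ["R1", "R3"]), (None, ["R2"]),
           ("R3", ["R1", "R2"]), ("R1", ["R3", "R2"])]
renamed = [r.rename(dest, srcs) for dest, srcs in program]
```

Running the five-instruction example from the slides through this sketch reproduces the renamed code shown there (P33 ← P1+P2 and so on), with BEQZ modeled as an instruction with no destination. Committing the first instruction moves the committed mapping of R1 to P33 and returns P1 to the free pool.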
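The load-issue rule from "Memory Dependence Checking" can also be sketched as a check over the LSQ. This is a conservative sketch under stated assumptions: the names LSQEntry and load_can_issue are hypothetical, entries shown without an address in the slide figure are treated as "address not yet computed", all accesses are assumed to be the same size and aligned so that address equality is enough, and a load that matches an earlier store simply waits (real designs may forward the store data instead).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LSQEntry:
    op: str              # "Ld" or "St"
    addr: Optional[int]  # None until the effective address has been computed


def load_can_issue(lsq: List[LSQEntry], idx: int) -> bool:
    """True if the load at position idx is guaranteed to have no true
    dependence on any earlier store. The LSQ is held in program order."""
    entry = lsq[idx]
    if entry.op != "Ld" or entry.addr is None:
        return False
    for earlier in lsq[:idx]:
        if earlier.op != "St":
            continue
        # An unknown store address might match, so the load must wait.
        if earlier.addr is None or earlier.addr == entry.addr:
            return False
    return True


# The eight entries from the slide figure, oldest first.
lsq = [
    LSQEntry("Ld", 0xABCDEF), LSQEntry("Ld", None), LSQEntry("St", None),
    LSQEntry("Ld", None), LSQEntry("Ld", 0xABCDEF), LSQEntry("St", 0xABCD00),
    LSQEntry("Ld", 0xABC000), LSQEntry("Ld", 0xABCD00),
]
ready_before = [load_can_issue(lsq, i) for i in range(len(lsq))]

# Hypothetical: the older store's address is now computed as 0xabcd00.
lsq[2].addr = 0xABCD00
ready_after = [load_can_issue(lsq, i) for i in range(len(lsq))]
```

Under these assumptions only the oldest load can issue at first, because every later load sits behind a store whose address is unknown. Once that store's address resolves, the loads to 0x abcdef and 0x abc000 become eligible, while the load to 0x abcd00 still waits on the matching stores.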

