Lecture 9: Dynamic ILP

• Topics: out-of-order processors (Sections 2.3-2.6, class notes)

An Out-of-Order Processor Implementation

[Figure: block diagram with Branch prediction and instr fetch, Instr Fetch Queue, Decode & Rename, Reorder Buffer (ROB) holding Instr 1-Instr 6 with temporaries T1-T6, Issue Queue (IQ), three ALUs, and Register File R1-R32. Results written to ROB and tags broadcast to IQ.]

Instr Fetch Queue      Issue Queue (IQ)
R1 ← R1+R2             T1 ← R1+R2
R2 ← R1+R3             T2 ← T1+R3
BEQZ R2                BEQZ T2
R3 ← R1+R2             T4 ← T1+T2
R1 ← R3+R2             T5 ← T4+T2

Design Details - I

• Instructions enter the pipeline in order
• No need for branch delay slots if prediction happens in time
• Instructions leave the pipeline in order – all instructions that enter also get placed in the ROB – the process of an instruction leaving the ROB (in order) is called commit – an instruction commits only if it and all instructions before it have completed successfully (without an exception)
• To preserve precise exceptions, a result is written into the register file only when the instruction commits – until then, the result is saved in a temporary register in the ROB

Design Details - II

• Instructions get renamed and placed in the issue queue – some operands are available (T1-T6; R1-R32), while others are being produced by instructions in flight (T1-T6)
• As instructions finish, they write results into the ROB (T1-T6) and broadcast the operand tag (T1-T6) to the issue queue – instructions now know if their operands are ready
• When a ready instruction issues, it reads its operands from T1-T6 and R1-R32 and executes (out-of-order execution)
• Can you have WAW or WAR hazards? By using more names (T1-T6), name dependences can be avoided

Design Details - III

• If instr-3 raises an exception, wait until it reaches the top of the ROB – at this point, R1-R32 contain results for all instructions up to instr-3 – save registers, save PC of instr-3, and service the exception
• If branch is a mispredict, flush all instructions after the branch and start on the correct path – mispredicted instrs will not have updated registers (the branch cannot commit until it has completed and the flush happens as soon as the branch completes)
• Potential problems: ?

Managing Register Names

Logical Registers R1-R32       Physical Registers P1-P64
R1 ← R1+R2                     P33 ← P1+P2
R2 ← R1+R3                     P34 ← P33+P3
BEQZ R2                        BEQZ P34
R3 ← R1+R2                     P35 ← P33+P34

At the start, R1-R32 can be found in P1-P32
Instructions stop entering the pipeline when P64 is assigned
What happens on commit?
Temporary values are stored in the register file and not the ROB

The Commit Process

• On commit, no copy is required
• The register map table is updated – the “committed” value of R1 is now in P33 and not P1 – on an exception, P33 is copied to memory and not P1
• An instruction in the issue queue need not modify its input operand when the producer commits
• When instruction-1 commits, we no longer have any use for P1 – it is put in a free pool and a new instruction can now enter the pipeline → for every instr that commits, a new instr can enter the pipeline → number of in-flight instrs is a constant = number of extra (rename) registers

The Alpha 21264 Out-of-Order Implementation

[Figure: block diagram with Branch prediction and instr fetch, Instr Fetch Queue, Decode & Rename, Reorder Buffer (ROB) holding Instr 1-Instr 6, Issue Queue (IQ), three ALUs, and Register File P1-P64. Results written to regfile and tags broadcast to IQ.]

Instr Fetch Queue      Issue Queue (IQ)
R1 ← R1+R2             P33 ← P1+P2
R2 ← R1+R3             P34 ← P33+P3
BEQZ R2                BEQZ P34
R3 ← R1+R2             P35 ← P33+P34
R1 ← R3+R2             P36 ← P35+P34

Speculative Reg Map: R1 → P36, R2 → P34
Committed Reg Map:   R1 → P1,  R2 → P2

Out-of-Order Loads/Stores

Ld  R1 ← [R2]
Ld  R3 ← [R4]
St  R5 → [R6]
Ld  R7 ← [R8]
Ld  R9 ← [R10]

What if the issue queue also had load/store instructions? Can we continue executing instructions out-of-order?

Memory Dependence Checking

Ld  0x abcdef
Ld  0x abcdef
St  0x abcd00
Ld  0x abc000
Ld  0x abcd00

• The issue queue checks for register dependences and executes instructions as soon as registers are ready
• Loads/stores access memory as well – must check for RAW, WAW, and WAR hazards for memory as well
• Hence, first check for register dependences to compute effective addresses; then check for memory dependences

Memory Dependence Checking

[Figure: same load/store sequence as on the previous slide]

• Load and store addresses are maintained in program order in the Load/Store Queue (LSQ)
• Loads can issue if they are guaranteed to not have true dependences with earlier stores
• Stores can issue only if we are ready to modify memory (cannot recover if an earlier instr raises an exception)

The Alpha 21264 Out-of-Order Implementation

[Figure: block diagram with Branch prediction and instr fetch, Instr Fetch Queue, Decode & Rename, Reorder Buffer (ROB) holding Instr 1-Instr 7, Issue Queue (IQ), three ALUs, Register File P1-P64, and an LSQ with its own ALU feeding the D-Cache. Results written to regfile and tags broadcast to IQ.]

Instr Fetch Queue      Issue Queue (IQ)       LSQ
R1 ← R1+R2             P33 ← P1+P2
R2 ← R1+R3             P34 ← P33+P3
BEQZ R2                BEQZ P34
R3 ← R1+R2             P35 ← P33+P34
R1 ← R3+R2             P36 ← P35+P34
LD R4 ← 8[R3]          P37 ← 8[P35]           P37 ← [P35 + 8]
ST R4 → 8[R1]          P37 → 8[P36]           P37 → [P36 + 8]

Committed Reg Map:   R1 → P1,  R2 → P2
Speculative Reg Map: R1 → P36, R2 → P34
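The LSQ rules on the Memory Dependence Checking slides can be sketched in a few lines of Python. This is only an illustrative model, not the Alpha 21264 logic: the names `LSQEntry`, `load_can_issue`, and `store_can_issue` are made up for this example, addresses are compared for exact equality (real hardware compares at access-size granularity and may forward store data to a later load), and the "all older instructions have committed" condition is simply passed in as a flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LSQEntry:
    op: str               # "ld" or "st"
    addr: Optional[int]   # None until the effective address has been computed


def load_can_issue(lsq: List[LSQEntry], idx: int) -> bool:
    """Conservative rule (an assumption of this sketch): the load at lsq[idx]
    may issue only if its own address is known and every earlier store has a
    known address that differs from it."""
    load = lsq[idx]
    if load.op != "ld" or load.addr is None:
        return False
    for earlier in lsq[:idx]:          # entries are kept in program order
        if earlier.op != "st":
            continue
        if earlier.addr is None or earlier.addr == load.addr:
            return False               # possible RAW hazard through memory
    return True


def store_can_issue(entry: LSQEntry, all_older_committed: bool) -> bool:
    """A store modifies memory only once no older instruction can still
    raise an exception; that condition is passed in as a flag here."""
    return entry.op == "st" and entry.addr is not None and all_older_committed


# Load/store sequence with the addresses shown on the slides.
lsq = [
    LSQEntry("ld", 0xABCDEF),
    LSQEntry("ld", 0xABCDEF),
    LSQEntry("st", 0xABCD00),
    LSQEntry("ld", 0xABC000),
    LSQEntry("ld", 0xABCD00),
]

for i, e in enumerate(lsq):
    if e.op == "ld":
        print(i, hex(e.addr), load_can_issue(lsq, i))
```

With these addresses, the first two loads and the load from 0xabc000 pass the check, while the final load from 0xabcd00 must wait because it matches the address of the earlier store. If the store's address were still unknown (`addr=None`), both later loads would have to wait under this conservative rule.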
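The renaming and commit steps from the Managing Register Names and The Commit Process slides can also be modeled in software. The sketch below is a simplification under stated assumptions: the `Renamer` class and its method names are hypothetical, the in-flight list stands in for the ROB, and branch misprediction recovery is left out. It keeps a speculative map, a committed map, and a free pool of physical registers, and it returns the previous committed physical register to the free pool when an instruction commits.

```python
from collections import deque

NUM_LOGICAL = 32    # R1-R32
NUM_PHYSICAL = 64   # P1-P64


class Renamer:
    def __init__(self):
        # At the start, R1-R32 can be found in P1-P32.
        self.spec_map = {f"R{i}": f"P{i}" for i in range(1, NUM_LOGICAL + 1)}
        self.committed_map = dict(self.spec_map)
        self.free = deque(f"P{i}" for i in range(NUM_LOGICAL + 1, NUM_PHYSICAL + 1))
        self.in_flight = deque()   # (logical dest, physical dest), program order

    def rename(self, dest, srcs):
        """Rename one instruction; dest is None for a branch such as BEQZ."""
        phys_srcs = [self.spec_map[s] for s in srcs]   # read sources first
        if dest is None:
            self.in_flight.append((None, None))
            return None, phys_srcs
        if not self.free:
            raise RuntimeError("no free physical register: rename must stall")
        phys_dest = self.free.popleft()
        self.spec_map[dest] = phys_dest
        self.in_flight.append((dest, phys_dest))
        return phys_dest, phys_srcs

    def commit(self):
        """Commit the oldest instruction: update the committed map, no copy."""
        dest, phys_dest = self.in_flight.popleft()
        if dest is not None:
            old = self.committed_map[dest]
            self.committed_map[dest] = phys_dest
            self.free.append(old)   # the old physical register is now reusable


r = Renamer()
program = [("R1", ["R1", "R2"]), ("R2", ["R1", "R3"]), (None, ["R2"]),
           ("R3", ["R1", "R2"]), ("R1", ["R3", "R2"])]
for dest, srcs in program:
    print(r.rename(dest, srcs))

print(r.spec_map["R1"], r.spec_map["R2"])            # P36 P34
print(r.committed_map["R1"], r.committed_map["R2"])  # P1 P2
r.commit()
print(r.committed_map["R1"], r.free[-1])             # P33 P1
```

The example replays the five-instruction sequence from the slides. After renaming, the speculative map points R1 at P36 and R2 at P34 while the committed map still holds P1 and P2. Committing the first instruction moves the committed R1 to P33 and frees P1.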