CS 152 Computer Architecture and Engineering
Lecture 15 - Advanced Superscalars
Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152
4/1/2008 CS152-Spring’08

Last time in Lecture 14
• Control hazards are a serious impediment to superscalar performance
• Dynamic branch predictors can be quite accurate (>95%) and avoid most control hazards
• Branch History Tables (BHTs) just predict direction (later in pipeline)
  – Just need a few bits per entry (2 bits gives hysteresis)
  – Need to decode instruction bits to determine whether this is a branch and what the target address is
• Branch Target Buffer (BTB) predicts whether an instruction is a branch, and its target address
  – Needs PC tag, predicted Next-PC, and direction
  – Just needs PC of instruction to predict target of branch (if any)
• Return address stack: special form of BTB used to predict subroutine return addresses

“Data in ROB” Design (HP PA8000, Pentium Pro, Core2Duo)
• On dispatch into ROB, ready sources can be in regfile or in ROB dest (copied into src1/src2 if ready before dispatch)
• On completion, write to dest field and broadcast to src fields
• On issue, read from ROB src fields
[Figure: Register File (holds only committed state), Reorder buffer, Load Unit, FUs, Store Unit]
[Figure, continued: <t, result> pairs returned by the units; ROB entries t1 … tn with fields Ins#, use, exec, op, p1, src1, p2, src2, pd, dest, data; Commit]

Unified Physical Register File (MIPS R10K, Alpha 21264, Pentium 4)
• One regfile for both committed and speculative values (no data in ROB)
• During decode, instruction result allocated new physical register, source regs translated to physical regs through rename table
• Instruction reads data from regfile at start of execute (not in decode)
• Write-back updates reg. busy bits on instructions in ROB (assoc. search)
• Snapshots of rename table taken at every branch to recover mispredicts
• On exception, renaming undone in reverse order of issue (MIPS R10000)
[Figure: Rename Table (r1 → ti, r2 → tj) with snapshots for mispredict recovery; RegFile (t1 … tn); Load Unit, FUs, Store Unit returning <t, result>; ROB not shown]

Pipeline Design with Physical Regfile
[Figure: PC, Fetch, Decode & Rename, Reorder Buffer, Commit; Branch Prediction, Branch Resolution, Update predictors; Execute with Branch Unit, ALU, MEM, Store Buffer, D$; Physical Reg. File; regions labeled In-Order, Out-of-Order, In-Order]
[Figure, continued: four "kill" signals]

Lifetime of Physical Registers
• Physical regfile holds committed and speculative values
• Physical registers decoupled from ROB entries (no data in ROB)

  Before rename        After rename
  ld  r1, (r3)         ld  P1, (Px)
  add r3, r1, #4       add P2, P1, #4
  sub r6, r7, r9       sub P3, Py, Pz
  add r3, r3, r6       add P4, P2, P3
  ld  r6, (r1)         ld  P5, (P1)
  add r6, r6, r3       add P6, P5, P4
  st  r6, (r1)         st  P6, (P1)
  ld  r6, (r11)        ld  P7, (Pw)

When can we reuse a physical register?

Physical Register Management
Instruction stream:
  ld  r1, 0(r3)
  add r3, r1, #4
  sub r6, r7, r6
  add r3, r3, r6
  ld  r6, 0(r1)
Rename Table (initial): R1 → P8, R3 → P7, R6 → P5, R7 → P6
Physical Regs (P0 … Pn): P5 = <R6>, P6 = <R7>, P7 = <R3>, P8 = <R1>, each with its p (present) bit set
Free List: P0, P1, P3, P2, P4
ROB entry fields: use, ex, op, p1, PR1, p2, PR2, Rd, LPRd, PRd
(LPRd requires third read port on Rename Table for each instruction)

ROB entries added as each instruction is renamed (one per slide):
  step  use  op   p1  PR1  p2  PR2  Rd  LPRd  PRd   Rename Table update
  1     x    ld   p   P7            r1  P8    P0    R1: P8 → P0
  2     x    add      P0            r3  P7    P1    R3: P7 → P1
  3     x    sub  p   P6   p   P5   r6  P5    P3    R6: P5 → P3
Physical Register Management (continued)
ROB entries added as the remaining instructions are renamed:
  step  use  op   p1  PR1  p2  PR2  Rd  LPRd  PRd   Rename Table update
  4     x    add      P1       P3   r3  P1    P2    R3: P1 → P2
  5     x    ld       P0            r6  P3    P4    R6: P3 → P4
Last frame: the ROB holds the five entries (ld, add, sub, add, ld) in program order.
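
The rename walk-through on the Physical Register Management slides can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the names `rename` and `commit` and the tuple and dict layouts are invented for illustration, the free list is assumed to be consumed in the order shown on the slides (P0, P1, P3, P2, P4), and the `commit` helper assumes the usual policy that LPRd is returned to the free list when the instruction that overwrote the architectural register commits.

```python
from collections import deque

def rename(program, rename_table, free_list):
    """Rename (op, dest, srcs) tuples; return one ROB-style dict per instruction."""
    table = dict(rename_table)   # architectural reg -> physical reg
    free = deque(free_list)      # free physical registers, allocated front-first
    rob = []
    for op, dest, srcs in program:
        # Translate sources first, so "sub r6, r7, r6" reads the old r6 mapping.
        phys_srcs = [table[s] for s in srcs]
        lprd = table[dest]       # previous mapping of dest (the third table read)
        prd = free.popleft()     # newly allocated physical destination
        table[dest] = prd
        rob.append({"op": op, "srcs": phys_srcs, "Rd": dest,
                    "LPRd": lprd, "PRd": prd})
    return rob, table, free

def commit(entry, free):
    """Assumed reuse policy: free LPRd once the overwriting instruction commits."""
    free.append(entry["LPRd"])

# Instruction stream and initial state taken from the slides.
program = [
    ("ld",  "r1", ["r3"]),
    ("add", "r3", ["r1"]),         # the immediate #4 is not renamed
    ("sub", "r6", ["r7", "r6"]),
    ("add", "r3", ["r3", "r6"]),
    ("ld",  "r6", ["r1"]),
]
initial_table = {"r1": "P8", "r3": "P7", "r6": "P5", "r7": "P6"}
rob, final_table, free = rename(program, initial_table,
                                ["P0", "P1", "P3", "P2", "P4"])

for e in rob:
    print(e["op"], e["srcs"], e["Rd"], "LPRd =", e["LPRd"], "PRd =", e["PRd"])

commit(rob[0], free)  # committing the first ld returns P8 to the free list
```

Reading the old destination mapping before overwriting it is what the slide's note about a third read port on the Rename Table refers to: each instruction looks up two sources plus LPRd.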