Berkeley COMPSCI 152 - Advanced Out-of-Order Superscalars

This preview shows page 1-2-16-17-18-33-34 out of 34 pages.



CS 152 Computer Architecture and Engineering
Lecture 12 - Advanced Out-of-Order Superscalars

Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152

March 9, 2011 CS152, Spring 2011

Last time in Lecture 11
• Register renaming removes WAR, WAW hazards
• In-order fetch/decode, out-of-order execute, in-order commit gives high performance and precise exceptions
• Dynamic branch predictors can be quite accurate (>95%) and avoid most control hazards
• Branch History Tables (BHTs) just predict direction (later in pipeline)
  – Just need a few bits per entry (2 bits gives hysteresis)
  – Need to decode instruction bits to determine whether this is a branch and what the target address is
• Branch Target Buffers (BTBs) predict direction and target earlier in pipeline, but need bigger entries
• Return Address Stack predicts subroutine returns

Branch Mispredict Recovery
In-order execution machines:
  – Assume no instruction issued after branch can write-back before branch resolves
  – Kill all instructions in pipeline behind mispredicted branch
Out-of-order execution?
  – Multiple instructions following branch in program order can complete before branch resolves

In-Order Commit for Precise Exceptions
• Instructions fetched and decoded into instruction reorder buffer in-order
• Execution is out-of-order (⇒ out-of-order completion)
• Commit (write-back to architectural state, i.e., regfile & memory) is in-order
Temporary storage needed in ROB to hold results before commit
[Figure: pipeline of Fetch, Decode, Reorder Buffer, Execute, Commit. Fetch/decode in-order, execute out-of-order, commit in-order. On "Exception?" at commit: Kill signals to the earlier stages, inject handler PC.]

Branch Misprediction in Pipeline
• Can have multiple unresolved branches in ROB
• Can resolve branches out-of-order by killing all the instructions in ROB that follow a mispredicted branch
• Must also kill instructions in-flight in execution pipelines
[Figure: PC, Branch Prediction, Fetch, Decode, Reorder Buffer, Execute (Complete), Commit. Branch Resolution sends Kill signals to the earlier stages and injects the correct PC.]

Recovering ROB/Renaming Table
Take snapshot of register rename table at each predicted branch, recover earlier snapshot if branch mispredicted
[Figure: Rename Table (entries r1, r2, ... each with fields t, v) with Rename Snapshots; Register File; Reorder buffer with entries t1 ... tn and fields Ins#, use, exec, op, p1, src1, p2, src2, pd, dest, data; Ptr1 = next available, Ptr2 = next to commit, rollback of "next available" on mispredict; Load Unit, FUs, and Store Unit return <t, result>; Commit.]

"Data-in-ROB" Design (HP PA8000, Intel Pentium Pro, Core2 Duo & Nehalem)
• On dispatch into ROB, ready sources can be in regfile or in ROB dest (copied into src1/src2 if ready before dispatch)
• On completion, write to dest field and broadcast to src fields
• On issue, read from ROB src fields
Register File holds only committed state
[Figure: same Register File / Reorder buffer / functional unit datapath as above, without the rename snapshots.]

Data Movement in Data-in-ROB Design
[Figure: data paths among the Architectural Register File, Reorder Buffer, and Functional Units, labeled: read operands during decode; write sources after decode; read operands at issue; write results at completion; read results at commit; write results at commit.]

Unified Physical Register File (MIPS R10K, Alpha 21264, Intel Pentium 4 & Sandy Bridge)
• Rename all architectural registers into a single physical register file during decode, no register values read
• Functional units read and write from single unified register file holding committed and temporary registers in execute
• Commit only updates mapping of architectural register to physical register, no data movement
[Figure: Unified Physical Register File and Functional Units, labeled: read operands at issue; write results at completion; Decode Stage Register Mapping; Committed Register Mapping.]

Pipeline Design with Physical Regfile
[Figure: PC, Branch Prediction, Fetch, Decode & Rename (in-order); Reorder Buffer; Execute (out-of-order) with Branch Unit, ALU, MEM, Store Buffer, D$, and Physical Reg. File; Commit (in-order). Branch Resolution sends kill signals to the earlier stages.]

Lifetime of Physical Registers

  Original code          After rename
  ld  r1, (r3)           ld  P1, (Px)
  add r3, r1, #4         add P2, P1, #4
  sub r6, r7, r9         sub P3, Py, Pz
  add r3, r3, r6         add P4, P2, P3
  ld  r6, (r1)           ld  P5, (P1)
  add r6, r6, r3         add P6, P5, P4
  st  r6, (r1)           st  P6, (P1)
  ld  r6, (r11)          ld  P7, (Pw)

• Physical regfile holds committed and speculative values
• Physical registers decoupled from ROB entries (no data in ROB)
When can we reuse a physical register?
When next write of same architectural register commits

Physical Register Management
Program:
  ld  r1, 0(r3)
  add r3, r1, #4
  sub r6, r7, r6
  add r3, r3, r6
  ld  r6, 0(r1)
Initial state:
  Rename Table: R1 → P8, R3 → P7, R6 → P5, R7 → P6 (no mappings are shown for R0, R2, R4, R5)
  Physical Regs (P0 ... Pn): P5 = <R6>, P6 = <R7>, P7 = <R3>, P8 = <R1>, each marked p (value present)
  Free List: P0, P1, P3, P2, P4
  ROB entry fields: op, p1, PR1, p2, PR2, ex, use, Rd, PRd, LPRd
(LPRd requires third read port on Rename Table for each instruction)

Physical Register Management
After renaming ld r1, 0(r3):
  ROB entry: use = x, op = ld, p1 = p, PR1 = P7, Rd = r1, LPRd = P8, PRd = P0
  Rename Table: R1 → P0 (was P8)

Physical Register Management
After renaming add r3, r1, #4:
  ROB entry: use = x, op = add, PR1 = P0 (p1 not set), Rd = r3, LPRd = P7, PRd = P1
  Rename Table: R3 → P1 (was P7)

Physical Register Management
[Figure: same ROB fields, program, and free list (P0, P1, P3, P2, P4); the extracted text breaks off here.]
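
The renaming steps in the Physical Register Management slides can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the `Renamer` class, its method names, and the dictionary-based ROB entries are assumptions made for this example. The initial rename table, the free-list order, and the program are taken from the slides. The sketch tracks only the mapping (PRd, LPRd) and the free list; presence bits, issue, and execution are left out.

```python
from collections import deque


class Renamer:
    """Hypothetical model of rename-table and free-list bookkeeping."""

    def __init__(self, rename_table, free_list):
        self.rename_table = dict(rename_table)  # architectural -> physical
        self.free_list = deque(free_list)       # physical regs available for allocation
        self.rob = deque()                      # in-order list of renamed instructions

    def rename(self, op, dest, srcs):
        # Look up source mappings BEFORE updating the destination mapping,
        # so that "sub r6, r7, r6" reads the old r6.
        phys_srcs = [self.rename_table[s] for s in srcs]
        lprd = self.rename_table[dest]          # last physical reg mapped to dest
        if not self.free_list:
            raise RuntimeError("no free physical register: rename must stall")
        prd = self.free_list.popleft()          # newly allocated destination
        self.rename_table[dest] = prd
        entry = {"op": op, "srcs": phys_srcs, "Rd": dest, "PRd": prd, "LPRd": lprd}
        self.rob.append(entry)
        return entry

    def commit(self):
        # In-order commit: once this write of Rd commits, the register that
        # previously held Rd (LPRd) can never be read again, so free it.
        entry = self.rob.popleft()
        self.free_list.append(entry["LPRd"])
        return entry["LPRd"]


# Initial state and program from the slides (immediates and offsets omitted).
renamer = Renamer(
    rename_table={"r1": "P8", "r3": "P7", "r6": "P5", "r7": "P6"},
    free_list=["P0", "P1", "P3", "P2", "P4"],
)
program = [
    ("ld", "r1", ["r3"]),         # ld  r1, 0(r3)
    ("add", "r3", ["r1"]),        # add r3, r1, #4
    ("sub", "r6", ["r7", "r6"]),  # sub r6, r7, r6
    ("add", "r3", ["r3", "r6"]),  # add r3, r3, r6
    ("ld", "r6", ["r1"]),         # ld  r6, 0(r1)
]
entries = [renamer.rename(op, dest, srcs) for op, dest, srcs in program]
for e in entries:
    print(e["op"], "srcs:", e["srcs"], "PRd:", e["PRd"], "LPRd:", e["LPRd"])

freed = renamer.commit()  # commit the first ld
print("freed on commit:", freed)
```

The first two entries match the slide frames: the `ld` gets PRd = P0 with LPRd = P8, and the `add` gets PRd = P1 with LPRd = P7. Committing the `ld` returns P8 to the free list, which is the reuse rule stated on the Lifetime of Physical Registers slide. A real design would also need to restore the rename table and free list on a branch mispredict, which this sketch leaves out.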

