Berkeley COMPSCI 152 - Branch Prediction, Explicit Renaming, ILP

This preview shows page 1-2-3-4-5 out of 16 pages.



CS152 Computer Architecture and Engineering
Lecture 17: Branch Prediction, Explicit Renaming, ILP
April 5, 2004
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://inst.eecs.berkeley.edu/~cs152/
4/05/04 ©UCB Spring 2004, CS152 / Kubiatowicz

Lec17.2  Review: Tomasulo Organization
[Figure: Tomasulo organization. Labeled blocks: FP Op Queue, FP Registers, Reservation Stations (Add1-Add3 at the FP adders, Mult1-Mult2 at the FP multipliers), Load Buffers (Load1-Load6, from memory), Store Buffers (to memory), Common Data Bus (CDB).]

Lec17.3  Review: Three Stages of Tomasulo Algorithm
1. Issue: get instruction from FP Op Queue
   If reservation station free (no structural hazard), control issues instr & sends operands (renames registers).
2. Execution: operate on operands (EX)
   When both operands ready then execute; if not ready, watch Common Data Bus for result.
3. Write result: finish execution (WB)
   Write on Common Data Bus to all awaiting units; mark reservation station available.
° Normal data bus: data + destination (“go to” bus)
° Common data bus: data + source (“come from” bus)
   • 64 bits of data + 4 bits of Functional Unit source address
   • Write if matches expected Functional Unit (produces result)
   • Does the broadcast

Lec17.4  Review: Tomasulo Architecture
° Reservation stations:
   • renaming to larger set of registers + buffering source operands
   • Prevents registers as bottleneck
   • Avoids WAR, WAW hazards of Scoreboard
° Not limited to basic blocks:
   • integer unit gets ahead, beyond branches
° Dynamic Scheduling:
   • Scoreboarding/Tomasulo
   • In-order issue, out-of-order execution, out-of-order commit
° Tomasulo can unroll loops dynamically in hardware!
   • Need: renaming (different physical names for different iterations)
   • Fast branch computation

Lec17.5  Review: Tomasulo With Reorder buffer (ROB)
[Figure: Tomasulo datapath with a reorder buffer (entries ROB1-ROB7, oldest to newest; columns Dest, value, instruction, Done?). Instructions shown, newest first: ST 0(R3),F0; ADDD F0,F4,F6; LD F4,0(R3); BNE F2,<…>; DIVD F2,F10,F6; ADDD F10,F4,F0; LD F0,10(R2). Reservation stations hold "2 ADDD R(F4),ROB1" and "3 DIVD ROB2,R(F6)"; the load buffer holds "1 10+R2".]

Lec17.6  Review: Four Steps of Speculative Tomasulo Algorithm
1. Issue: get instruction from FP Op Queue
   • If reservation station and reorder buffer slot free, issue instr & send operands & reorder buffer no. for destination (this stage sometimes called “dispatch”)
2. Execution: operate on operands (EX)
   • When both operands ready then execute; if not ready, watch CDB for result; when both in reservation station, execute; checks RAW (sometimes called “issue”)
3. Write result: finish execution (WB)
   • Write on Common Data Bus to all awaiting FUs & reorder buffer
4. Commit: update register with reorder buffer (ROB) result
   • When instr. at head of reorder buffer & result present, update register with result (or store to memory) and remove instr from reorder buffer
   • Stores only commit to memory when they reach head of ROB
   • Values only overwrite registers when they reach head
   • Mispredicted branch or interrupt flushes reorder buffer
NOTES:
   • In-order issue, out-of-order execution, in-order commit
   • Can always throw out contents of reorder buffer (must cancel running ops)
   • Precise exception point is instruction at head of buffer

Lec17.7  Tomasulo With Reorder buffer: Memory Disambiguation
[Figure: same datapath and reorder buffer as Lec17.5, with the newest entry now ST 0(R3),F4 and the annotation "What about memory hazards???"]

Lec17.8  Memory Disambiguation: Handling RAW Hazards in memory
° Question: Given a load that follows a store in program order, are the two related?
   • (Alternatively: is there a RAW hazard between the store and the load?)
     Eg: st 0(R2),R5
         ld R6,0(R3)
° Can we go ahead and start the load early?
   • Store address could be delayed for a long time by some calculation that leads to R2 (divide?).
   • We might want to issue/begin execution of both operations in same cycle.
° Two techniques:
   • No Speculation: we are not allowed to start load until we know for sure that address 0(R2) ≠ 0(R3)
   • Speculation: We might guess at whether or not they are dependent (called “dependence speculation”) and use reorder buffer to fix up if we are wrong.

Lec17.9  Hardware Support for Memory Disambiguation
° Need buffer to keep track of all outstanding stores to memory, in program order.
   • Keep track of address (when it becomes available) and value (when it becomes available)
   • FIFO ordering: will retire stores from this buffer in program order
° When issuing a load, record current head of store queue (know which stores are ahead of you).
° When have address for load, check store queue:
   • If any store prior to load is waiting for its address:
      - If not speculating, stall load
      - If speculating, send request to memory (predict no dependence)
   • If load address matches earlier store address (associative lookup), then we have a memory-induced RAW hazard:
      - store value available ⇒ return value
      - store value not available ⇒ return ROB number of source
   • Otherwise, send out request to memory
° Actual stores commit in order, so no worry about WAR/WAW hazards through memory.

Lec17.10  Memory Disambiguation:
[Figure: Tomasulo datapath with reorder buffer (ROB1-ROB7; columns Dest, value, instruction, Done?). Instructions shown, newest first: LD F4,10(R3); ST 10(R3),F5; LD F0,32(R2); ST 0(R3),F4. Buffer labels: "2 32+R2" and "4 ROB3".]
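
The store-queue check described on Lec17.9 can be sketched in software. This is a minimal illustration, not the lecture's hardware design: the names `StoreEntry`, `check_load`, `n_prior`, and `value_src_rob` are hypothetical, and two details are assumptions the slide leaves open: the youngest matching store is the one that forwards, and when speculating, a store with an unknown address is predicted independent while known matches are still forwarded.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StoreEntry:
    """One outstanding store, kept in program order (hypothetical layout)."""
    rob: int                              # ROB number of the store itself
    addr: Optional[int] = None            # None until the address is computed
    value: Optional[int] = None           # None until the store data is ready
    value_src_rob: Optional[int] = None   # ROB entry that will produce the data


def check_load(store_queue: List[StoreEntry], n_prior: int,
               load_addr: int, speculate: bool) -> Tuple[str, Optional[int]]:
    """Decide what a load does once its address is known.

    store_queue is oldest-first; n_prior is the number of stores that were
    ahead of the load when it issued (the recorded queue position).
    """
    prior = store_queue[:n_prior]

    # Some earlier store is still waiting for its address.
    if any(st.addr is None for st in prior) and not speculate:
        return ("stall", None)

    # Associative lookup; assumption: scan youngest-first so the most
    # recent matching store supplies the data.
    for st in reversed(prior):
        if st.addr == load_addr:          # an unknown (None) address never matches
            if st.value is not None:
                return ("forward", st.value)        # store value available
            return ("wait_rob", st.value_src_rob)   # return ROB number of source

    # No known dependence (or predicted none): go to memory.
    return ("memory", load_addr)
```

With a queue holding a store to address 100 (value ready), a store whose address is unknown, and a store to 200 still waiting on ROB entry 5, a non-speculating load from 100 stalls, while a speculating one is forwarded the first store's value. If the speculation turns out wrong, the reorder buffer is what allows the load and its dependents to be squashed; that recovery path is not modeled here.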

