DOC PREVIEW
Berkeley COMPSCI 61C - Lecture Notes

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS61C L31 Pipelined Execution, part II (1)Garcia, Fall 2004 © UCBLecturer PSOE Dan Garciawww.cs.berkeley.edu/~ddgarciainst.eecs.berkeley.edu/~cs61cCS61C : Machine StructuresLecture 31 – Pipelined Execution, part II2004-11-10The Incredibles!⇒Election Data isnow available…Purple America!www.princeton.edu/~rvdb/JAVA/election2004/www.usatoday.com/news/politicselections/vote2004/countymap.htmCS61C L31 Pipelined Execution, part II (2)Garcia, Fall 2004 © UCBReview: Pipeline (1/2)• Optimal Pipeline• Each stage is executing part of aninstruction each clock cycle.• One inst. finishes during each clock cycle.• On average, execute far more quickly.• What makes this work?• Similarities between instructions allow usto use same stages for all instructions(generally).• Each stage takes about the same amountof time as all others: little wasted time.CS61C L31 Pipelined Execution, part II (3)Garcia, Fall 2004 © UCBReview: Pipeline (2/2)• Pipelining is a BIG IDEA• widely used concept• What makes it less than perfect?• Structural hazards: suppose we hadonly one cache?⇒ Need more HW resources• Control hazards: need to worry aboutbranch instructions? ⇒ Delayed branch• Data hazards: an instruction dependson a previous instruction?CS61C L31 Pipelined Execution, part II (4)Garcia, Fall 2004 © UCBControl Hazard: Branching (1/7)Where do we do the compare for the branch? I$beqInstr 1Instr 2Instr 3Instr 4ALU I$Reg D$ RegALU I$Reg D$ RegALU I$Reg D$ RegALUReg D$ RegALU I$Reg D$ RegInstr.OrderTime (clock cycles)CS61C L31 Pipelined Execution, part II (5)Garcia, Fall 2004 © UCBControl Hazard: Branching (2/7)• We put branch decision-makinghardware in ALU stage• therefore two more instructions after thebranch will always be fetched, whetheror not the branch is taken• Desired functionality of a branch• if we do not take the branch, don’t wasteany time and continue executingnormally• if we take the branch, don’t execute anyinstructions after the branch, just go tothe desired labelCS61C L31 Pipelined Execution, part II (6)Garcia, Fall 2004 © UCBControl Hazard: Branching (3/7)• Initial Solution: Stall until decision ismade• insert “no-op” instructions: those thataccomplish nothing, just take time• Drawback: branches take 3 clock cycleseach (assuming comparator is put inALU stage)CS61C L31 Pipelined Execution, part II (7)Garcia, Fall 2004 © UCBControl Hazard: Branching (4/7)• Optimization #1:• move asynchronous comparator up toStage 2• as soon as instruction is decoded(Opcode identifies is as a branch),immediately make a decision and set thevalue of the PC (if necessary)• Benefit: since branch is complete inStage 2, only one unnecessaryinstruction is fetched, so only one no-opis needed• Side Note: This means that branches areidle in Stages 3, 4 and 5.CS61C L31 Pipelined Execution, part II (8)Garcia, Fall 2004 © UCB• Insert a single no-op (bubble)Control Hazard: Branching (5/7)addbeqlwALU I$Reg D$ RegALU I$Reg D$ RegALUReg D$ Reg I$Instr.OrderTime (clock cycles)bubble• Impact: 2 clock cycles per branchinstruction ⇒ slowCS61C L31 Pipelined Execution, part II (9)Garcia, Fall 2004 © UCBControl Hazard: Branching (6/7)• Optimization #2: Redefine branches• Old definition: if we take the branch,none of the instructions after the branchget executed by accident• New definition: whether or not we takethe branch, the single instructionimmediately following the branch getsexecuted (called the branch-delay slot)• The term “Delayed Branch” meanswe always execute inst after branchCS61C L31 Pipelined Execution, part II (10)Garcia, Fall 2004 © UCBControl Hazard: Branching (7/7)• Notes on Branch-Delay Slot• Worst-Case Scenario: can always put ano-op in the branch-delay slot• Better Case: can find an instructionpreceding the branch which can beplaced in the branch-delay slot withoutaffecting flow of the program- re-ordering instructions is a commonmethod of speeding up programs- compiler must be very smart in order tofind instructions to do this- usually can find such an instruction at least50% of the time- Jumps also have a delay slot…CS61C L31 Pipelined Execution, part II (11)Garcia, Fall 2004 © UCBExample: Nondelayed vs. Delayed Branchadd $1 ,$2,$3sub $4, $5,$6beq $1, $4, Exitor $8, $9 ,$10xor $10, $1,$11Nondelayed Branchadd $1 ,$2,$3sub $4, $5,$6beq $1, $4, Exitor $8, $9 ,$10xor $10, $1,$11Delayed BranchExit: Exit:CS61C L31 Pipelined Execution, part II (12)Garcia, Fall 2004 © UCBData Hazards (1/2)add $t0, $t1, $t2sub $t4, $t0 ,$t3and $t5, $t0 ,$t6or $t7, $t0 ,$t8xor $t9, $t0 ,$t10• Consider the following sequence ofinstructionsCS61C L31 Pipelined Execution, part II (13)Garcia, Fall 2004 © UCB Dependencies backwards in time are hazardsData Hazards (2/2)sub $t4,$t0,$t3ALUI$Reg D$ Regand $t5,$t0,$t6ALUI$Reg D$ Regor $t7,$t0,$t8I$ALUReg D$ Regxor $t9,$t0,$t10ALUI$Reg D$ Regadd $t0,$t1,$t2IF ID/RF EX MEM WBALUI$Reg D$RegInstr.OrderTime (clock cycles)CS61C L31 Pipelined Execution, part II (14)Garcia, Fall 2004 © UCB• Forward result from one stage to anotherData Hazard Solution: Forwardingsub $t4,$t0,$t3ALUI$Reg D$ Regand $t5,$t0,$t6ALUI$Reg D$ Regor $t7,$t0,$t8I$ALUReg D$ Regxor $t9,$t0,$t10ALUI$Reg D$ Regadd $t0,$t1,$t2IF ID/RF EX MEM WBALUI$Reg D$Reg “or” hazard solved by register hardwareCS61C L31 Pipelined Execution, part II (15)Garcia, Fall 2004 © UCB• Dependencies backwards in time arehazardsData Hazard: Loads (1/4)sub $t3,$t0,$t2ALUI$Reg D$ Reglw $t0,0($t1)IF ID/RF EX MEM WBALUI$Reg D$Reg• Can’t solve with forwarding• Must stall instruction dependent onload, then forward (more hardware)CS61C L31 Pipelined Execution, part II (16)Garcia, Fall 2004 © UCB• Hardware must stall pipeline• Called “interlock”Data Hazard: Loads (2/4)sub $t3,$t0,$t2ALUI$Reg D$ Regbubbleand $t5,$t0,$t4ALUI$Reg D$ Regbubbleor $t7,$t0,$t6I$ALUReg D$bubblelw $t0, 0($t1)IF ID/RF EX MEM WBALUI$Reg D$RegCS61C L31 Pipelined Execution, part II (17)Garcia, Fall 2004 © UCBData Hazard: Loads (3/4)• Instruction slot after a load is called“load delay slot”• If that instruction uses the result of theload, then the hardware interlock willstall it for one cycle.• If the compiler puts an unrelatedinstruction in that slot, then no stall• Letting the hardware stall theinstruction in the delay slot isequivalent to putting a nop in the slot(except the latter uses more code space)CS61C L31 Pipelined


View Full Document

Berkeley COMPSCI 61C - Lecture Notes

Documents in this Course
SIMD II

SIMD II

8 pages

Midterm

Midterm

7 pages

Lecture 7

Lecture 7

31 pages

Caches

Caches

7 pages

Lecture 9

Lecture 9

24 pages

Lecture 1

Lecture 1

28 pages

Lecture 2

Lecture 2

25 pages

VM II

VM II

4 pages

Midterm

Midterm

10 pages

Load more
Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?