Berkeley COMPSCI 61C - Lecture Notes - D2447174

Home> Schools> University of California, Berkeley> Computer Science (COMPSCI) > COMPSCI 61C> Lecture Notes

DOC PREVIEW

Berkeley COMPSCI 61C - Lecture Notes

School name University of California, Berkeley

Course Compsci 61c- Machine Structures

Pages 4

This preview shows page 1 out of 4 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 4 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 4 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CS61C L30 Pipelined Exec ution, part II (1) Garcia 2005 © U CBLecturer PSOE Dan Garciawww.cs.berkeley.edu/~ddgarciainst.eecs.berkeley.edu/~cs61cCS61C : Machine StructuresLecture 30 – Pipelined Execution, part IIwww.technologyreview.com/articles/05/04/wo/wo_040605krotoski.aspGames to learn?! ⇒Recent studies show thatthere may be a place for computergames in traditional K-12 classrooms.There is data that shows dropout ratesare lower, SATs, enjoyment, interest up!CS61C L30 Pipelined Exec ution, part II (2) Garcia 2005 © U CBReview: Pipeline (1/2)• Optimal Pipeline• Each stage is executing part of aninstruction each clock cycle.• One inst. finishes during each clock cycle.• On average, execute far more quickly.• What makes this work?• Similarities between instructions allow usto use same stages for all instructions(generally).• Each stage takes about the same amountof time as all others: little wasted time.CS61C L30 Pipelined Exec ution, part II (3) Garcia 2005 © U CBReview: Pipeline (2/2)• Pipelining is a BIG IDEA• widely used concept• What makes it less than perfect?• Structural hazards: suppose we hadonly one cache?⇒ Need more HW resources• Control hazards: need to worry aboutbranch instructions? ⇒ Delayed branch• Data hazards: an instruction dependson a previous instruction?CS61C L30 Pipelined Exec ution, part II (4) Garcia 2005 © U CBControl Hazard: Branching (1/7)Where do we do the compare for the branch? I$beqInstr 1Instr 2Instr 3Instr 4ALU I$Reg D$ RegALU I$Reg D$ RegALU I$Reg D$ RegALUReg D$ RegALU I$Reg D$ RegInstr.OrderTime (clock cycles)CS61C L30 Pipelined Exec ution, part II (5) Garcia 2005 © U CBControl Hazard: Branching (2/7)• We put branch decision-makinghardware in ALU stage• therefore two more instructions after thebranch will always be fetched, whetheror not the branch is taken• Desired functionality of a branch• if we do not take the branch, don’t wasteany time and continue executingnormally• if we take the branch, don’t execute anyinstructions after the branch, just go tothe desired labelCS61C L30 Pipelined Exec ution, part II (6) Garcia 2005 © U CBControl Hazard: Branching (3/7)• Initial Solution: Stall until decision ismade• insert “no-op” instructions: those thataccomplish nothing, just take time• Drawback: branches take 3 clock cycleseach (assuming comparator is put inALU stage)CS61C L30 Pipelined Exec ution, part II (7) Garcia 2005 © U CBControl Hazard: Branching (4/7)• Optimization #1:• move asynchronous comparator up toStage 2• as soon as instruction is decoded(Opcode identifies is as a branch),immediately make a decision and set thevalue of the PC (if necessary)• Benefit: since branch is complete inStage 2, only one unnecessaryinstruction is fetched, so only one no-opis needed• Side Note: This means that branches areidle in Stages 3, 4 and 5.CS61C L30 Pipelined Exec ution, part II (8) Garcia 2005 © U CB• Insert a single no-op (bubble)Control Hazard: Branching (5/7)addbeqlwALU I$Reg D$ RegALU I$Reg D$ RegALUReg D$ Reg I$Instr.OrderTime (clock cycles)bubble• Impact: 2 clock cycles per branchinstruction ⇒ slowCS61C L30 Pipelined Exec ution, part II (9) Garcia 2005 © U CBControl Hazard: Branching (6/7)• Optimization #2: Redefine branches• Old definition: if we take the branch,none of the instructions after the branchget executed by accident• New definition: whether or not we takethe branch, the single instructionimmediately following the branch getsexecuted (called the branch-delay slot)• The term “Delayed Branch” meanswe always execute inst after branchCS61C L30 Pipelined Exec ution, part II (10) Garcia 2005 © U CBControl Hazard: Branching (7/7)• Notes on Branch-Delay Slot• Worst-Case Scenario: can always put ano-op in the branch-delay slot• Better Case: can find an instructionpreceding the branch which can beplaced in the branch-delay slot withoutaffecting flow of the program- re-ordering instructions is a commonmethod of speeding up programs- compiler must be very smart in order tofind instructions to do this- usually can find such an instruction at least50% of the time- Jumps also have a delay slot…CS61C L30 Pipelined Exec ution, part II (11) Garcia 2005 © U CBExample: Nondelayed vs. Delayed Branchadd $1 ,$2,$3sub $4, $5,$6beq $1, $4, Exitor $8, $9 ,$10xor $10, $1,$11Nondelayed Branchadd $1 ,$2,$3sub $4, $5,$6beq $1, $4, Exitor $8, $9 ,$10xor $10, $1,$11Delayed BranchExit: Exit:CS61C L30 Pipelined Exec ution, part II (12) Garcia 2005 © U CBData Hazards (1/2)add $t0, $t1, $t2sub $t4, $t0 ,$t3and $t5, $t0 ,$t6or $t7, $t0 ,$t8xor $t9, $t0 ,$t10• Consider the following sequence ofinstructionsCS61C L30 Pipelined Exec ution, part II (13) Garcia 2005 © U CB Dependencies backwards in time are hazardsData Hazards (2/2)sub $t4,$t0,$t3ALUI$ Reg D$ Regand $t5,$t0,$t6ALUI$Reg D$ Regor $t7,$t0,$t8I$ALUReg D$ Regxor $t9,$t0,$t10ALUI$Reg D$ Regadd $t0,$t1,$t2IF ID/RF EX MEM WBALUI$Reg D$RegInstr.OrderTime (clock cycles)CS61C L30 Pipelined Exec ution, part II (14) Garcia 2005 © U CB• Forward result from one stage to anotherData Hazard Solution: Forwardingsub $t4,$t0,$t3ALUI$Reg D$ Regand $t5,$t0,$t6ALUI$Reg D$ Regor $t7,$t0,$t8I$ALUReg D$ Regxor $t9,$t0,$t10ALUI$Reg D$ Regadd $t0,$t1,$t2IF ID/RF EX MEM WBALUI$Reg D$Reg “or” hazard solved by register hardwareCS61C L30 Pipelined Exec ution, part II (15) Garcia 2005 © U CB• Dependencies backwards in time arehazardsData Hazard: Loads (1/4)sub $t3,$t0,$t2ALUI$Reg D$ Reglw $t0,0($t1)IF ID/RF EX MEM WBALUI$Reg D$Reg• Can’t solve with forwarding• Must stall instruction dependent onload, then forward (more hardware)CS61C L30 Pipelined Exec ution, part II (16) Garcia 2005 © U CB• Hardware must stall pipeline• Called “interlock”Data Hazard: Loads (2/4)sub $t3,$t0,$t2ALUI$Reg D$ Regbubbleand $t5,$t0,$t4ALUI$Reg D$ Regbubbleor $t7,$t0,$t6I$ALUReg D$bubblelw $t0, 0($t1)IF ID/RF EX MEM WBALUI$Reg D$RegCS61C L30 Pipelined Exec ution, part II (17) Garcia 2005 © U CBData Hazard: Loads (3/4)• Instruction slot after a load is called“load delay slot”• If that instruction uses the result of theload, then the hardware interlock willstall it for one cycle.• If the compiler puts an unrelatedinstruction in that slot, then no stall• Letting the hardware stall theinstruction in the delay slot isequivalent to putting a nop in the slot(except the latter

View Full Document

Berkeley COMPSCI 61C - Lecture Notes

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

This preview shows page 1 out of 4 pages.

Berkeley COMPSCI 61C - Lecture Notes

Sign up for free to view:

Please select your school