U of U CS 6810 - Lecture Notes - D2744452

Home> Schools> University of Utah> Computer Science (CS) > CS 6810> Lecture Notes

U of U CS 6810 - Lecture Notes

Pages 19

Download Save

Unformatted text preview:

1 Lecture 3: Pipelining Basics • Biggest contributors to performance: clock speed, parallelism • Today: basic pipelining implementation (Sections A.1-A.3) • Reminders:  Sign up for the class mailing list  First assignment is on-line, due next Thursday  TA office hours: Jim King, Wed 1-2pm, MEB 3423; Bharathan Rajaram, TBA  Class notes2 The Assembly Line A Start and finish a job before moving to the next Time Jobs Break the job into smaller stages B C A B C A B C A B C Unpipelined Pipelined3 Quantitative Effects • As a result of pipelining:  Time in ns per instruction goes up  Number of cycles per instruction goes up (note the increase in clock speed)  Total execution time goes down, resulting in lower time per instruction  Average cycles per instruction increases slightly  Under ideal conditions, speedup = ratio of elapsed times between successive instruction completions = number of pipeline stages = increase in clock speed4 A 5-Stage Pipeline5 A 5-Stage Pipeline Use the PC to access the I-cache and increment PC by 46 A 5-Stage Pipeline Read registers, compare registers, compute branch target; for now, assume branches take 2 cyc (there is enough work that branches can easily take more)7 A 5-Stage Pipeline ALU computation, effective address computation for load/store8 A 5-Stage Pipeline Memory access to/from data cache, stores finish in 4 cycles9 A 5-Stage Pipeline Write result of ALU computation or load into register file10 Conflicts/Problems • I-cache and D-cache are accessed in the same cycle – it helps to implement them separately • Registers are read and written in the same cycle – easy to deal with if register read/write time equals cycle time/2 (else, use bypassing) • Branch target changes only at the end of the second stage -- what do you do in the meantime? • Data between stages get latched into registers (overhead that increases latency per instruction)11 Hazards • Structural hazards: different instructions in different stages (or the same stage) conflicting for the same resource • Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction • Control hazard: fetch cannot continue because it does not know the outcome of an earlier branch – special case of a data hazard – separate category because they are treated in different ways12 Structural Hazards • Example: a unified instruction and data cache  stage 4 (MEM) and stage 1 (IF) can never coincide • The later instruction and all its successors are delayed until a cycle is found when the resource is free  these are pipeline bubbles • Structural hazards are easy to eliminate – increase the number of resources (for example, implement a separate instruction and data cache)13 Data Hazards14 Bypassing • Some data hazard stalls can be eliminated: bypassing15 Example add R1, R2, R3 lw R4, 8(R1)16 Example lw R1, 8(R2) lw R4, 8(R1)17 Example lw R1, 8(R2) sw R1, 8(R3)18 Summary • For the 5-stage pipeline, bypassing can eliminate delays between the following example pairs of instructions: add/sub R1, R2, R3 add/sub/lw/sw R4, R1, R5 lw R1, 8(R2) sw R1, 4(R3) • The following pairs of instructions will have intermediate stalls: lw R1, 8(R2) add/sub/lw R3, R1, R4 or sw R3, 8(R1) fmul F1, F2, F3 fadd F5, F1, F419 Title •

View Full Document

U of U CS 6810 - Lecture Notes

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

U of U CS 6810 - Lecture Notes

Sign up for free to view:

Please select your school