Lecture 3: Pipelining Basics
• Biggest contributors to performance: clock speed, parallelism
• Today: basic pipelining implementation (Sections A.1-A.3)
• Reminders:
   Sign up for the class mailing list
   First assignment is on-line, due next Thursday
   TA office hours: Jim King, Wed 1-2pm, MEB 3423; Bharathan Rajaram, TBA
   Class notes

The Assembly Line
• Unpipelined: start and finish a job before moving to the next
• Pipelined: break the job into smaller stages (A, B, C) so that while one job occupies stage B, the next job can occupy stage A

Quantitative Effects
• As a result of pipelining:
   Time in ns per instruction goes up
   Number of cycles per instruction goes up (note the increase in clock speed)
   Total execution time goes down, resulting in lower time per instruction
   Average cycles per instruction increases slightly
   Under ideal conditions, speedup = ratio of elapsed times between successive instruction completions = number of pipeline stages = increase in clock speed

A 5-Stage Pipeline
• IF: use the PC to access the I-cache and increment the PC by 4
• ID: read registers, compare registers, compute the branch target; for now, assume branches take 2 cycles (there is enough work that branches can easily take more)
• EX: ALU computation, effective address computation for loads/stores
• MEM: memory access to/from the data cache; stores finish in 4 cycles
• WB: write the result of the ALU computation or load into the register file

Conflicts/Problems
• The I-cache and D-cache are accessed in the same cycle – it helps to implement them separately
• Registers are read and written in the same cycle – easy to deal with if register read/write time equals half the cycle time (else, use bypassing)
• The branch target changes only at the end of the second stage – what do you do in the meantime?
• Data between stages get latched into registers (overhead that increases latency per instruction)

Hazards
• Structural hazards: different instructions in different stages (or the same stage) conflict for the same resource
• Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction
• Control hazards: fetch cannot continue because it does not know the outcome of an earlier branch – a special case of a data hazard, kept in a separate category because it is treated in different ways

Structural Hazards
• Example: with a unified instruction and data cache, stage 4 (MEM) and stage 1 (IF) can never coincide
• The later instruction and all its successors are delayed until a cycle is found when the resource is free – these are pipeline bubbles
• Structural hazards are easy to eliminate – increase the number of resources (for example, implement separate instruction and data caches)

Data Hazards

Bypassing
• Some data hazard stalls can be eliminated: bypassing

Examples
   add R1, R2, R3
   lw  R4, 8(R1)

   lw  R1, 8(R2)
   lw  R4, 8(R1)

   lw  R1, 8(R2)
   sw  R1, 8(R3)

Summary
• For the 5-stage pipeline, bypassing can eliminate delays between the following example pairs of instructions:
   add/sub R1, R2, R3
   add/sub/lw/sw R4, R1, R5

   lw R1, 8(R2)
   sw R1, 4(R3)
• The following pairs of instructions will have intermediate stalls:
   lw R1, 8(R2)
   add/sub/lw R3, R1, R4  or  sw R3, 8(R1)

   fmul F1, F2, F3
   fadd F5, F1, F4
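The cycle counts behind the summary pairs can be sketched in a few lines. The following is not from the lecture: it is a minimal, hypothetical model of a 5-stage pipeline with full bypassing, where the only modeled stall is the load-use hazard (a load's value is ready after MEM, so an immediately following instruction that needs it in EX waits one cycle; store data is consumed in MEM, so it never stalls). The helpers `instr` and `cycles` are invented for illustration.

```python
def instr(op, dest=None, ex_srcs=(), mem_srcs=()):
    """ex_srcs: registers needed in EX (ALU/address inputs).
    mem_srcs: registers needed only in MEM (store data) -- the load
    result can be bypassed MEM-to-MEM, so these never cause a stall."""
    return {"op": op, "dest": dest,
            "ex_srcs": set(ex_srcs), "mem_srcs": set(mem_srcs)}

def cycles(program):
    """Total cycles on a 5-stage pipeline with full bypassing:
    4 fill cycles + 1 cycle per instruction + load-use bubbles."""
    total = len(program) + 4
    for prev, cur in zip(program, program[1:]):
        # load-use hazard: a load's dest needed in EX by the very
        # next instruction costs one bubble
        if prev["op"] == "lw" and prev["dest"] in cur["ex_srcs"]:
            total += 1
    return total

# add R1,R2,R3 ; lw R4,8(R1) -- ALU result bypassed into EX: no stall
p1 = [instr("add", "R1", ("R2", "R3")),
      instr("lw", "R4", ("R1",))]
# lw R1,8(R2) ; lw R4,8(R1) -- address depends on a load: one stall
p2 = [instr("lw", "R1", ("R2",)),
      instr("lw", "R4", ("R1",))]
# lw R1,8(R2) ; sw R1,8(R3) -- store data needed only in MEM: no stall
p3 = [instr("lw", "R1", ("R2",)),
      instr("sw", None, ("R3",), ("R1",))]

print(cycles(p1), cycles(p2), cycles(p3))  # 6 7 6
```

Running the three example pairs from the slides reproduces the summary: only the load followed by an EX-use of its result pays an extra cycle.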


U of U CS 6810 - Pipelining Basics
