DOC PREVIEW
UMD CMSC 411 - Unit 3 – Instruction Pipelining

This preview shows page 1-2-3-4-5-6 out of 17 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CMSC 411 - A. Sussman (from D. O'Leary) 1Computer Systems ArchitectureCMSC 411Unit 3 – Instruction PipeliningAlan SussmanSeptember 21, 2004CMSC 411 - Alan Sussman 2So far• What we mean by computer performance• How to measure it• How instruction sets are designed• How the design influences performanceCMSC 411 - Alan Sussman 3What’s next• A variety of hardware and compiler techniques to speed the execution of programs– What is pipelining? (Section A.1)– How does MIPS divide instructions into stages or cycles? (A.1)– What kinds of overheads are there in pipelining? (A.1)– How much speedup do we get? (A.1)– What are structural hazards, data hazards, and control hazards? (A.2)– How are these techniques used to reduce stalls:• data forwarding? (A.2)• instruction reordering? (A.2)• compiler approaches to reduce branch delays? (A.2)CMSC 411 - Alan Sussman 4What is pipelining?• Pipelining is an implementation technique whereby multiple instructions are overlapped in execution• In other words, at any given moment in the execution of a computer program, many different instructions are at various stages of completion!• Example: Car washCMSC 411 - Alan Sussman 5Throughput• The number of instructions that completeper unit time• Instructions take many clock cyclesIdeally, every clock cycle, we want a new instruction to begin (and end)• This is how we will improve throughputCMSC 411 - Alan Sussman 6A MIPS implementation without pipelining• Recall from CMSC 311 that instructions execute in different stages or cycles– Instruction fetch cycle (IF): fetch the instruction from memory and update the program counter (PC) to point to the next instruction. Note: We’re not using the NPCregister that the book introduces.IR ← Mem[PC]PC ← PC + 4CMSC 411 - A. Sussman (from D. O'Leary) 2CMSC 411 - Alan Sussman 7MIPS w/o pipelining (cont.)– Instruction decode cycle (ID) : Put the operands in pipeline registers A and B. Sign-extend the low order 16 bits of the IR and store in pipeline register Imm. (This sometimes holds the "immediate" constant.)A ← Regs[IR6..10]B ← Regs[IR11..15]Imm ← ((IR16) 16##IR16..31)CMSC 411 - Alan Sussman 8MIPS w/o pipelining (cont.)– Execution cycle (EX) : Use the ALU– If memory reference:ALUOutput ← A + Imm– If register-register ALU instruction:ALUOutput ← A op B– If register-immediate ALU instruction:ALUOutput ← A op Imm– If branch instruction: compute the branch address and check the branch condition:ALUOutput ← PC + (Imm << 2)Cond ← (A op 0)(but PC or Imm should be adjusted down by 4 to make this work right).CMSC 411 - Alan Sussman 9MIPS w/o pipelining (cont.)– Memory access cycle (MEM) : finish loads, stores, and branches:Load: LMD ← Mem[ALUOutput]Store: Mem[ALUOutput] ← BBranch: if Cond then PC ← ALUOutputelse PC is okCMSC 411 - Alan Sussman 10MIPS w/o pipelining (cont.)– Write-back cycle (WB) : update the registersRegister-register ALU instruction: Regs[IR16..20] ← ALUOutputRegister-immediate ALU instruction: Regs[IR11..15] ← ALUOutputLoad instruction: Regs[IR11..15] ← LMDCMSC 411 - Alan Sussman 11Some notes• All instructions take 4-5 cycles• Some of the temporary registers could be eliminated, but they become convenient in a minute for pipelining• Could build the architecture so that these 5 cycles are one clock cycle, not 5• We assume that there are separate data paths for instruction memory and for data memory (so loads and stores don’t interfere with instruction fetch), usually implemented via separate cachesCMSC 411 - Alan Sussman 12A pipelined implementation of MIPSSuppose we have 5 load/store instructions to execute, all involving different registers and memory locations. Fig. A.1WBMEMEXIDIFi+4WBMEMEXIDIFi+3WBMEMEXIDIFi+2WBMEMEXIDIFi+1WBMEMEXIDIFi987654321Inst. #Clock cycleCMSC 411 - A. Sussman (from D. O'Leary) 3CMSC 411 - Alan Sussman 13With pipeline registersFigure A.3CMSC 411 - Alan Sussman 14Communication paths and timingFigure A.17CMSC 411 - Alan Sussman 15Ideal throughput from pipelining• Example 1: Suppose we have 100 load/store instructions to execute, and we arrange to have no register conflicts• Time for original MIPS implementation: 100 instructions × 5 cycles per instruction = 500 cycles• Time for pipelined MIPS implementation: 1st instruction takes 5 cycles. The others each finish 1 cycle later than the preceding one.Time = 5 + 99 = 104 cycles• Speedup = 500/104 ≈ 5• This is the “ideal case” - life is not that simpleCMSC 411 - Alan Sussman 16A more realistic case• Example 2: Suppose that in our original MIPS implementation, we could run the stages this fast:– IF - 10ns– ID - 8ns– EX - 7ns– MEM - 10ns– WB - 5ns• Then time for original MIPS implementation of 100 load/store instructions: – 100 instructions × 40ns per instruction = 4000 nsCMSC 411 - Alan Sussman 17Example 2 (cont.)• Time for pipelined MIPS implementation: We have to synchronize the stages, so we need to run the clock at 10 ns• 1st instruction takes 50 ns. The others each finish 1 cycle later than the preceding one.– Time = 50 ns + 99*10 ns = 1040 ns• Speedup = 4000/1040 ≈ 3.85CMSC 411 - Alan Sussman 18Even more realistic case• Example 3: The original MIPS implementation doesn't always need to use the MEM cycle– IF -10ns– ID - 8ns– EX - 7ns– MEM - 10ns– WB - 5ns• Suppose that only 30% of instructions use memory access. So, on average, for every 100 instructions, we have about 70 that use 4 stages and 30 that use 5.CMSC 411 - A. Sussman (from D. O'Leary) 4CMSC 411 - Alan Sussman 19Example 3 (cont.)• Time for original MIPS implementation:– 70 instructions × 30 ns per instruction +30 instructions × 40 ns per instruction =3300 ns• Time for pipelined MIPS implementation: We have to synchronize the stages, so we need to run the clock at 10 ns, and we need 5 cycles for everyinstruction.– 1st instruction takes 50 ns. The others each finish 1 cycle later than the preceding one– Time = 50 ns + 99*10 ns = 1040 ns• Speedup = 3300/1040 ≈ 3.17CMSC 411 - Alan Sussman 20Overhead of pipelining• We just summarized the two major overhead costs in pipelining:– making the time for every stage equal the time for the longest stage– making the time for every instruction equal the time for the longest instruction (not quite true, but true for a wide range of instructions)• Unfortunately, the speedup of pipelining is


View Full Document

UMD CMSC 411 - Unit 3 – Instruction Pipelining

Documents in this Course
Load more
Download Unit 3 – Instruction Pipelining
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Unit 3 – Instruction Pipelining and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Unit 3 – Instruction Pipelining 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?