UMD CMSC 411 - Lecture 4 Basic Pipelining


CMSC 411 Computer Systems Architecture
Lecture 4: Basic Pipelining
Alan [email protected]

Administrivia
• Homework problems for Unit 1 due Thursday

CMSC 411 - 4 (from Patterson)

5 Steps of MIPS Datapath (Figure A.17, Page A-29)
[Figure: unpipelined datapath with the stages Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back]
IR <= mem[PC]; PC <= PC + 4
Reg[IR_rd] <= Reg[IR_rs] op_IRop Reg[IR_rt]

5 Steps of MIPS Datapath (Figure A.18, Page A-31)
[Figure: the same five stages separated by the pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB]
IR <= mem[PC]; PC <= PC + 4
A <= Reg[IR_rs]; B <= Reg[IR_rt]
rslt <= A op_IRop B
WB <= rslt
Reg[IR_rd] <= WB

Inst. Set Processor Controller
[Figure: controller state diagram]
Ifetch: IR <= mem[PC]; PC <= PC + 4
opFetch-DCD: A <= Reg[IR_rs]; B <= Reg[IR_rt]
br: if bop(A,b) PC <= PC + IR_im
jmp: PC <= IR_jaddr
RR: r <= A op_IRop B; WB <= r; Reg[IR_rd] <= WB
RI: r <= A op_IRop IR_im; WB <= r; Reg[IR_rd] <= WB
LD: r <= A + IR_im; WB <= Mem[r]; Reg[IR_rd] <= WB

Visualizing Pipelining (Figure A.2, Page A-8)
[Figure: instructions in program order against time (clock cycles 1 to 7); each instruction passes through Ifetch, Reg, ALU, DMem, Reg, starting one cycle after the previous one]

Pipelining is not quite that easy!
• Limits to pipelining: Hazards prevent next instruction from executing during its designated clock cycle
– Structural hazards: HW cannot support this combination of instructions (single person to fold and put clothes away)
– Data hazards: Instruction depends on result of prior instruction still in the pipeline (missing sock)
– Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).

One Memory Port/Structural Hazards (Figure A.4, Page A-14)
[Figure: Load, Instr 1, Instr 2, Instr 3, Instr 4 in program order against clock cycles 1 to 7; each passes through Ifetch, Reg, ALU, DMem, Reg]

One Memory Port/Structural Hazards (Similar to Figure A.5, Page A-15)
[Figure: Load, Instr 1, Instr 2, Stall (a row of bubbles), Instr 3 in program order against clock cycles 1 to 7]
How do you “bubble” the pipe?

Speed Up Equation for Pipelining

$$\text{CPI}_{\text{pipelined}} = \text{Ideal CPI} + \text{Average stall cycles per instruction}$$

$$\text{Speedup} = \frac{\text{Ideal CPI} \times \text{Pipeline depth}}{\text{Ideal CPI} + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$

For simple RISC pipeline, ideal CPI = 1:

$$\text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$

Example: Dual-port vs. Single-port
• Machine A: Dual ported memory (“Harvard Architecture”)
• Machine B: Single ported memory, but its pipelined implementation has a 1.05 times faster clock rate
• Ideal CPI = 1 for both
• Loads are 40% of instructions executed
SpeedUpA = Pipeline Depth/(1 + 0) x (clock_unpipe/clock_pipe)
         = Pipeline Depth
SpeedUpB = Pipeline Depth/(1 + 0.4 x 1) x (clock_unpipe/(clock_unpipe/1.05))
         = (Pipeline Depth/1.4) x 1.05
         = 0.75 x Pipeline Depth
SpeedUpA/SpeedUpB = Pipeline Depth/(0.75 x Pipeline Depth) = 1.33
• Machine A is 1.33 times faster

Data Hazard on R1 (Figure A.6, Page A-16)
[Figure: the instructions below in program order against time (clock cycles), stages IF, ID/RF, EX, MEM, WB]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11

Three Generic Data Hazards
• Read After Write (RAW)
InstrJ tries to read operand before InstrI writes it
I: add r1,r2,r3
J: sub r4,r1,r3
• Caused by a “Dependence” (in compiler nomenclature). This hazard results from an actual need for communication.

Three Generic Data Hazards
• Write After Read (WAR)
InstrJ writes operand before InstrI reads it
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
• Called an “anti-dependence” by compiler writers. This results from reuse of the name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
– All instructions take 5 stages, and
– Reads are always in stage 2, and
– Writes are always in stage 5

Three Generic Data Hazards
• Write After Write (WAW)
InstrJ writes operand before InstrI writes it.
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7
• Called an “output dependence” by compiler writers. This also results from the reuse of name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
– All instructions take 5 stages, and
– Writes are always in stage 5
• Will see WAR and WAW in more complicated pipes

Forwarding to Avoid Data Hazard (Figure A.7, Page A-18)
[Figure: the instructions below in program order against time (clock cycles); each passes through Ifetch, Reg, ALU, DMem, Reg]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11

HW Change for Forwarding (Figure A.23, Page A-37)
[Figure: ALU with input muxes, fed from Registers, Immediate, and the ID/EX, EX/MEM, and MEM/WR pipeline registers, with Data Memory and NextPC]
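
The speedup equation and the three hazard definitions from this lecture can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names `pipeline_speedup` and `classify_hazard`, the `(dest, src1, src2)` tuple encoding of an instruction, and the pipeline depth of 5 are all assumptions chosen for illustration. `classify_hazard` only reports register-name dependences between two instructions; whether a dependence becomes a real hazard depends on the pipeline (the slides note that WAR and WAW cannot occur in the MIPS 5 stage pipeline).

```python
def pipeline_speedup(depth, stall_cpi, clock_ratio=1.0, ideal_cpi=1.0):
    """Speedup of a pipelined machine over the unpipelined one.

    clock_ratio is Cycle Time_unpipelined / Cycle Time_pipelined.
    """
    return (ideal_cpi * depth) / (ideal_cpi + stall_cpi) * clock_ratio


def classify_hazard(instr_i, instr_j):
    """Return the potential data hazards between I (earlier) and J (later).

    Each instruction is a hypothetical (dest, src1, src2) tuple of
    register names, e.g. ("r1", "r2", "r3") for add r1,r2,r3.
    """
    dest_i, *srcs_i = instr_i
    dest_j, *srcs_j = instr_j
    hazards = set()
    if dest_i in srcs_j:   # J reads what I writes
        hazards.add("RAW")
    if dest_j in srcs_i:   # J writes what I reads
        hazards.add("WAR")
    if dest_i == dest_j:   # both write the same register
        hazards.add("WAW")
    return hazards


# Dual-port vs. single-port example; depth = 5 is an assumed value,
# and it cancels out of the ratio.
depth = 5
speedup_a = pipeline_speedup(depth, stall_cpi=0.0)
# Machine B: loads are 40% of instructions, 1 stall cycle each, 1.05x clock.
speedup_b = pipeline_speedup(depth, stall_cpi=0.4 * 1, clock_ratio=1.05)
print(round(speedup_a / speedup_b, 2))  # 1.33

# The instruction pairs used on the hazard slides.
print(classify_hazard(("r1", "r2", "r3"), ("r4", "r1", "r3")))  # {'RAW'}
print(classify_hazard(("r4", "r1", "r3"), ("r1", "r2", "r3")))  # {'WAR'}
print(classify_hazard(("r1", "r4", "r3"), ("r1", "r2", "r3")))  # {'WAW'}
```

Because the pipeline depth appears in both speedups, the 1.33 ratio holds for any depth, which matches the symbolic result on the slide.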

