UMD CMSC 411 - Lecture 4 Basic Pipelining

School name University of Maryland, College Park

Course Cmsc 411- Computer Systems Architecture

Pages 5

This preview shows page 1-2 out of 5 pages.

CMSC 411 Computer Systems Architecture
Lecture 4
Basic Pipelining
Alan [email protected]
CMSC 411 - 4 (from Patterson)

Administrivia
• Homework problems for Unit 1 due Thursday

5 Steps of MIPS Datapath
Figure A.17, Page A-29
[Figure: datapath divided into five steps: Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. Labeled components: Next PC, Next SEQ PC, Adder, MUX, Memory, Address, Inst, Reg File, RS1, RS2, RD, Sign Extend, Imm, ALU, Zero?, Data Memory, LMD, WB Data.]
IR <= mem[PC]; PC <= PC + 4
Reg[IR_rd] <= Reg[IR_rs] op_IRop Reg[IR_rt]

5 Steps of MIPS Datapath
Figure A.18, Page A-31
[Figure: the same five-step datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the steps; RD is carried forward through the pipeline registers.]
IR <= mem[PC]; PC <= PC + 4
A <= Reg[IR_rs]; B <= Reg[IR_rt]
rslt <= A op_IRop B
WB <= rslt
Reg[IR_rd] <= WB

Inst. Set Processor Controller
[Figure: controller state diagram.]
• Ifetch: IR <= mem[PC]; PC <= PC + 4
• opFetch-DCD: A <= Reg[IR_rs]; B <= Reg[IR_rt]
• br: if bop(A,b) PC <= PC + IR_im
• jmp: PC <= IR_jaddr
• RR: r <= A op_IRop B; WB <= r; Reg[IR_rd] <= WB
• RI: r <= A op_IRop IR_im; WB <= r; Reg[IR_rd] <= WB
• LD: r <= A + IR_im; WB <= Mem[r]; Reg[IR_rd] <= WB

Visualizing Pipelining
Figure A.2, Page A-8
[Figure: time in clock cycles (Cycle 1 to Cycle 7) on the horizontal axis, instruction order on the vertical axis. Four instructions each pass through Ifetch, Reg, ALU, DMem, Reg, with each instruction starting one cycle after the previous one.]

Pipelining is not quite that easy!
• Limits to pipelining: Hazards prevent next instruction from executing during its designated clock cycle
– Structural hazards: HW cannot support this combination of instructions (single person to fold and put clothes away)
– Data hazards: Instruction depends on result of prior instruction still in the pipeline (missing sock)
– Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).

One Memory Port/Structural Hazards
Figure A.4, Page A-14
[Figure: time in clock cycles (Cycle 1 to Cycle 7) against instruction order: Load, Instr 1, Instr 2, Instr 3, Instr 4. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]

One Memory Port/Structural Hazards
(Similar to Figure A.5, Page A-15)
[Figure: time in clock cycles (Cycle 1 to Cycle 7) against instruction order: Load, Instr 1, Instr 2, Stall, Instr 3. The Stall row is a row of five bubbles; Instr 3 starts one cycle later than it otherwise would.]
How do you “bubble” the pipe?

Speed Up Equation for Pipelining

$$\text{CPI}_{\text{pipelined}} = \text{Ideal CPI} + \text{Average stall cycles per instruction}$$

$$\text{Speedup} = \frac{\text{Ideal CPI} \times \text{Pipeline depth}}{\text{Ideal CPI} + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$

For simple RISC pipeline, ideal CPI = 1:

$$\text{Speedup} = \frac{\text{Pipeline depth}}{1 + \text{Pipeline stall CPI}} \times \frac{\text{Cycle Time}_{\text{unpipelined}}}{\text{Cycle Time}_{\text{pipelined}}}$$

Example: Dual-port vs. Single-port
• Machine A: Dual ported memory (“Harvard Architecture”)
• Machine B: Single ported memory, but its pipelined implementation has a 1.05 times faster clock rate
• Ideal CPI = 1 for both
• Loads are 40% of instructions executed

$$\text{SpeedUp}_A = \frac{\text{Pipeline Depth}}{1 + 0} \times \frac{\text{clock}_{\text{unpipe}}}{\text{clock}_{\text{pipe}}} = \text{Pipeline Depth}$$

$$\text{SpeedUp}_B = \frac{\text{Pipeline Depth}}{1 + 0.4 \times 1} \times \frac{\text{clock}_{\text{unpipe}}}{\text{clock}_{\text{unpipe}} / 1.05} = \frac{\text{Pipeline Depth}}{1.4} \times 1.05 = 0.75 \times \text{Pipeline Depth}$$

$$\frac{\text{SpeedUp}_A}{\text{SpeedUp}_B} = \frac{\text{Pipeline Depth}}{0.75 \times \text{Pipeline Depth}} = 1.33$$

• Machine A is 1.33 times faster

Data Hazard on R1
Figure A.6, Page A-16
[Figure: time in clock cycles against instruction order, with stages IF, ID/RF, EX, MEM, WB. Instructions, in order:]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11

Three Generic Data Hazards
• Read After Write (RAW)
Instr J tries to read operand before Instr I writes it
I: add r1,r2,r3
J: sub r4,r1,r3
• Caused by a “Dependence” (in compiler nomenclature). This hazard results from an actual need for communication.

Three Generic Data Hazards
• Write After Read (WAR)
Instr J writes operand before Instr I reads it
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
• Called an “anti-dependence” by compiler writers. This results from reuse of the name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
– All instructions take 5 stages, and
– Reads are always in stage 2, and
– Writes are always in stage 5

Three Generic Data Hazards
• Write After Write (WAW)
Instr J writes operand before Instr I writes it.
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7
• Called an “output dependence” by compiler writers. This also results from the reuse of name “r1”.
• Can’t happen in MIPS 5 stage pipeline because:
– All instructions take 5 stages, and
– Writes are always in stage 5
• Will see WAR and WAW in more complicated pipes

Forwarding to Avoid Data Hazard
Figure A.7, Page A-18
[Figure: time in clock cycles against instruction order. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg. Instructions, in order:]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11

HW Change for Forwarding
Figure A.23, Page A-37
[Figure: labeled components: NextPC, Registers, ID/EX, EX/MEM, MEM/WR, ALU, mux, Immediate, Data Memory.]
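The staggered layout described under "Visualizing Pipelining" can be reproduced with a short script. This is only a sketch: the function name `pipeline_rows` and the instruction labels are made up for illustration, and it assumes an ideal pipeline with no stalls, where instruction n enters Ifetch in cycle n + 1.

```python
# Stage names as they appear in the slide figure.
STAGES = ["Ifetch", "Reg", "ALU", "DMem", "Reg"]


def pipeline_rows(instrs, stages=STAGES):
    """Return (name, cells) rows; cells[c] is the stage occupied in cycle c + 1.

    Assumes an ideal pipeline: no hazards, one new instruction per cycle.
    """
    total_cycles = len(instrs) + len(stages) - 1
    rows = []
    for n, name in enumerate(instrs):
        cells = [""] * total_cycles
        for s, stage in enumerate(stages):
            cells[n + s] = stage  # instruction n is in stage s during cycle n + s + 1
        rows.append((name, cells))
    return rows


# Hypothetical instruction labels, used only for the printout.
rows = pipeline_rows(["Instr 0", "Instr 1", "Instr 2", "Instr 3"])
header = "".join(f"{'C' + str(c + 1):<8}" for c in range(len(rows[0][1])))
print(f"{'':<10}{header}")
for name, cells in rows:
    print(f"{name:<10}" + "".join(f"{cell:<8}" for cell in cells))
```

With four instructions and five stages the table spans 4 + 5 - 1 = 8 cycles, which is why a pipeline of depth 5 approaches one completed instruction per cycle once it is full.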
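The single-memory-port stall from the "One Memory Port/Structural Hazards" slides can be modeled roughly as follows. The sketch assumes that a load uses the one memory port in its fourth stage (DMem), three cycles after it is fetched, and that an instruction fetch needing the port in the same cycle is simply delayed by one cycle. `fetch_cycles` is a hypothetical helper, not something from the slides.

```python
def fetch_cycles(is_load):
    """Return the cycle in which each instruction is fetched.

    is_load[k] is True if instruction k is a load. Assumption: a load fetched
    in cycle c occupies the single memory port in cycle c + 3 (its DMem stage),
    so no instruction can be fetched in that cycle.
    """
    port_busy = set()  # cycles in which a load's data access owns the port
    cycles = []
    cycle = 1
    for load in is_load:
        while cycle in port_busy:
            cycle += 1  # insert a bubble: fetch waits one cycle
        cycles.append(cycle)
        if load:
            port_busy.add(cycle + 3)
        cycle += 1
    return cycles


# Load, Instr 1, Instr 2, Instr 3, Instr 4 as in the slide figures.
schedule = fetch_cycles([True, False, False, False, False])
print(schedule)
```

Under these assumptions the instruction three slots after the load is fetched in cycle 5 instead of cycle 4, which corresponds to the Stall row in the figure and to the one-cycle penalty per load used in the dual-port vs. single-port example.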
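The arithmetic in the dual-port vs. single-port example can be checked numerically. The sketch below implements the speedup equation from the slides; the function name, the parameter names, and the pipeline depth of 5 are assumptions chosen for illustration (the ratio between the two machines does not depend on the depth).

```python
def pipeline_speedup(depth, stall_cpi, clock_ratio=1.0, ideal_cpi=1.0):
    """Speedup = (ideal CPI x depth) / (ideal CPI + stall CPI) x clock ratio.

    clock_ratio is cycle time (unpipelined) / cycle time (pipelined).
    """
    return ideal_cpi * depth / (ideal_cpi + stall_cpi) * clock_ratio


depth = 5  # assumed depth; it cancels in the A/B ratio

# Machine A: dual-ported memory, no structural stalls.
speedup_a = pipeline_speedup(depth, stall_cpi=0.0)

# Machine B: loads are 40% of instructions and each costs 1 stall cycle,
# but the pipelined clock is 1.05 times faster.
speedup_b = pipeline_speedup(depth, stall_cpi=0.4 * 1, clock_ratio=1.05)

ratio = speedup_a / speedup_b
print(f"A: {speedup_a:.2f}  B: {speedup_b:.2f}  A/B: {ratio:.2f}")
```

Machine B's speedup works out to 0.75 times the pipeline depth, so Machine A comes out about 1.33 times faster, matching the slide.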
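The three generic data hazards can be told apart by comparing the destination and source registers of an earlier instruction I and a later instruction J. The sketch below assumes the three-operand `op dest,src1,src2` form used on the slides; `Instr`, `parse`, and `dependences` are illustrative names. It reports register dependences only. Whether one of them becomes an actual hazard depends on the pipeline, and as the slides note, WAR and WAW cannot occur in the MIPS 5 stage pipeline.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


def parse(text: str) -> Instr:
    """Parse 'add r1,r2,r3' into op, destination, and source registers."""
    op, operands = text.split(None, 1)
    regs = [r.strip() for r in operands.split(",")]
    return Instr(op, regs[0], tuple(regs[1:]))


def dependences(i: Instr, j: Instr) -> List[str]:
    """Classify dependences where I comes before J in program order."""
    found = []
    if i.dest in j.srcs:
        found.append("RAW")  # J reads what I writes
    if j.dest in i.srcs:
        found.append("WAR")  # J writes what I reads
    if j.dest == i.dest:
        found.append("WAW")  # J writes what I writes
    return found


# The instruction pairs from the three slides.
print(dependences(parse("add r1,r2,r3"), parse("sub r4,r1,r3")))
print(dependences(parse("sub r4,r1,r3"), parse("add r1,r2,r3")))
print(dependences(parse("sub r1,r4,r3"), parse("add r1,r2,r3")))
```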
