Rutgers University ECE 331 - Introduction to Pipelined Datapath - D2119429

School name Rutgers University- The State University of New Jersey

Course Ece 331- Computer Architecture and Assembly Language

Pages 42

14:332:331 Computer Architecture and Assembly Language
Spring 2006
Week 12: Introduction to Pipelined Datapath
[Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane Irwin’s PSU CSE331 slides]

Slide titles:
  - Introduction to Pipelined Datapath
  - Review: Multicycle Data and Control Path
  - Review: RTL Summary
  - Review: Multicycle Datapath FSM
  - Review: FSM Implementation
  - Single Cycle Disadvantages & Advantages
  - Multicycle Advantages & Disadvantages
  - The Five Stages of Load Instruction
  - Single Cycle vs. Multiple Cycle Timing
  - Pipelined MIPS Processor
  - Single Cycle, Multiple Cycle, vs. Pipeline
  - Pipelining the MIPS ISA
  - MIPS Pipeline Datapath Modifications
  - MIPS Pipeline Control Path Modifications
  - Graphically Representing MIPS Pipeline
  - Why Pipeline? For Throughput!
  - Can pipelining get us into trouble?
  - A Unified Memory Would Be a Structural Hazard
  - How About Register File Access?
  - Register Usage Can Cause Data Hazards
  - One Way to “Fix” a Data Hazard
  - Another Way to “Fix” a Data Hazard
  - Loads Can Cause Data Hazards
  - Stores Can Cause Data Hazards
  - Forwarding with Load-use Data Hazards
  - Branch Instructions Cause Control Hazards
  - One Way to “Fix” a Control Hazard
  - Other Pipeline Structures Are Possible
  - Sample Pipeline Alternatives
  - Summary
  - Performance
  - Two notions of “performance”
  - Definitions
  - Example
  - Basis of Evaluation
  - SPEC95
  - Metrics of performance
  - Aspects of CPU Performance
  - CPI
  - Example (RISC processor)
  - Amdahl's Law
  - Summary: Evaluating Instruction Sets?

[331 W12.2] Review: Multicycle Data and Control Path

[Figure: multicycle datapath and control path. Components: PC; Memory (Address, Write Data, Read Data (Instr. or Data)); IR; MDR; Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2); A; B; ALU; ALUout; Sign Extend; two Shift left 2 units; ALU control; Control FSM. Control signals: IRWrite, MemtoReg, MemWrite, MemRead, IorD, PCWrite, PCWriteCond, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, PCSource. Instruction fields: Instr[31-26], Instr[25-0], Instr[15-0], Instr[5-0], PC[31-28].]

[331 W12.3] Review: RTL Summary

Step            Applies to   Register transfers
Instr fetch     all          IR = Memory[PC]; PC = PC + 4;
Decode          all          A = Reg[IR[25-21]]; B = Reg[IR[20-16]];
                             ALUOut = PC + (sign-extend(IR[15-0]) << 2);
Execute         R-type       ALUOut = A op B;
                Mem Ref      ALUOut = A + sign-extend(IR[15-0]);
                Branch       if (A==B) PC = ALUOut;
                Jump         PC = PC[31-28] || (IR[25-0] << 2);
Memory access   R-type       Reg[IR[15-11]] = ALUOut;
                Mem Ref      MDR = Memory[ALUOut]; or Memory[ALUOut] = B;
Write-back      Mem Ref      Reg[IR[20-16]] = MDR;

[331 W12.4] Review: Multicycle Datapath FSM

[Figure: state diagram with states numbered 0 to 9, grouped into Start, Instr Fetch, Decode, Execute, Memory Access, and Write Back. Transitions are labeled (Op = R-type), (Op = beq), (Op = lw or sw), (Op = j), (Op = lw), and (Op = sw).]

Unless otherwise assigned: PCWrite, IRWrite, MemWrite, RegWrite = 0; others = X.

Control signal settings shown on the states:
  - IorD=0, MemRead, IRWrite, ALUSrcA=0, ALUSrcB=01, PCSource, ALUOp=00, PCWrite
  - ALUSrcA=0, ALUSrcB=11, ALUOp=00, PCWriteCond=0
  - ALUSrcA=1, ALUSrcB=10, ALUOp=00, PCWriteCond=0
  - ALUSrcA=1, ALUSrcB=00, ALUOp=10, PCWriteCond=0
  - ALUSrcA=1, ALUSrcB=00, ALUOp=01, PCSource=01, PCWriteCond
  - PCSource=10, PCWrite
  - MemRead, IorD=1, PCWriteCond=0
  - MemWrite, IorD=1, PCWriteCond=0
  - RegDst=1, RegWrite, MemtoReg=0, PCWriteCond=0
  - RegDst=0, RegWrite, MemtoReg=1, PCWriteCond=0

[331 W12.5] Review: FSM Implementation

[Figure: combinational control logic driven by the System Clock through a State Reg. Inputs: Op0 to Op5 from Inst[31-26], plus the current state. Outputs: Next State, PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource, ALUOp, ALUSourceB, ALUSourceA, RegWrite, RegDst.]

[331 W12.6] Single Cycle Disadvantages & Advantages

  - Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
  - Is wasteful of area, since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle
but
  - Is simple and easy to understand

[Figure: Clk waveform, Single Cycle Implementation. Cycle 1: lw. Cycle 2: sw, followed by Waste.]

[331 W12.7] Multicycle Advantages & Disadvantages

  - Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step
      - balance the amount of work to be done in each step
      - restrict each step to use only one major functional unit
  - Multicycle implementations allow
      - functional units to be used more than once per instruction, as long as they are used on different clock cycles
      - faster clock rates
      - different instructions to take a different number of clock cycles
but
  - Requires additional internal state registers, muxes, and more complicated (FSM) control

[331 W12.8] The Five Stages of Load Instruction

  - IFetch: Instruction Fetch and Update PC
  - Dec: Registers Fetch and Instruction Decode
  - Exec: Execute R-type; calculate memory address
  - Mem: Read/write the data from/to the Data Memory
  - WB: Write the data back to the register file

  Cycle    1       2    3     4    5
  lw       IFetch  Dec  Exec  Mem  WB

[331 W12.9] Single Cycle vs. Multiple Cycle Timing

Multiple Cycle Implementation:

  Cycle    1       2    3     4    5   6       7    8     9    10
  lw       IFetch  Dec  Exec  Mem  WB
  sw                                   IFetch  Dec  Exec  Mem
  R-type                                                       IFetch

Single Cycle Implementation:

  Cycle 1: lw
  Cycle 2: sw, followed by Waste

The multicycle clock period is longer than 1/5th of the single cycle clock period because of stage flip-flop overhead.

[331 W12.10] Pipelined MIPS Processor

  - Start the next instruction while still working on the current one
      - improves throughput: total amount of work done in a given time
      - instruction latency (execution time, delay time, response time) is not reduced: time from the start of an instruction to its completion

  Cycle    1       2       3       4     5     6    7    8
  lw       IFetch  Dec     Exec    Mem   WB
  sw               IFetch  Dec     Exec  Mem   WB
  R-type                   IFetch  Dec   Exec  Mem  WB

[331 W12.11] Single Cycle, Multiple Cycle, vs. Pipeline

[Figure: the same lw, sw, R-type sequence on three Clk timelines. Single Cycle Implementation: Load in Cycle 1, Store plus Waste in Cycle 2. Multiple Cycle Implementation: lw in Cycles 1 to 5 (IFetch Dec Exec Mem WB), sw in Cycles 6 to 9 (IFetch Dec Exec Mem), R-type IFetch in Cycle 10. Pipeline Implementation: lw, sw, and R-type each pass through IFetch Dec Exec Mem WB, starting one cycle apart; one stage is marked "wasted cycle".]

[331 W12.12] Pipelining the MIPS ISA

What makes it easy
  - all instructions are the same length (32 bits)
  - few instruction formats (three) with symmetry across formats
  - memory operations can occur only in loads and stores
  - operands must be aligned in memory so a single data transfer requires only one memory access

What makes it hard
  - structural hazards: what if we had only one memory
  - control hazards: what about branches
  - data hazards: what if an instruction’s input operands depend on the output of a previous instruction

[331 W12.13] MIPS Pipeline Datapath
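
The cycle counts in the timing comparisons above (lw, then sw, then an R-type instruction) can be reproduced with a small model. This is a minimal sketch in Python and is not part of the original slides: the function names and the stage_ns and register_overhead_ns delays are hypothetical. The per-class step counts follow the RTL summary, and the pipelined count assumes an ideal five-stage pipeline with no hazards or stalls.

```python
# Steps per instruction class in the multicycle design, per the RTL summary.
MULTICYCLE_STEPS = {"lw": 5, "sw": 4, "r-type": 4, "beq": 3, "j": 3}
PIPELINE_STAGES = 5  # IFetch, Dec, Exec, Mem, WB


def single_cycle_cycles(program):
    """One (long) clock cycle per instruction."""
    return len(program)


def multicycle_cycles(program):
    """Each instruction takes only the steps its class needs."""
    return sum(MULTICYCLE_STEPS[op] for op in program)


def pipelined_cycles(program, stages=PIPELINE_STAGES):
    """Ideal pipeline (assumed hazard-free): fill once, then finish one per cycle."""
    return stages + len(program) - 1 if program else 0


def execution_time_ns(program, stage_ns=2.0, register_overhead_ns=0.2):
    """Total time under hypothetical delays (both parameters are made-up values)."""
    single_clock = PIPELINE_STAGES * stage_ns      # must fit the slowest instruction, lw
    short_clock = stage_ns + register_overhead_ns  # one step plus flip-flop overhead
    return {
        "single": single_cycle_cycles(program) * single_clock,
        "multicycle": multicycle_cycles(program) * short_clock,
        "pipelined": pipelined_cycles(program) * short_clock,
    }


program = ["lw", "sw", "r-type"]
print(single_cycle_cycles(program))  # 3
print(multicycle_cycles(program))    # 13
print(pipelined_cycles(program))     # 7
print(execution_time_ns(program))
```

The count of 13 multicycle cycles corresponds to the R-type instruction starting in Cycle 10 and needing four steps, and the count of 7 pipelined cycles corresponds to the R-type instruction reaching WB in Cycle 7. The timing figures depend entirely on the assumed delays and only illustrate why throughput improves while per-instruction latency does not.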
