UCLA COMSCI M151B - lec5-c4 - D3104406

School: University of California, Los Angeles

Course: COMSCI M151B - Computer Systems Architecture

Pages: 33

This preview shows page 1-2-15-16-17-32-33 out of 33 pages.

Chapter 4: The Processor

Performance Issues
- Longest delay determines clock period
  - Critical path: load instruction
  - Instruction memory → register file → ALU → data memory → register file
- Not feasible to vary period for different instructions
- Violates design principle: making the common case fast
- We will improve performance by pipelining

Pipelining Analogy (§4.5 An Overview of Pipelining)
- Pipelined laundry: overlapping execution
  - Parallelism improves performance
- Four loads: Speedup = 8/3.5 = 2.3
- Non-stop: Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages

MIPS Pipeline
Five stages, one step per stage
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register

MIPS Pipelined Datapath (§4.6 Pipelined Datapath and Control)
[Figure: pipelined datapath, with the WB and MEM paths marked]
- Right-to-left flow leads to hazards

Execution in a Pipelined Datapath
[Figure: five lw instructions over clock cycles CC1 to CC9. Each passes through IF (IM), ID (Reg), EX (ALU), MEM (DM), WB (Reg); the steady state is marked.]

Pipeline Performance
- Assume time for stages is
  - 100ps for register read or write
  - 200ps for other stages
- Compare pipelined datapath with single-cycle datapath

Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
lw        200ps        100ps          200ps   200ps          100ps           800ps
sw        200ps        100ps          200ps   200ps                          700ps
R-format  200ps        100ps          200ps                  100ps           600ps
beq       200ps        100ps          200ps                                  500ps

Pipeline Performance
[Figure: single-cycle (Tc = 800ps) versus pipelined (Tc = 200ps) execution]

Pipeline Speedup
- If all stages are balanced, i.e., all take the same time:
  Time between instructions (pipelined) = Time between instructions (nonpipelined) / Number of stages
- If not balanced, speedup is less
- Speedup due to increased throughput
  - Latency (time for each instruction) does not decrease

Mixed Instructions in the Pipeline
[Figure: lw and add over CC1 to CC6. lw uses IM, Reg, ALU, DM, Reg; add uses IM, Reg, ALU, Reg.]

Pipelining and ISA Design
- MIPS ISA designed for pipelining
  - All instructions are 32 bits
    - Easier to fetch and decode in one cycle
    - cf. x86: 1- to 17-byte instructions
  - Few and regular instruction formats
    - Can decode and read registers in one step
  - Load/store addressing
    - Can calculate address in 3rd stage, access memory in 4th stage
  - Alignment of memory operands
    - Memory access takes only one cycle

Pipeline Principles
- All instructions that share a pipeline must have the same stages in the same order.
  - Therefore, add does nothing during the MEM stage
  - sw does nothing during the WB stage
- All intermediate values must be latched each cycle.
- There is no functional block reuse
  - Example: we need 2 adders and an ALU (as in single-cycle)
[Figure: IM, Reg, ALU, DM, Reg aligned with IF, ID, EX, MEM, WB]

Pipeline Registers
- Need registers between stages
  - To hold information produced in previous cycle

Pipeline Operation
- Cycle-by-cycle flow of instructions through the pipelined datapath
  - "Single-clock-cycle" pipeline diagram
    - Shows pipeline usage in a single cycle
    - Highlight resources used
  - cf. "multi-clock-cycle" diagram
    - Graph of operation over time
- We'll look at "single-clock-cycle" diagrams for load & store

[Figure slides, in order:]
- IF for Load, Store, …
- ID for Load, Store, …
- EX for Load
- MEM for Load
- WB for Load (annotation: wrong register number)
- Corrected Datapath for Load
- EX for Store
- MEM for Store
- WB for Store

Multi-Cycle Pipeline Diagram
- Form showing resource usage

Multi-Cycle Pipeline Diagram
- Traditional form

Single-Cycle Pipeline Diagram
- State of pipeline in a given cycle

Pipelined Control (Simplified)
[Figure]

Pipelined Control
- Control signals derived from instruction
  - As in single-cycle implementation

Pipelined Control
[Figure]

Pipelined Control Signals

             Execution stage control lines    Memory stage control lines   Write-back stage control lines
Instruction  RegDst  ALUOp1  ALUOp0  ALUSrc   Branch  MemRead  MemWrite    RegWrite  MemtoReg
R-format     1       1       0       0        0       0        0           1         0
lw           0       0       0       1        0       1        0           1         1
sw           x       0       0       1        0       0        1           0         x
beq          x       0       1       0        1       0        0           0         [truncated]
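The numbers in the Pipeline Performance, Pipeline Speedup, and Pipelining Analogy slides can be checked with a short calculation. The following is a minimal Python sketch, not part of the lecture: the function names are hypothetical, and it assumes an ideal five-stage pipeline with no hazards or stalls, in which every stage takes one clock period equal to the slowest stage.

```python
# Stage latencies in picoseconds, taken from the Pipeline Performance table.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Stages that each instruction class actually uses (same table).
USES = {
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "R-format": ["IF", "ID", "EX", "WB"],
    "beq": ["IF", "ID", "EX"],
}


def instruction_time(instr):
    """Total latency of one instruction: the sum of the stages it uses."""
    return sum(STAGE_PS[stage] for stage in USES[instr])


def single_cycle_period():
    """Single-cycle clock period: set by the slowest instruction (lw)."""
    return max(instruction_time(instr) for instr in USES)


def pipelined_period():
    """Pipelined clock period: set by the slowest stage."""
    return max(STAGE_PS.values())


def total_time_single_cycle(n):
    """Time to run n instructions, one per long clock cycle."""
    return n * single_cycle_period()


def total_time_pipelined(n):
    """Assumed ideal pipeline: fill once, then finish one instruction per cycle."""
    return (len(STAGE_PS) + n - 1) * pipelined_period()


def speedup(n):
    """Single-cycle time divided by pipelined time for n instructions."""
    return total_time_single_cycle(n) / total_time_pipelined(n)


def laundry_speedup(n):
    """Laundry analogy from the slides: 2n hours sequential, 0.5n + 1.5 pipelined."""
    return (2 * n) / (0.5 * n + 1.5)


if __name__ == "__main__":
    print(single_cycle_period(), pipelined_period())  # clock periods in ps
    print(speedup(3), speedup(1_000_000))              # short run vs long run
    print(laundry_speedup(4))                          # four loads of laundry
```

For a long instruction stream the speedup approaches 800/200 = 4 rather than the stage count of 5, because the stages are not balanced. The latency of a single lw does not drop: it rises from 800ps to 5 × 200ps = 1000ps under these assumptions.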

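The Pipelined Control Signals table groups the nine control lines by the stage that consumes them. As an illustration only, the Python sketch below encodes the three complete rows of that table and shows which groups each pipeline register still has to carry. The names `decode` and `carried_groups` are hypothetical, the register names ID/EX, EX/MEM and MEM/WB follow the usual textbook convention rather than anything stated in the preview text, and beq is left out because its row is cut off in the source.

```python
# Control lines grouped by the pipeline stage that uses them (from the table).
EX_SIGNALS = ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc")
MEM_SIGNALS = ("Branch", "MemRead", "MemWrite")
WB_SIGNALS = ("RegWrite", "MemtoReg")
ALL_SIGNALS = EX_SIGNALS + MEM_SIGNALS + WB_SIGNALS

X = None  # don't-care entry ("x" in the table)

# Rows copied from the table; beq is omitted because its row is truncated.
CONTROL = {
    "R-format": (1, 1, 0, 0, 0, 0, 0, 1, 0),
    "lw":       (0, 0, 0, 1, 0, 1, 0, 1, 1),
    "sw":       (X, 0, 0, 1, 0, 0, 1, 0, X),
}


def decode(instr):
    """Return the control lines for instr, split into EX, MEM and WB groups."""
    values = dict(zip(ALL_SIGNALS, CONTROL[instr]))
    return {
        "EX": {name: values[name] for name in EX_SIGNALS},
        "MEM": {name: values[name] for name in MEM_SIGNALS},
        "WB": {name: values[name] for name in WB_SIGNALS},
    }


def carried_groups(instr):
    """Groups each pipeline register holds: a group is dropped once it is used."""
    groups = decode(instr)
    return {
        "ID/EX": {stage: groups[stage] for stage in ("EX", "MEM", "WB")},
        "EX/MEM": {stage: groups[stage] for stage in ("MEM", "WB")},
        "MEM/WB": {stage: groups[stage] for stage in ("WB",)},
    }


if __name__ == "__main__":
    for register, groups in carried_groups("lw").items():
        print(register, groups)
```

Keeping the signals grouped this way mirrors the slide's point that control is derived from the instruction once, as in the single-cycle design, and then travels with the instruction through the later stages.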