Berkeley COMPSCI 61C - Instruction Level Parallelism— The Datapath - D2705903

School: University of California, Berkeley

Course: COMPSCI 61C - Machine Structures

Pages: 59

This preview shows page 1-2-3-4-27-28-29-30-56-57-58-59 out of 59 pages.

Slide outline

Slide 1
You Are Here!
Agenda
Agenda
Review: RISC Design Principles
Review: Single-Cycle Processor
Single Cycle Performance
Pipeline Analogy: Doing Laundry
Sequential Laundry
Pipelined Laundry
Pipelining Lessons (1/2)
Pipelining Lessons (2/2)
Agenda
Administrivia
Agenda
Review: Single Cycle Datapath
Steps in Executing MIPS
Redrawn Single-Cycle Datapath
Pipelined Datapath
More Detailed Pipeline
IF for Load, Store, …
ID for Load, Store, …
EX for Load
MEM for Load
WB for Load
Corrected Datapath for Load
Agenda
Pipelined Execution Representation
Graphical Pipeline Diagrams
Graphical Pipeline Representation
Pipeline Performance
Pipeline Performance
Pipeline Speedup
Instruction Level Parallelism (ILP)
Hazards
1. Structural Hazards
1. Structural Hazard #1: Single Memory
1. Structural Hazard #2: Registers (1/2)
1. Structural Hazard #2: Registers (2/2)
2. Data Hazards
Forwarding (aka Bypassing)
Corrected Datapath for Forwarding?
Load-Use Data Hazard
Agenda
Agenda
Pipelining and ISA Design
Slide 50
3. Control Hazards
Stall => 2 Bubbles/Clocks
3. Control Hazard: Branching
Corrected Datapath for BEQ/BNE?
One Clock Cycle Stall
3. Control Hazards
3. Control Hazard: Branching
3. Control Hazard: Branching
Example: Nondelayed vs. Delayed Branch
Delayed Branch/Jump and MIPS ISA?
Code Scheduling to Avoid Stalls
And in Conclusion, …

CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Instruction Level Parallelism—The Datapath
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/fa10
01/14/2019  Spring 2011 -- Lecture #20

You Are Here!
• Parallel Requests: assigned to computer, e.g., Search “Katz”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates functioning in parallel at same time
[Figure: software/hardware hierarchy, from Warehouse Scale Computer and SmartPhone through Computer, Core, Memory (Cache), Input/Output, Main Memory, Instruction Unit(s), and Functional Unit(s) (A0+B0 … A3+B3) down to Logic Gates; caption "Harness Parallelism & Achieve High Performance"; "Today’s Lecture" marker]

Agenda
• Pipelined Execution
• Administrivia
• Pipelined Datapath
• Pipeline Hazards
• Technology Break
• Pipelining and Instruction Set Design
• Summary

Review: RISC Design Principles
• “A simpler core is a faster core”
• Reduction in the number and complexity of instructions in the ISA simplifies pipelined implementation
• Common RISC strategies:
  – Fixed instruction length, generally a single word; simplifies process of fetching instructions from memory
  – Simplified addressing modes; simplifies process of fetching operands from memory
  – Fewer and simpler instructions in the instruction set; simplifies process of executing instructions
  – Simplified memory access: only load and store instructions access memory
  – Let the compiler do it. Use a good compiler to break complex high-level language statements into a number of simple assembly language statements

Review: Single-Cycle Processor
• Five steps to design a processor:
  1. Analyze instruction set => datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements: re-examine for pipelining
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: Processor (Control, Datapath), Memory, Input, Output]

Single Cycle Performance
• Assume time for actions are
  – 100ps for register read or write; 200ps for other events
• Clock rate is?

Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
lw        200ps        100ps          200ps   200ps          100ps           800ps
sw        200ps        100ps          200ps   200ps                          700ps
R-format  200ps        100ps          200ps                  100ps           600ps
beq       200ps        100ps          200ps                                  500ps

• What can we do to improve clock rate?
• Will this improve performance as well?
Want increased clock rate to mean faster programs
Student Roulette?

Pipeline Analogy: Doing Laundry
• Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, fold, and put away
  – Washer takes 30 minutes
  – Dryer takes 30 minutes
  – “Folder” takes 30 minutes
  – “Stasher” takes 30 minutes to put clothes into drawers

Sequential Laundry
• Sequential laundry takes 8 hours for 4 loads
[Figure: task order (A, B, C, D) versus time, 6 PM to 2 AM, in 30-minute steps]

Pipelined Laundry
• Pipelined laundry takes 3.5 hours for 4 loads!
[Figure: task order (A, B, C, D) versus time, 6 PM to 2 AM, in 30-minute steps]

Pipelining Lessons (1/2)
• Pipelining doesn’t help latency of single task, it helps throughput of entire workload
• Multiple tasks operating simultaneously using different resources
• Potential speedup = Number pipe stages
• Time to fill pipeline and time to drain it reduce speedup: 2.3X v. 4X in this example
[Figure: task order (A, B, C, D) versus time, 6 PM to 9 PM, in 30-minute steps]

Pipelining Lessons (2/2)
• Suppose new Washer takes 20 minutes, new Stasher takes 20 minutes. How much faster is pipeline?
• Pipeline rate limited by slowest pipeline stage
• Unbalanced lengths of pipe stages reduce speedup
[Figure: task order (A, B, C, D) versus time, 6 PM to 9 PM, in 30-minute steps]

Administrivia
• Project 4: Pipelined Cycle Processor in Logisim
  – Part 1 (datapath) due 4/10, Part 2 due 4/17
  – Face-to-Face grading: Signup for timeslot last week
• Extra Credit: Fastest Version of Project 3
  – Due 4/24 23:59:59
• Final Review: TBD
• Final: Mon May 9 11AM-2PM (TBD)
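As a worked check of the Single Cycle Performance figures above (100 ps for a register read or write, 200 ps for other events), the sketch below sums the stage delays each instruction class uses and takes the longest total as the single-cycle clock period. The dictionary names and the stage-to-instruction mapping are illustrative, read off the rows of the timing table. The final line applies the lesson that a pipeline's rate is limited by its slowest stage. It assumes one pipeline stage per table column, which the previewed slides do not state explicitly.

```python
# Stage latencies in picoseconds, taken from the Single Cycle Performance table.
STAGE_PS = {"fetch": 200, "reg_read": 100, "alu": 200, "mem": 200, "reg_write": 100}

# Stages used by each instruction class (one entry per table row).
USES = {
    "lw": ["fetch", "reg_read", "alu", "mem", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "mem"],
    "R-format": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
}

# Total latency of each instruction class in a single-cycle design.
totals = {instr: sum(STAGE_PS[s] for s in stages) for instr, stages in USES.items()}

# The single-cycle clock must accommodate the slowest instruction (lw).
single_cycle_period_ps = max(totals.values())
single_cycle_ghz = 1000 / single_cycle_period_ps  # a 1000 ps period is 1 GHz

# Assumption: a pipelined design with one stage per column is clocked
# by its slowest stage.
pipelined_period_ps = max(STAGE_PS.values())

print(totals)
print(f"single-cycle period: {single_cycle_period_ps} ps ({single_cycle_ghz} GHz)")
print(f"pipelined period (assumed): {pipelined_period_ps} ps")
```

Under these figures the single-cycle clock is set by `lw` at 800 ps, or 1.25 GHz, even though `beq` needs only 500 ps.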

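The laundry numbers can be reproduced with a small timing model. This is a sketch under an assumed model rather than anything given in the slides: each load passes through the stages in order, a stage handles one load at a time, and a load may wait between stages. The function names are hypothetical. With four 30-minute stages it gives the 8-hour and 3.5-hour figures and the 2.3X speedup quoted above. The last call tries the 20-minute washer and stasher from Pipelining Lessons (2/2); the slides pose that as a question, so the result is only this model's answer.

```python
def sequential_minutes(stages, loads):
    """Each load runs through every stage before the next load starts."""
    return loads * sum(stages)


def pipelined_minutes(stages, loads):
    """Loads overlap: a load enters a stage once it has left the previous
    stage and the stage has finished with the load ahead of it."""
    finish = [0] * len(stages)  # time at which each stage last became free
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for j, duration in enumerate(stages):
            start = max(ready, finish[j])
            finish[j] = start + duration
            ready = finish[j]
    return finish[-1]


balanced = [30, 30, 30, 30]    # washer, dryer, folder, stasher
unbalanced = [20, 30, 30, 20]  # faster washer and stasher

seq = sequential_minutes(balanced, 4)
pipe = pipelined_minutes(balanced, 4)
print(seq / 60, pipe / 60, round(seq / pipe, 1))  # hours, hours, speedup
print(pipelined_minutes(unbalanced, 4))           # minutes
```

In this model the unbalanced pipeline finishes four loads in 190 minutes instead of 210. The two 30-minute stages still set the rate at one load every 30 minutes, so speeding up the washer and stasher only shortens the fill and drain time.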