Berkeley COMPSCI 152 - Lecture Notes

Slide titles:
Slide 1
Last time in Lecture 9
Complex Pipelining: Motivation
Floating-Point Unit (FPU)
Functional Unit Characteristics
Floating-Point ISA
Realistic Memory Systems
Issues in Complex Pipeline Control
Complex In-Order Pipeline
In-Order Superscalar Pipeline
Types of Data Hazards
Register vs. Memory Dependence
Data Hazards: An Example
Instruction Scheduling
Out-of-order Completion In-order Issue
Complex Pipeline
When is it Safe to Issue an Instruction?
Slide 18
Simplifying the Data Structure Assuming In-order Issue
Simplifying the Data Structure ...
Scoreboard for In-order Issues
Scoreboard Dynamics
In-Order Issue Limitations: an example
CS152 Administrivia
Out-of-Order Issue
Issue Limitations: In-Order and Out-of-Order
How many instructions can be in the pipeline?
Overcoming the Lack of Register Names
Instruction-level Parallelism via Renaming
Register Renaming
Renaming Structures
Reorder Buffer Management
Renaming & Out-of-order Issue An example
IBM 360/91 Floating-Point Unit R. M. Tomasulo, 1967
Effectiveness?
Acknowledgements

CS 152 Computer Architecture and Engineering
Lecture 10 - Complex Pipelines, Out-of-Order Issue, Register Renaming
Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152
February 28, 2011 CS152, Spring 2011

Last time in Lecture 9
• Modern page-based virtual memory systems provide:
  – Translation, Protection, Virtual memory.
• Translation and protection information stored in page tables, held in main memory
• Translation and protection information cached in “translation-lookaside buffer” (TLB) to provide single-cycle translation+protection check in common case
• Virtual memory interacts with cache design
  – Physical cache tags require address translation before tag lookup, or use untranslated offset bits to index cache.
  – Virtual tags do not require translation before cache hit/miss determination, but need to be flushed or extended with ASID to cope with context swaps. Also, must deal with virtual address aliases (usually by disallowing copies in cache).

Complex Pipelining: Motivation
Pipelining becomes complex when we want high performance in the presence of:
• Long latency or partially pipelined floating-point units
• Memory systems with variable access time
• Multiple arithmetic and memory units

Floating-Point Unit (FPU)
Much more hardware than an integer unit
Single-cycle FPU is a bad idea - why?
• it is common to have several FPU’s
• it is common to have different types of FPU’s: Fadd, Fmul, Fdiv, ...
• an FPU may be pipelined, partially pipelined or not pipelined
To operate several FPU’s concurrently the FP register file needs to have more read and write ports

Functional Unit Characteristics
[Figure: a fully pipelined functional unit (stages of 1 cyc, 1 cyc, 1 cyc) and a partially pipelined functional unit (stages of 2 cyc, 2 cyc)]
Functional units have internal pipeline registers
⇒ operands are latched when an instruction enters a functional unit
⇒ following instructions are able to write register file during a long-latency operation

Floating-Point ISA
Interaction between the floating-point datapath and the integer datapath is determined largely by the ISA
MIPS ISA
• separate register files for FP and Integer instructions
  the only interaction is via a set of move instructions (some ISA’s don’t even permit this)
• separate load/store for FPR’s and GPR’s but both use GPR’s for address calculation
• separate conditions for branches
  FP branches are defined in terms of condition codes

Realistic Memory Systems
Latency of access to the main memory is usually much greater than one cycle and often unpredictable
Solving this problem is a central issue in computer architecture
Common approaches to improving memory performance
• caches: single cycle except in case of a miss ⇒ stall
• interleaved memory: multiple memory accesses ⇒ bank conflicts
• split-phase memory operations (separate memory request from response) ⇒ out-of-order responses

Issues in Complex Pipeline Control
[Figure: pipeline with IF, ID, Issue and WB stages; functional units ALU, Mem, Fadd, Fmul, Fdiv; register files GPRs and FPRs]
• Structural conflicts at the execution stage if some FPU or memory unit is not pipelined and takes more than one cycle
• Structural conflicts at the write-back stage due to variable latencies of different functional units
• Out-of-order write hazards due to variable latencies of different functional units
• How to handle exceptions?

Complex In-Order Pipeline
Delay writeback so all operations have same latency to W stage
– Write ports never oversubscribed (one inst. in & one inst. out every cycle)
– Stall pipeline on long latency operations, e.g., divides, cache misses
– Handle exceptions in-order at commit point
[Figure: in-order pipeline with PC, Inst. Mem, Decode, GPRs, FPRs, stages X1 to X3, Data Mem, FAdd, FMul, unpipelined FDiv divider, W stage and Commit Point]
How to prevent increased writeback latency from slowing down single cycle integer operations? Bypassing

In-Order Superscalar Pipeline
• Fetch two instructions per cycle; issue both simultaneously if one is integer/memory and other is floating point
• Inexpensive way of increasing throughput, examples include Alpha 21064 (1992) & MIPS R5000 series (1996)
• Same idea can be extended to wider issue by duplicating functional units (e.g. 4-issue UltraSPARC & Alpha 21164) but regfile ports and bypassing costs grow quickly
[Figure: the same in-order pipeline with two-instruction fetch, Dual Decode, GPRs, FPRs, stages X1 to X3, Data Mem, FAdd, FMul, unpipelined FDiv divider, W stage and Commit Point]

Types of Data Hazards
Consider executing a sequence of rk ← ri op rj type of instructions
Data-dependence
  r3 ← r1 op r2    Read-after-Write
  r5 ← r3 op r4    (RAW) hazard
Anti-dependence
  r3 ← r1 op r2    Write-after-Read
  r1 ← r4 op r5    (WAR) hazard
Output-dependence
  r3 ← r1 op r2    Write-after-Write
  r3 ← r6 op r7    (WAW) hazard

Register vs. Memory Dependence
Data hazards due to register operands can be determined at the decode stage but data hazards due to memory operands can be determined only after computing the effective address
  store  M[r1 + disp1] ← r2
  load   r3 ← M[r4 + disp2]
Does (r1 + disp1) = (r4 + disp2) ?

Data Hazards: An Example
I1 DIVD f6,
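As an illustration of the RAW, WAR and WAW definitions from the "Types of Data Hazards" slide and the address question from the "Register vs. Memory Dependence" slide, the checks can be sketched in Python. This is a minimal sketch, not part of the lecture: the names `Instr`, `hazards` and `same_address`, and the register values used in the last call, are hypothetical and chosen only for the example.

```python
from typing import Dict, NamedTuple, Set


class Instr(NamedTuple):
    """A three-operand register instruction: dest <- src1 op src2."""
    dest: str
    src1: str
    src2: str


def hazards(first: Instr, second: Instr) -> Set[str]:
    """Return the hazard types `second` has against the earlier `first`."""
    found = set()
    # RAW: the later instruction reads what the earlier one writes.
    if first.dest in (second.src1, second.src2):
        found.add("RAW")
    # WAR: the later instruction writes what the earlier one reads.
    if second.dest in (first.src1, first.src2):
        found.add("WAR")
    # WAW: both instructions write the same register.
    if second.dest == first.dest:
        found.add("WAW")
    return found


def same_address(regs: Dict[str, int], base_a: str, disp_a: int,
                 base_b: str, disp_b: int) -> bool:
    """Compare effective addresses; needs register values, so it cannot
    be answered from the register names alone at decode time."""
    return regs[base_a] + disp_a == regs[base_b] + disp_b


# The three instruction pairs from the slide.
print(hazards(Instr("r3", "r1", "r2"), Instr("r5", "r3", "r4")))  # {'RAW'}
print(hazards(Instr("r3", "r1", "r2"), Instr("r1", "r4", "r5")))  # {'WAR'}
print(hazards(Instr("r3", "r1", "r2"), Instr("r3", "r6", "r7")))  # {'WAW'}

# Hypothetical register contents: the store and load addresses coincide.
print(same_address({"r1": 100, "r4": 108}, "r1", 8, "r4", 0))  # True
```

The register checks use only the register names, which is why they can be made at decode; `same_address` takes the register contents as an argument, mirroring the point that memory hazards are known only after the effective address is computed.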

