Berkeley COMPSCI 252 - Lecture Notes

CS252 Graduate Computer Architecture
Lecture 1: Review of Pipelines, Performance, Caches, and Virtual Memory(!)
January 17, 2001
Prof. David A. Patterson
Computer Science 252, Spring 2001
CS252/Patterson Lec 1.1 1/17/01

Slide titles:
Coping with CS 252
Pipelining: It's Natural!
Sequential Laundry
Pipelined Laundry: Start work ASAP
Pipelining Lessons
Computer Pipelines
A "Typical" RISC
Example: MIPS (Note register location)
5 Steps of MIPS Datapath (Figure 3.1, Page 130, CA:AQA 2e)
5 Steps of MIPS Datapath (Figure 3.4, Page 134, CA:AQA 2e)
Visualizing Pipelining (Figure 3.3, Page 133, CA:AQA 2e)
It's Not That Easy for Computers
One Memory Port/Structural Hazards (Figure 3.6, Page 142, CA:AQA 2e)
One Memory Port/Structural Hazards (Figure 3.7, Page 143, CA:AQA 2e)
Data Hazard on R1 (Figure 3.9, Page 147, CA:AQA 2e)
Three Generic Data Hazards
Slide 18
Slide 19
CS 252 Administrivia
Research Paper Reading
Grading
Forwarding to Avoid Data Hazard (Figure 3.10, Page 149, CA:AQA 2e)
HW Change for Forwarding (Figure 3.20, Page 161, CA:AQA 2e)
Data Hazard Even with Forwarding (Figure 3.12, Page 153, CA:AQA 2e)
Data Hazard Even with Forwarding (Figure 3.13, Page 154, CA:AQA 2e)
Software Scheduling to Avoid Load Hazards
Control Hazard on Branches: Three Stage Stall
Example: Branch Stall Impact
Pipelined MIPS Datapath (Figure 3.22, Page 163, CA:AQA 2e)
Four Branch Hazard Alternatives
Slide 32
Delayed Branch
Now, Review of Performance
Which is faster?
Definitions
Aspects of CPU Performance (CPU Law)
Cycles Per Instruction (Throughput)
Example: Calculating CPI
Slide 40
Example 2: Speed Up Equation for Pipelining
Example 3: Evaluating Branch Alternatives (for 1 program)
Example 4: Dual-port vs. Single-port
Now, Review of Memory Hierarchy
Recap: Who Cares About the Memory Hierarchy?
Levels of the Memory Hierarchy
The Principle of Locality
Memory Hierarchy: Terminology
Cache Measures
Simplest Cache: Direct Mapped
1 KB Direct Mapped Cache, 32B blocks
Two-way Set Associative Cache
Disadvantage of Set Associative Cache
4 Questions for Memory Hierarchy
Q1: Where can a block be placed in the upper level?
Q2: How is a block found if it is in the upper level?
Q3: Which block should be replaced on a miss?
Q4: What happens on a write?
Write Buffer for Write Through
A Modern Memory Hierarchy
Summary #1/4: Pipelining & Performance
Summary #2/4: Caches
Summary #3/4: The Cache Design Space
Review #4/4: TLB, Virtual Memory

Coping with CS 252
• Students with too varied background?
– In past, CS grad students took written prelim exams on undergraduate material in hardware, software, and theory
– 1st 5 weeks reviewed background, helped 252, 262, 270
– Prelims were dropped => some unprepared for CS 252?
• In-class exam on Friday, January 19 (30 mins)
– Doesn’t affect grade, only admission into class
– 2 grades: Admitted or audit/take CS 152 1st
– Improve your experience if recapture common background
• Review: Chapter 1, CS 152 home page, maybe “Computer Organization and Design (COD) 2/e”
– Chapters 1 to 8 of COD if never took prerequisite
– If took a class, be sure COD Chapters 2, 6, 7 are familiar
– Copies in Bechtel Library on 2-hour reserve
• FAST review today of Pipelining, Performance, Caches, and Virtual Memory

Pipelining: It's Natural!
• Laundry Example
• Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold
• Washer takes 30 minutes
• Dryer takes 40 minutes
• “Folder” takes 20 minutes

Sequential Laundry
• Sequential laundry takes 6 hours for 4 loads
• If they learned pipelining, how long would laundry take?
[Figure: timeline from 6 PM to midnight; task order A, B, C, D against time; each load takes 30, 40, and 20 minutes, one load after another.]

Pipelined Laundry: Start work ASAP
• Pipelined laundry takes 3.5 hours for 4 loads
[Figure: timeline from 6 PM to midnight; task order A, B, C, D against time; overlapped loads, with intervals of 30, 40, 40, 40, 40, and 20 minutes.]

Pipelining Lessons
• Pipelining doesn’t help latency of single task, it helps throughput of entire workload
• Pipeline rate limited by slowest pipeline stage
• Multiple tasks operating simultaneously
• Potential speedup = Number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” pipeline and time to “drain” it reduce speedup
[Figure: pipelined laundry timeline, 6 PM to 9 PM; task order A, B, C, D against time; intervals of 30, 40, 40, 40, 40, and 20 minutes.]

Computer Pipelines
• Execute billions of instructions, so throughput is what matters
• What is desirable in instruction sets for pipelining?
– Variable length instructions vs. all instructions same length?
– Memory operands part of any operation vs. memory operands only in loads or stores?
– Register operand many places in instruction format vs. registers located in same place?

A "Typical" RISC
• 32-bit fixed format instruction (3 formats)
• Memory access only via load/store instructions
• 32 32-bit GPR (R0 contains zero, DP take pair)
• 3-address, reg-reg arithmetic instruction; registers in same place
• Single address mode for load/store: base + displacement
– no indirection
• Simple branch conditions
• Delayed branch
see: SPARC, MIPS, HP PA-Risc, DEC Alpha, IBM PowerPC, CDC 6600, CDC 7600, Cray-1, Cray-2, Cray-3

Example: MIPS (Note register location)
[Figure: four 32-bit instruction formats, bits numbered 31 down to 0.]
Register-Register: Op (31-26), Rs1 (25-21), Rs2 (20-16), Rd (15-11), Opx (10-0, with a boundary marked between bits 6 and 5)
Register-Immediate: Op (31-26), Rs1 (25-21), Rd (20-16), immediate (15-0)
Branch: Op (31-26), Rs1 (25-21), Rs2/Opx (20-16), immediate (15-0)
Jump / Call: Op (31-26), target (25-0)

5 Steps of MIPS Datapath (Figure 3.1, Page 130, CA:AQA 2e)
[Figure: MIPS datapath in five steps: Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. Labeled elements: Memory, Reg File, Sign Extend, ALU, Zero?, Data Memory, LMD, MUXes, Adder (+4), Next PC, Next SEQ PC, Address, Inst, RS1, RS2, RD, Imm, WB Data.]

5 Steps of MIPS Datapath (Figure 3.4, Page 134, CA:AQA 2e)
[Figure: the same five-step datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the steps; Next SEQ PC and RD are carried along from stage to stage.]
• Data stationary control
– local decode for each instruction phase / pipeline stage

Visualizing Pipelining (Figure 3.3, Page 133, CA:AQA 2e)
[Figure: instruction order against time in clock cycles (Cycle 1 to Cycle 7); four instructions, each passing through Ifetch, Reg, ALU, DMem, Reg.]

It's Not That Easy for Computers
• Limits to pipelining: Hazards prevent next
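
As a check on the laundry numbers from the Sequential Laundry and Pipelined Laundry slides, the following is a minimal sketch rather than anything from the lecture itself. It assumes a simple model in which each stage (washer, dryer, "folder") handles one load at a time and a finished load may wait for the next stage to free up. The function names `sequential_minutes` and `pipelined_minutes` are hypothetical, chosen for illustration.

```python
def sequential_minutes(stage_minutes, loads):
    """Total time when each load finishes every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Total time when loads overlap, with one load per stage at a time.

    finish[s] is the time at which the most recent load left stage s.
    A load enters stage s once it has left stage s-1 and stage s is free.
    (Assumption: a load can wait between stages for as long as needed.)
    """
    finish = [0] * len(stage_minutes)
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, finish[s])
            ready = start + minutes
            finish[s] = ready
    return finish[-1]


STAGES = [30, 40, 20]  # washer, dryer, "folder" in minutes, from the slides
LOADS = 4              # Ann, Brian, Cathy, Dave

seq = sequential_minutes(STAGES, LOADS)
pipe = pipelined_minutes(STAGES, LOADS)
print(f"sequential: {seq / 60} h, pipelined: {pipe / 60} h, speedup: {seq / pipe:.2f}x")
# prints: sequential: 6.0 h, pipelined: 3.5 h, speedup: 1.71x
```

Under this model the speedup is about 1.71x rather than the potential 3x for three stages, which matches the Pipelining Lessons: the 40-minute dryer limits the rate, and the time to fill and drain the pipeline is not overlapped.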

