Berkeley COMPSCI 152 - Lecture 20 Caches

CS152 Computer Architecture and Engineering
Lecture 20: Caches
April 14, 2004
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
Lecture slides: http://inst.eecs.berkeley.edu/~cs152/
©UCB Spring 2004

Slide titles in this lecture:
  Recap: The Big Picture: Where are We Now?
  The Art of Memory System Design
  Recap: Cache Performance
  Example: 1 KB Direct Mapped Cache with 32 B Blocks
  Set Associative Cache
  Disadvantage of Set Associative Cache
  Example: Fully Associative
  A Summary on Sources of Cache Misses
  Design options at constant cost
  Four Questions for Caches and Memory Hierarchy
  Q1: Where can a block be placed in the upper level?
  Q2: How is a block found if it is in the upper level?
  Q3: Which block should be replaced on a miss?
  Q4: What happens on a write?
  New Question: How does a store to the cache work?
  Write Buffer for Write Through
  Write-miss Policy: Write Allocate versus Not Allocate
  Administrative Issues
  Administrivia: Edge Detection For Lab 5
  How Do you Design a Memory System?
  Review: Stall Methodology in Memory Stage
  Impact on Cycle Time
  Improving Cache Performance: 3 general options
  Improving Cache Performance
  3Cs Absolute Miss Rate (SPEC92)
  2:1 Cache Rule
  3Cs Relative Miss Rate
  1. Reduce Misses via Larger Block Size
  2. Reduce Misses via Higher Associativity
  Example: Avg. Memory Access Time vs. Miss Rate
  3. Reducing Misses via a "Victim Cache"
  4. Reducing Misses by Hardware Prefetching
  5. Reducing Misses by Software Prefetching Data
  6. Reducing Misses by Compiler Optimizations
  Improving Cache Performance (Continued)
  0. Reducing Penalty: Faster DRAM / Interface
  1. Reducing Penalty: Read Priority over Write on Miss
  Write Buffer Saturation
  RAW Hazards from Write Buffer!
  2. Reduce Penalty: Early Restart and Critical Word First
  3. Reduce Penalty: Non-blocking Caches
  Reprise: What happens on a Cache miss?
  Value of Hit Under Miss for SPEC
  4. Reduce Penalty: Second-Level Cache
  Reducing Misses: which apply to L2 Cache?
  L2 cache block size & A.M.A.T.
  Slide 48
  Example: Harvard Architecture
  Summary #1 / 2
  Summary #2 / 2: The Cache Design Space

Recap: The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
  [Diagram: Processor (Control + Datapath), Memory, Input, Output]
° Today's Topics:
  • Recap last lecture
  • Simple caching techniques
  • Many ways to improve cache performance
  • Virtual memory?

The Art of Memory System Design
  [Diagram: workload or benchmark programs drive the Processor, which issues a
   memory reference stream <op,addr>, <op,addr>, <op,addr>, ... to the cache ($)
   and main memory (MEM); op is i-fetch, read, or write.]
° Optimize the memory system organization to minimize the average memory access
  time for typical workloads.

Recap: Cache Performance

  Execution_Time = Instruction_Count x Cycle_Time
                   x (ideal CPI + Memory_Stalls/Inst + Other_Stalls/Inst)

  Memory_Stalls/Inst =
      Instruction Miss Rate x Instruction Miss Penalty
    + Loads/Inst  x Load Miss Rate  x Load Miss Penalty
    + Stores/Inst x Store Miss Rate x Store Miss Penalty

  Average Memory Access Time (AMAT)
    = Hit Time_L1 + (Miss Rate_L1 x Miss Penalty_L1)
    = (Hit Rate_L1 x Hit Time_L1) + (Miss Rate_L1 x Miss Time_L1)
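The formulas above invite a quick back-of-the-envelope calculation. The sketch
below, in C, plugs hypothetical numbers into the AMAT and
memory-stalls-per-instruction expressions; the 1-cycle hit time, 5% miss rates,
50-cycle penalties, and load/store frequencies are made-up figures for
illustration, not values from the lecture.

    #include <stdio.h>

    /* Average Memory Access Time, as defined on the "Recap: Cache Performance"
     * slide: AMAT = Hit Time_L1 + Miss Rate_L1 x Miss Penalty_L1.            */
    static double amat(double hit_time, double miss_rate, double miss_penalty)
    {
        return hit_time + miss_rate * miss_penalty;
    }

    /* Memory stalls per instruction: instruction-fetch misses plus load and
     * store misses, each weighted by its frequency, miss rate, and penalty.  */
    static double mem_stalls_per_inst(double i_miss_rate, double i_penalty,
                                      double loads_per_inst, double ld_miss_rate,
                                      double ld_penalty,
                                      double stores_per_inst, double st_miss_rate,
                                      double st_penalty)
    {
        return i_miss_rate * i_penalty
             + loads_per_inst  * ld_miss_rate * ld_penalty
             + stores_per_inst * st_miss_rate * st_penalty;
    }

    int main(void)
    {
        /* Hypothetical example numbers (not from the lecture): */
        double hit_time = 1.0, miss_rate = 0.05, miss_penalty = 50.0;
        double stalls = mem_stalls_per_inst(0.02, 50.0,        /* i-fetch: 2% miss rate   */
                                            0.30, 0.05, 50.0,  /* loads: 30% of insts     */
                                            0.10, 0.05, 50.0); /* stores: 10% of insts    */

        printf("AMAT                 = %.2f cycles\n", amat(hit_time, miss_rate, miss_penalty));
        printf("Memory stalls / inst = %.2f cycles\n", stalls);
        return 0;
    }

With these inputs the program prints an AMAT of 3.50 cycles (1 + 0.05 x 50) and
2.00 memory-stall cycles per instruction.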
Example: 1 KB Direct Mapped Cache with 32 B Blocks
° For a 2^N byte cache:
  • The uppermost (32 - N) bits are always the Cache Tag
  • The lowest M bits are the Byte Select (Block Size = 2^M)
  • One cache miss pulls in a complete "Cache Block" (or "Cache Line")
  [Diagram: the block address splits into a Cache Tag (ex: 0x50) and a Cache
   Index (ex: 0x01), followed by the Byte Select (ex: 0x00). Each entry holds a
   Valid Bit and the Cache Tag, stored as part of the cache "state", plus 32 B
   of Cache Data (Byte 0 ... Byte 31, Byte 32 ... Byte 63, ..., up to Byte 1023).]

Set Associative Cache
° N-way set associative: N entries for each Cache Index
  • N direct mapped caches operate in parallel
° Example: Two-way set associative cache
  • Cache Index selects a "set" from the cache
  • The two tags in the set are compared to the input in parallel
  • Data is selected based on the tag result
  [Diagram: the Cache Index selects one line (Valid, Cache Tag, Cache Data) from
   each of the two ways; two comparators check the address tag (Adr Tag), their
   outputs are ORed to produce Hit and drive the Sel1/Sel0 inputs of a mux that
   selects the Cache Block.]

Disadvantage of Set Associative Cache
° N-way Set Associative Cache versus Direct Mapped Cache:
  • N comparators vs. 1
  • Extra MUX delay for the data
  • Data comes AFTER the Hit/Miss decision and set selection
° In a direct mapped cache, the Cache Block is available BEFORE Hit/Miss:
  • Possible to assume a hit and continue; recover later if it was a miss.
  [Diagram: same two-way set-associative datapath as above.]
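To make the two-way lookup concrete, here is a minimal software sketch of the
organization described above, written in C. The sizes (32-byte blocks, 16 sets,
two ways) and every identifier are illustrative choices rather than anything
specified in the slides, and a real cache performs the two tag compares and the
data mux selection in parallel in hardware, while C can only express them
sequentially.

    #include <stdbool.h>
    #include <stdint.h>

    #define BLOCK_BYTES 32   /* 2^5-byte blocks: 5 Byte Select bits            */
    #define NUM_SETS    16   /* 2^4 sets: 4 Cache Index bits (illustrative)    */
    #define WAYS         2   /* two-way set associative                        */

    struct line {
        bool     valid;                 /* Valid Bit                            */
        uint32_t tag;                   /* Cache Tag: the upper address bits    */
        uint8_t  data[BLOCK_BYTES];     /* Cache Data: one cache block          */
    };

    static struct line cache[NUM_SETS][WAYS];

    /* Look up one byte; returns true on a hit and writes the byte to *out.
     * Field extraction follows the slides: Byte Select = low bits,
     * Cache Index = next bits, Cache Tag = everything above them.             */
    static bool lookup(uint32_t addr, uint8_t *out)
    {
        uint32_t byte_sel = addr & (BLOCK_BYTES - 1);
        uint32_t index    = (addr / BLOCK_BYTES) % NUM_SETS;
        uint32_t tag      = addr / (BLOCK_BYTES * NUM_SETS);

        /* In hardware both ways are examined in parallel; this loop stands in
         * for the two comparators, the OR gate, and the data mux.             */
        for (int way = 0; way < WAYS; way++) {
            const struct line *l = &cache[index][way];
            if (l->valid && l->tag == tag) {
                *out = l->data[byte_sel];   /* mux selects the hitting way     */
                return true;                /* Hit                             */
            }
        }
        return false;                       /* Miss: fetch the block, pick a victim */
    }

    int main(void)
    {
        /* Install one block by hand, then probe it (illustration only).       */
        cache[1][0] = (struct line){ .valid = true, .tag = 0x50 };
        cache[1][0].data[3] = 0xAB;

        uint8_t b = 0;
        uint32_t addr = (0x50u * NUM_SETS + 1) * BLOCK_BYTES + 3; /* tag 0x50, set 1, byte 3 */
        return (lookup(addr, &b) && b == 0xAB) ? 0 : 1;           /* 0 on the expected hit   */
    }

A direct-mapped cache is the WAYS = 1 special case of this loop, and a fully
associative cache is the NUM_SETS = 1 case, which is one way to see why the
three organizations on these slides differ mainly in how many comparators they
need.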
Example: Fully Associative
° Fully Associative Cache
  • Forget about the Cache Index
  • Compare the Cache Tags of all cache entries in parallel
  • Example: with a Block Size of 32 B, we need N 27-bit comparators
° By definition: Conflict Miss = 0 for a fully associative cache
  [Diagram: the address is just a 27-bit Cache Tag plus a Byte Select (ex: 0x01);
   each entry holds a Valid Bit, its own 27-bit Cache Tag, and a 32 B data block,
   and every entry's tag is compared (=) against the address tag in parallel.]

A Summary on Sources of Cache Misses
° Compulsory (cold start or process migration, first reference): first access to a block
  • "Cold" fact of life: not a whole lot you can do about it
  • Note: if you are going to run "billions" of instructions, Compulsory Misses are insignificant
° Capacity:
  • Cache cannot contain all blocks accessed by the program
  • Solution: increase cache size
° Conflict (collision):
  • Multiple memory locations mapped to the same cache location
  • Solution 1: increase cache size
  • Solution 2: increase associativity
° Coherence (Invalidation): other process (e.g., I/O) updates memory

Design options at constant cost

                     Direct Mapped    N-way Set Associative    Fully Associative
  Cache Size         Big              Medium                   Small
  Compulsory Miss    Same             Same                     Same
  Conflict Miss      High             Medium                   Zero
  Capacity Miss      Low              Medium                   High
  Coherence Miss     Same             Same                     Same

  Note: if you are going to run "billions" of instructions, Compulsory Misses are
  insignificant (except for streaming-media types of programs).

Four Questions for Caches and Memory Hierarchy
° Q1: Where can a block be placed in the upper level? (Block placement)
° Q2: How is a block found if it is in the upper level? (Block identification)
° Q3: Which block should be replaced on a miss?
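As a companion to the set-associative sketch above, this is roughly what the
fully associative lookup looks like in software: there is no Cache Index, so the
tag is checked against every entry. The entry count and all identifiers are
illustrative rather than taken from the slides, and the loop stands in for the N
parallel 27-bit comparators a hardware implementation would use. Because any
block may live in any entry, a miss here is never a conflict miss, only
compulsory or capacity.

    #include <stdbool.h>
    #include <stdint.h>

    #define BLOCK_BYTES 32   /* 32 B blocks: 5 Byte Select bits, as in the slides */
    #define NUM_ENTRIES 32   /* illustrative entry count (1 KB of data in total)  */

    struct fa_line {
        bool     valid;
        uint32_t tag;                   /* block address: upper 27 bits of 32     */
        uint8_t  data[BLOCK_BYTES];
    };

    static struct fa_line fa_cache[NUM_ENTRIES];

    /* Fully associative lookup: no Cache Index, just Cache Tag + Byte Select.   */
    static bool fa_lookup(uint32_t addr, uint8_t *out)
    {
        uint32_t byte_sel = addr & (BLOCK_BYTES - 1);
        uint32_t tag      = addr / BLOCK_BYTES;

        for (int i = 0; i < NUM_ENTRIES; i++) {      /* one comparator per entry */
            if (fa_cache[i].valid && fa_cache[i].tag == tag) {
                *out = fa_cache[i].data[byte_sel];
                return true;                          /* Hit                     */
            }
        }
        /* Miss: any entry may be chosen as the victim, since block placement is
         * unrestricted; that choice is exactly Q3 above, block replacement.     */
        return false;
    }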

