Berkeley COMPSCI 152 - Lecture 22 Advanced Caching

CS152 Computer Architecture and Engineering
Lecture 22: Advanced Caching
April 23, 2003
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
Lecture slides: http://inst.eecs.berkeley.edu/~cs152/
4/23/03, UCB Spring 2003, CS152 / Kubiatowicz

Recap: Set Associative Cache
  - N-way set associative: N entries for each Cache Index
      - N direct-mapped caches operating in parallel
  - Example: two-way set associative cache
      - Cache Index selects a "set" from the cache
      - The two tags in the set are compared to the input tag in parallel
      - Data is selected based on the tag comparison result
  [Figure: two-way set associative cache. The Cache Index selects one entry (Valid, Cache Tag, Cache Data block) in each way; each stored tag is compared with the address tag (Adr Tag); the compare outputs drive the mux selects Sel0/Sel1 that pick the Cache Block and are ORed to produce Hit.]

Recap: Cache Performance

$$\text{Execution Time} = \text{Instruction Count} \times \text{Cycle Time} \times \left(\text{ideal CPI} + \frac{\text{Memory Stalls}}{\text{Inst}} + \frac{\text{Other Stalls}}{\text{Inst}}\right)$$

$$\frac{\text{Memory Stalls}}{\text{Inst}} = \text{Instruction Miss Rate} \times \text{Instruction Miss Penalty} + \frac{\text{Loads}}{\text{Inst}} \times \text{Load Miss Rate} \times \text{Load Miss Penalty} + \frac{\text{Stores}}{\text{Inst}} \times \text{Store Miss Rate} \times \text{Store Miss Penalty}$$

$$\text{AMAT} = \text{Hit Time}_{L1} + \left(\text{Miss Rate}_{L1} \times \text{Miss Penalty}_{L1}\right) = \left(\text{Hit Rate}_{L1} \times \text{Hit Time}_{L1}\right) + \left(\text{Miss Rate}_{L1} \times \text{Miss Time}_{L1}\right)$$

  where AMAT is the Average Memory Access Time.

Recap: A Summary on Sources of Cache Misses
  - Compulsory (cold start or process migration, first reference): first access to a block
      - "Cold" fact of life: not a whole lot you can do about it
      - Note: if you are going to run "billions" of instructions, compulsory misses are insignificant
  - Conflict (collision): multiple memory locations mapped to the same cache location
      - Solution 1: increase cache size
      - Solution 2: increase associativity
  - Capacity: cache cannot contain all blocks accessed by the program
      - Solution: increase cache size
  - Coherence (invalidation): another process (e.g., I/O) updates memory

The Big Picture: Where Are We Now?
  - The five classic components of a computer: Processor (Control and Datapath), Memory, Input, Output
  - Today's topics:
      - Recap of last lecture
      - Virtual Memory
      - Protection
      - TLB
      - Buses
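The performance formulas in the recap can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names and all numeric values are illustrative assumptions. It evaluates both forms of the AMAT equation (taking Miss Time = Hit Time + Miss Penalty) and the memory-stalls-per-instruction sum.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = Hit Time + Miss Rate x Miss Penalty (times in cycles)."""
    return hit_time + miss_rate * miss_penalty


def amat_weighted(hit_time, miss_rate, miss_penalty):
    """Equivalent form: Hit Rate x Hit Time + Miss Rate x Miss Time,
    assuming Miss Time = Hit Time + Miss Penalty."""
    hit_rate = 1.0 - miss_rate
    miss_time = hit_time + miss_penalty
    return hit_rate * hit_time + miss_rate * miss_time


def memory_stalls_per_inst(i_miss_rate, i_penalty,
                           loads_per_inst, load_miss_rate, load_penalty,
                           stores_per_inst, store_miss_rate, store_penalty):
    """Sum of instruction-fetch, load, and store stall cycles per instruction."""
    return (i_miss_rate * i_penalty
            + loads_per_inst * load_miss_rate * load_penalty
            + stores_per_inst * store_miss_rate * store_penalty)


# Hypothetical values chosen only for illustration (not from the slides).
HIT_TIME = 1.0        # cycles
MISS_RATE = 0.05      # 5% of accesses miss in L1
MISS_PENALTY = 20.0   # cycles

if __name__ == "__main__":
    print("AMAT (form 1):", amat(HIT_TIME, MISS_RATE, MISS_PENALTY))
    print("AMAT (form 2):", amat_weighted(HIT_TIME, MISS_RATE, MISS_PENALTY))
    print("Stalls/inst:", memory_stalls_per_inst(0.01, 20.0,
                                                 0.25, 0.04, 20.0,
                                                 0.10, 0.04, 20.0))
```

Both AMAT forms give the same value because the second simply distributes the hit time over hits and misses; which one is more convenient depends on whether the miss penalty or the total miss time is the measured quantity.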

How Do You Design a Memory System?
  - Set of operations that must be supported:
      - read:  data <= Mem[Physical Address]
      - write: Mem[Physical Address] <= Data
  [Figure: memory as a "black box" with Physical Address, Read/Write, and Data ports. Inside it has: tag-data storage, muxes, comparators, ...]
  - Determine the internal register transfers
  - Design the datapath
  - Design the cache controller
  [Figure: cache datapath (Address, Data In, Data Out) and cache controller (signals R/W, Active, wait), connected through control points.]

Impact on Cycle Time
  - Cache hit time:
      - directly tied to clock rate
      - increases with cache size
      - increases with associativity
  [Figure: pipelined datapath showing PC, I-Cache, IR, IRex, A, B, IRm, R, D-Cache, IRwb, T, with the I-Cache miss ("invalid") and D-Cache miss paths marked.]

$$\text{Average Memory Access Time} = \text{Hit Time} + \text{Miss Rate} \times \text{Miss Penalty}$$

$$\text{Time} = \text{IC} \times \text{CT} \times \left(\text{ideal CPI} + \text{memory stalls}\right)$$

Improving Cache Performance: 3 General Options

$$\text{Average Memory Access Time} = \text{Hit Time} + \left(\text{Miss Rate} \times \text{Miss Penalty}\right) = \left(\text{Hit Rate} \times \text{Hit Time}\right) + \left(\text{Miss Rate} \times \text{Miss Time}\right)$$

  Options to reduce AMAT:
  1. Reduce the miss rate,
  2. Reduce the miss penalty, or
  3. Reduce the time to hit in the cache.

3Cs Absolute Miss Rate (SPEC92)
  [Figure: miss rate per type (0 to 0.14) versus cache size (1 to 128 KB) for 1-way, 2-way, 4-way, and 8-way associativity, split into compulsory, capacity, and conflict misses.]

2:1 Cache Rule
  miss rate of a 1-way associative cache of size X ≈ miss rate of a 2-way associative cache of size X/2
  [Figure: same plot as the 3Cs absolute miss rate figure.]

3Cs Relative Miss Rate
  [Figure: miss rate per type as a percentage of total misses (0% to 100%) versus cache size (1 to 128 KB) for 1-way, 2-way, 4-way, and 8-way associativity, split into compulsory, capacity, and conflict misses.]

1. Reduce Misses via Larger Block Size
  [Figure: miss rate (0% to 25%) versus block size (16 to 256 bytes) for cache sizes of 1K, 4K, 16K, 64K, and 256K.]

2. Reduce Misses via Higher Associativity
  - 2:1 Cache Rule: miss rate of a direct-mapped (DM) cache of size N ≈ miss rate of a 2-way cache of size N/2
  - Beware: execution time is the only final measure!
      - Will clock cycle time increase?
      - Hill [1988] suggested hit time for 2-way vs. 1-way: external cache +10%, internal +2%

Example: Avg. Memory Access Time vs. Miss Rate
  Assume clock cycle time (CCT) = 1.10 for 2-way, 1.12 for 4-way, 1.14 for 8-way, relative to the CCT of direct mapped.

  Cache Size (KB)   1-way   2-way   4-way   8-way
        1           2.33    2.15    2.07    2.01
        2           1.98    1.86    1.76    1.68
        4           1.72    1.67    1.61    1.53
        8           1.46    1.48    1.47    1.43
       16           1.29    1.32    1.32    1.32
       32           1.20    1.24    1.25    1.27
       64           1.14    1.20    1.21    1.23
      128           1.10    1.17    1.18    1.20

  (On the original slide, red entries mark cases where A.M.A.T. is not improved by more associativity.)

3. Reducing Misses via a "Victim Cache"
  - How to combine the fast hit time of direct mapped yet still avoid conflict misses?
  - Add a buffer to place data discarded from the cache
  - Jouppi [1990]: a 4-entry victim cache removed 20% to 95% of conflicts for a 4 KB direct-mapped data cache
  - Used in Alpha, HP machines
  [Figure: direct-mapped cache (TAGS, DATA) alongside four victim-cache entries, each with a tag and comparator and one cache line of data, connected to the next lower level in the hierarchy.]

4. Reducing Misses by Hardware Prefetching
  - E.g., instruction prefetching
      - Alpha 21064 fetches 2 blocks on a miss
      - Extra block placed in a "stream buffer"
      - On a miss, check the stream buffer
  - Works with data blocks too:
      - Jouppi [1990]: 1 data stream buffer got 25% of misses from a 4KB cache; 4 streams got 43%
      - Palacharla & Kessler [1994]: for scientific programs, 8 streams got 50% to 70% of misses from two 64KB, 4-way set associative caches
  - Prefetching relies on having extra memory bandwidth that can be used without penalty
      - Could reduce performance if done indiscriminately

5. Reducing Misses by Software Prefetching Data
  - Data prefetch
      - Load data into register (HP PA-RISC loads)
      - Cache prefetch: load into cache (MIPS IV, PowerPC, …
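
Returning to technique 3, the victim-cache idea can be illustrated with a small simulation. This is a minimal Python sketch under stated assumptions, not the design from the lecture or from Jouppi's paper: the function name `simulate`, the use of block addresses modulo the number of sets as the index, and the oldest-first replacement in the victim buffer are all choices made here for illustration. On a victim hit the requested block and the evicted block swap places, so no access to the next level is needed.

```python
from collections import OrderedDict


def simulate(trace, num_sets, victim_entries):
    """Count accesses that must go to the next level of the hierarchy.

    trace          -- iterable of block addresses (hypothetical input format)
    num_sets       -- number of lines in the direct-mapped cache
    victim_entries -- size of the fully associative victim buffer (0 disables it)
    """
    cache = [None] * num_sets   # cache[index] holds one block address or None
    victim = OrderedDict()      # block address -> True, oldest entry first
    next_level_accesses = 0

    for block in trace:
        index = block % num_sets
        if cache[index] == block:
            continue            # hit in the direct-mapped cache

        evicted = cache[index]
        if block in victim:
            del victim[block]   # victim hit: block moves back into the cache
        else:
            next_level_accesses += 1   # miss in both structures

        cache[index] = block
        if evicted is not None and victim_entries > 0:
            victim[evicted] = True     # discarded line goes to the victim buffer
            if len(victim) > victim_entries:
                victim.popitem(last=False)   # drop the oldest victim entry

    return next_level_accesses


if __name__ == "__main__":
    # Blocks 0 and 8 collide in an 8-line direct-mapped cache.
    trace = [0, 8] * 10
    print("no victim cache:   ", simulate(trace, 8, 0))
    print("4-entry victim buf:", simulate(trace, 8, 4))
```

With this ping-pong trace, every access misses in the plain direct-mapped cache, while the victim buffer absorbs all but the two initial cold misses. Real workloads will show a smaller benefit that depends on how many lines conflict at once.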

