Berkeley COMPSCI 152 - Lecture 22 Advanced Caching

CS152: Computer Architecture and Engineering
Lecture 22: Advanced Caching
April 23, 2003 — ©UCB Spring 2003
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
Lecture slides: http://inst.eecs.berkeley.edu/~cs152/

Recap: Set Associative Cache
° N-way set associative: N entries for each cache index
• N direct-mapped caches operate in parallel
° Example: two-way set associative cache
• The cache index selects a "set" from the cache
• The two tags in the set are compared to the address tag in parallel
• Data is selected based on the tag comparison result
[Figure: two ways, each with valid/tag/data arrays; the address tag is compared against both ways' tags, the compare results are ORed to produce Hit and drive a mux (Sel1/Sel0) that selects the cache block]

Recap: Cache Performance
Execution_Time = Instruction_Count x Cycle_Time
                 x (ideal CPI + Memory_Stalls/Inst + Other_Stalls/Inst)
Memory_Stalls/Inst = Instruction Miss Rate x Instruction Miss Penalty
                     + Loads/Inst x Load Miss Rate x Load Miss Penalty
                     + Stores/Inst x Store Miss Rate x Store Miss Penalty
Average Memory Access Time (AMAT) = Hit Time_L1 + (Miss Rate_L1 x Miss Penalty_L1)
                                  = (Hit Rate_L1 x Hit Time_L1) + (Miss Rate_L1 x Miss Time_L1)

Recap: A Summary of Sources of Cache Misses
° Compulsory (cold start, process migration, first reference): the first access to a block
• A "cold" fact of life: not a whole lot you can do about it
• Note: if you are going to run billions of instructions, compulsory misses are insignificant
° Conflict (collision): multiple memory locations mapped to the same cache location
• Solution 1: increase cache size
• Solution 2: increase associativity
° Capacity: the cache cannot contain all blocks accessed by the program
• Solution: increase cache size
° Coherence (invalidation): another process (e.g., I/O) updates memory

The Big Picture: Where Are We Now?
° The Five Classic Components of a Computer: Processor (Control + Datapath), Memory, Input, Output
° Today's Topics:
• Recap of last lecture
• Virtual Memory
• Protection
• TLB
• Buses

How Do You Design a Memory System?
° Set of operations that must be supported
• read: Data <= Mem[Physical Address]
• write: Mem[Physical Address] <= Data
° Determine the internal register transfers
° Design the Datapath
° Design the Cache Controller
[Figure: the memory is a "black box" taking Physical Address, Read/Write, and Data; inside, a Cache Controller drives control points/signals on a Cache DataPath (tag/data storage, muxes, comparators), with Address, Data In, Data Out, R/W, Active, and wait signals]

Impact on Cycle Time
[Figure: pipeline with PC, I-Cache, IR, register stages (A, B), execute, D-Cache, and write-back; a cache miss or invalid access stalls the pipeline]
° Cache hit time:
• directly tied to clock rate
• increases with cache size
• increases with associativity
Average Memory Access Time = Hit Time + Miss Rate x Miss Penalty
Time = IC x CT x (ideal CPI + memory stalls)

Improving Cache Performance: 3 General Options
Options to reduce AMAT:
1. Reduce the miss rate,
2. Reduce the miss penalty, or
3. Reduce the time to hit in the cache.
Average Memory Access Time = Hit Time + (Miss Rate x Miss Penalty)
                           = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)
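The AMAT and memory-stall formulas above can be turned directly into a small calculation. A minimal Python sketch; every numeric input below (miss rates, penalties, instruction mix) is a hypothetical placeholder, not a figure from the lecture:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time in cycles: hit time plus the
    miss rate weighted by the miss penalty."""
    return hit_time + miss_rate * miss_penalty

def memory_stalls_per_inst(imiss_rate, imiss_penalty,
                           loads_per_inst, load_miss_rate, load_penalty,
                           stores_per_inst, store_miss_rate, store_penalty):
    """Memory stall cycles per instruction, summing the instruction-fetch,
    load, and store miss contributions from the slide's formula."""
    return (imiss_rate * imiss_penalty
            + loads_per_inst * load_miss_rate * load_penalty
            + stores_per_inst * store_miss_rate * store_penalty)

# Hypothetical numbers: 1-cycle hit, 5% miss rate, 40-cycle miss penalty.
print(amat(1, 0.05, 40))  # 1 + 0.05 * 40 = 3.0 cycles
# Hypothetical mix: 2% I-miss rate, 30% loads at 5% misses,
# 10% stores at 8% misses, all with a 40-cycle penalty.
print(memory_stalls_per_inst(0.02, 40, 0.30, 0.05, 40, 0.10, 0.08, 40))
# 0.8 + 0.6 + 0.32 ≈ 1.72 stall cycles per instruction
```

Note that AMAT only summarizes the memory system; execution time also depends on how many memory accesses each instruction makes, which is what the stalls-per-instruction form captures.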
3Cs Absolute Miss Rate (SPEC92)
[Figure: absolute miss rate (0 to 0.14) vs. cache size (1 KB to 128 KB) for 1-way, 2-way, 4-way, and 8-way caches, decomposed into Capacity, Compulsory, and Conflict components]

2:1 Cache Rule
miss rate of a 1-way associative cache of size X
  = miss rate of a 2-way associative cache of size X/2
[Figure: the same miss-rate-vs-cache-size plot, annotated to illustrate the rule]

3Cs Relative Miss Rate
[Figure: the same data normalized so each cache size totals 100%, showing the relative shares of Capacity, Compulsory, and Conflict misses]

1. Reduce Misses via Larger Block Size
[Figure: miss rate (0% to 25%) vs. block size (16 to 256 bytes) for cache sizes of 1 KB, 4 KB, 16 KB, 64 KB, and 256 KB]

2. Reduce Misses via Higher Associativity
° 2:1 Cache Rule:
• Miss rate of a direct-mapped cache of size N ≈ miss rate of a 2-way cache of size N/2
° Beware: execution time is the only final measure!
• Will clock cycle time increase?
• Hill [1988] suggested hit time for 2-way vs. 1-way: external cache +10%, internal +2%

Example: Avg. Memory Access Time vs. Miss Rate
° Assume CCT = 1.10 for 2-way, 1.12 for 4-way, 1.14 for 8-way vs. the CCT of direct mapped

Cache Size (KB)   1-way   2-way   4-way   8-way
      1           2.33    2.15    2.07    2.01
      2           1.98    1.86    1.76    1.68
      4           1.72    1.67    1.61    1.53
      8           1.46    1.48    1.47    1.43
     16           1.29    1.32    1.32    1.32
     32           1.20    1.24    1.25    1.27
     64           1.14    1.20    1.21    1.23
    128           1.10    1.17    1.18    1.20

(Red in the original slide marks entries where A.M.A.T. is not improved by more associativity)

3. Reducing Misses via a "Victim Cache"
° How can we combine the fast hit time of a direct-mapped cache yet still avoid conflict misses?
[Figure: a direct-mapped cache's tag and data arrays, plus a small fully associative victim cache of four lines (each with its own tag and comparator) sitting between the cache and the next lower level of the hierarchy]
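The trade-off behind the table above can be reproduced with the AMAT formula: higher associativity lowers the miss rate but stretches the hit time by the clock-cycle-time (CCT) factor. A sketch using the slide's CCT factors; the miss rates and miss penalty below are hypothetical placeholders, since the SPEC92 miss rates underlying the slide's table are not shown in this preview:

```python
# AMAT is measured in units of the direct-mapped clock cycle.
CCT = {1: 1.00, 2: 1.10, 4: 1.12, 8: 1.14}  # cycle-time factors from the slide
MISS_PENALTY = 25                            # hypothetical, in 1-way cycles

def amat_vs_assoc(miss_rates):
    """miss_rates: {associativity: miss rate}. Hit time is one cycle,
    stretched by that associativity's CCT factor."""
    return {ways: CCT[ways] * 1.0 + miss_rates[ways] * MISS_PENALTY
            for ways in sorted(miss_rates)}

# Hypothetical miss rates for one small cache size:
rates = {1: 0.053, 2: 0.042, 4: 0.040, 8: 0.039}
for ways, t in amat_vs_assoc(rates).items():
    print(f"{ways}-way: AMAT = {t:.2f}")
```

With numbers like these, going from 1-way to 2-way wins, but the smaller miss-rate gains at 4- and 8-way can be eaten by the longer cycle, which is exactly the pattern the red entries in the slide's table mark.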
° Add a small buffer to hold data discarded from the cache
° Jouppi [1990]: a 4-entry victim cache removed 20% to 95% of conflict misses for a 4 KB direct-mapped data cache
° Used in Alpha and HP machines

4. Reducing Misses by Hardware Prefetching
° E.g., instruction prefetching
• The Alpha 21064 fetches 2 blocks on a miss
• The extra block is placed in a "stream buffer"
• On a miss, check the stream buffer
° Works with data blocks too:
• Jouppi [1990]: 1 data stream buffer caught 25% of misses from a 4 KB cache; 4 streams caught 43%
• Palacharla & Kessler [1994]: for scientific programs, 8 streams caught 50% to 70% of misses from two 64 KB, 4-way set associative caches
° Prefetching relies on having extra memory bandwidth that can be used without penalty
• It could reduce performance if done indiscriminately!

° Data prefetch
• Load data into a register (HP PA-RISC loads)
• Cache prefetch: load into the cache
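The victim-cache idea above can be illustrated with a toy model: a direct-mapped cache backed by a small fully associative buffer that catches lines just evicted by conflicts. The sizes, eviction policy details, and address trace below are illustrative assumptions, not taken from the slide or from Jouppi's design:

```python
from collections import OrderedDict

class DirectMappedWithVictim:
    """Toy model: a direct-mapped cache plus a small fully associative
    victim buffer holding recently evicted lines (FIFO replacement here;
    real designs vary). Sizes and the trace below are illustrative."""

    def __init__(self, num_sets, victim_entries=4):
        self.num_sets = num_sets
        self.lines = {}              # set index -> resident block address
        self.victim = OrderedDict()  # block address -> True, oldest first
        self.victim_entries = victim_entries

    def access(self, block_addr):
        """Return 'hit', 'victim_hit', or 'miss' for one block access."""
        idx = block_addr % self.num_sets
        if self.lines.get(idx) == block_addr:
            return "hit"
        if block_addr in self.victim:
            # Swap the victim line with the conflicting resident line.
            self.victim.pop(block_addr)
            evicted = self.lines.get(idx)
            if evicted is not None:
                self._insert_victim(evicted)
            self.lines[idx] = block_addr
            return "victim_hit"
        # True miss: fill from the next level; the old line moves
        # into the victim buffer instead of being discarded outright.
        evicted = self.lines.get(idx)
        if evicted is not None:
            self._insert_victim(evicted)
        self.lines[idx] = block_addr
        return "miss"

    def _insert_victim(self, block_addr):
        self.victim[block_addr] = True
        if len(self.victim) > self.victim_entries:
            self.victim.popitem(last=False)  # drop the oldest entry

# Blocks 0 and 4 conflict in a 4-set direct-mapped cache (both map to set 0),
# so a plain direct-mapped cache would miss on every access in this trace:
cache = DirectMappedWithVictim(num_sets=4)
print([cache.access(b) for b in [0, 4, 0, 4, 0]])
# → ['miss', 'miss', 'victim_hit', 'victim_hit', 'victim_hit']
```

After the two compulsory misses, every conflict between the two blocks is absorbed by the victim buffer, which is the effect behind Jouppi's 20% to 95% conflict-miss reduction.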

