Unformatted text preview:

CS61C Machine Structures Lecture 33 Caches III 4 14 2006 John Wawrzynek www cs berkeley edu johnw www inst eecs berkeley edu cs61c CS 61C L33 Caches III 1 Wawrzynek Spring 2006 UCB Review Why We Use Caches CPU Moore s Law 100 1999 2000 1998 1996 1994 1995 1993 1992 1991 1989 1990 1988 1986 1987 1985 1984 1982 1983 1981 1980 10 1 Proc 60 yr Processor Memory Performance Gap grows 50 year DRAM DRAM 7 yr 1997 Performance 1000 1989 first Intel CPU with cache on chip 1998 Pentium III has two levels of cache on chip CS 61C L33 Caches III 2 Wawrzynek Spring 2006 UCB Fully Associative Cache 2 3 Fully Associative Cache e g 32 B block compare tags in parallel 4 0 Byte Offset Cache Tag 27 bits long Cache Tag Valid Cache Data B 31 31 B1 B0 CS 61C L33 Caches III 3 Wawrzynek Spring 2006 UCB Fully Associative Cache 3 3 Benefit of Fully Assoc Cache No Conflict Misses since data can go anywhere Drawbacks of Fully Assoc Cache Need hardware comparator for every single entry if we have a 64KB of data in cache with 4B entries we need 16K comparators very expensive Alternatively use fewer comparisons but compare sequentially too slow CS 61C L33 Caches III 4 Wawrzynek Spring 2006 UCB Third Type of Cache Miss Capacity Misses miss that occurs because the cache has a limited size miss that would not occur if we increase the size of the cache sketchy definition so just get the general idea This is the primary type of miss for Fully Associative caches CS 61C L33 Caches III 5 Wawrzynek Spring 2006 UCB N Way Set Associative Cache 1 4 Memory address fields Tag same as before Offset same as before Index points us to the correct row called a set in this case So what s the difference each set contains multiple blocks once we ve found correct set must compare with all tags in that set to find our data CS 61C L33 Caches III 6 Wawrzynek Spring 2006 UCB Set Associative Cache Example Memory Address Memory 0 1 2 3 4 5 6 7 8 9 A B C D E F Cache Index 0 0 1 1 Here s a simple 2 way set associative cache CS 61C L33 Caches III 7 Wawrzynek Spring 2006 UCB N Way Set Associative Cache 2 4 Summary cache is direct mapped w respect to sets each set is fully associative basically N direct mapped caches working in parallel each has its own valid bit and data CS 61C L33 Caches III 8 Wawrzynek Spring 2006 UCB N Way Set Associative Cache 3 4 Given memory address Find correct set using Index value Compare Tag with all Tag values in the determined set If a match occurs hit otherwise a miss Finally use the offset field as usual to find the desired data within the block CS 61C L33 Caches III 9 Wawrzynek Spring 2006 UCB N Way Set Associative Cache 4 4 What s so great about this even a 2 way set assoc cache avoids a lot of conflict misses hardware cost isn t that bad only need N comparators In fact for a cache with M blocks it s Direct Mapped if it s 1 way set assoc it s Fully Assoc if it s M way set assoc so these two are just special cases of the more general set associative design CS 61C L33 Caches III 10 Wawrzynek Spring 2006 UCB 4 Way Set Associative Cache Circuit tag index CS 61C L33 Caches III 11 Wawrzynek Spring 2006 UCB Block Replacement Policy 2 2 If there are any locations with valid bit off empty then usually write the new block into the first one If all possible locations already have a valid block we must pick a replacement policy rule by which we determine which block gets cached out on a miss CS 61C L33 Caches III 12 Wawrzynek Spring 2006 UCB Block Replacement Policy LRU LRU Least Recently Used Idea cache out block which has been accessed read or write least recently Pro temporal locality recent past use implies likely future use in fact this is a very effective policy Con with 2 way set assoc easy to keep track one LRU bit with 4 way or greater requires complicated hardware and much time to keep track of this CS 61C L33 Caches III 13 Wawrzynek Spring 2006 UCB Block Replacement Example We have a 2 way set associative cache with a four word total capacity and one word blocks We perform the following word accesses ignore bytes for this problem 0 2 0 1 4 0 2 3 5 4 How many hits and how many misses will there be for the LRU block replacement policy CS 61C L33 Caches III 14 Wawrzynek Spring 2006 UCB Block Replacement Example LRU loc 0 loc 1 set 0 0 lru Addresses 0 2 0 1 4 0 0 miss bring into set 0 loc 0 2 miss bring into set 0 loc 1 0 hit 1 miss bring into set 1 loc 0 4 miss bring into set 0 loc 1 replace 2 set 1 set 0 lru 0 2 set 1 set 0 lru 0 lru2 set 1 set 0 set 1 0 1 lru 2 lru set 0 lru0 lru4 2 set 1 1 lru set 0 0 0 hit set 1 1 CS 61C L33 Caches III 15 lru lru lru 4 lru Wawrzynek Spring 2006 UCB Big Idea How to choose between associativity block size replacement policy Design against a performance model Minimize Average Memory Access Time Hit Time Miss Penalty x Miss Rate influenced by technology program behavior Create the illusion of a memory that is large cheap and fast on average CS 61C L33 Caches III 16 Wawrzynek Spring 2006 UCB Example Assume Hit Time 1 cycle Miss rate 5 Miss penalty 20 cycles Calculate AMAT Avg mem access time 1 0 05 x 20 1 1 cycles 2 cycles CS 61C L33 Caches III 17 Wawrzynek Spring 2006 UCB Administrivia Do your reading VM is coming up and it s shown to be hard for students Project 5 out Exam Wed 4 19 1 Pimentel 7 9pm Covers weeks 6 12 focus on lecture material TA Review Monday evening CS 61C L33 Caches III 18 Wawrzynek Spring 2006 UCB Ways to reduce miss rate Larger cache limited by cost and technology hit time of first level cache cycle time More places in the cache to put each block of memory associativity fully associative any block any line N way set associated N places for each block direct map N 1 CS 61C L33 Caches III 19 Wawrzynek Spring 2006 UCB Improving Miss Penalty When caches first became popular Miss Penalty 10 processor clock cycles Today 2400 MHz Processor 0 4 ns per clock cycle and 80 ns to go to DRAM 200 processor clock cycles MEM 2 DRAM Proc Solution another cache between memory and the processor cache Second Level L2 Cache CS 61C L33 Caches III 20 Wawrzynek …


View Full Document

Berkeley COMPSCI 61C - Lecture Notes

Documents in this Course
SIMD II

SIMD II

8 pages

Midterm

Midterm

7 pages

Lecture 7

Lecture 7

31 pages

Caches

Caches

7 pages

Lecture 9

Lecture 9

24 pages

Lecture 1

Lecture 1

28 pages

Lecture 2

Lecture 2

25 pages

VM II

VM II

4 pages

Midterm

Midterm

10 pages

Load more
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?