DOC PREVIEW
CMU CS 15213 - Cache Performance

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Intel Pentium III Cache Hierarchy 15 213 The course that gives CMU its Zip Cache Performance October 3 2007 Regs Topics L1 Data 1 cycle latency 16 KB 4 way assoc Write through 32B lines L1 Instruction 16 KB 4 way 32B lines Impact of caches on performance The memory mountain L2 L2Unified Unified 128KB 2 128KB 2MB MB 4 way 4 wayassoc assoc Write back Write back Write Writeallocate allocate 32B 32Blines lines Main Main Memory Memory Up Upto to4GB 4GB Processor Processor Chip Chip Cache Performance Metrics Writing Cache Friendly Code Miss Rate Repeated references to variables are good temporal locality locality Fraction of memory references not found in cache misses references Typical numbers z 3 10 for L1 z can be quite small e g 1 for L2 depending on size etc StrideStride 1 reference patterns are good spatial spatial locality locality Examples cold Hit Time Time to deliver a line in the cache to the processor includes time to determine whether the line is in the cache Typical numbers Aside for architects z 1 2 clock cycle for L1 Increasing cache size z 5 20 clock cycles for L2 Miss Penalty 3 15 213 F 07 2 class11 ppt int sum array cols int a M N int i j sum 0 for i 0 i M i for j 0 j N j sum a i j return sum for j 0 j N j for i 0 i M i sum a i j return sum Increasing block size Increasing associativity Additional time required because of a miss z Typically 50 200 cycles for main memory Trend increasing cache 4 byte words 4 word cache blocks int sum array rows int a M N int i j sum 0 15 213 F 07 4 Page 1 Miss rate 1 4 25 Miss rate 100 15 213 F 07 The Memory Mountain Memory Mountain Test Function Read throughput read bandwidth The test function void test int elems int stride int i result 0 volatile int sink Number of bytes read from memory per second MB s Memory mountain Measured read throughput as a function of spatial and temporal locality Compact way to characterize memory system performance for i 0 i elems i stride result data i sink result So compiler doesn t optimize away the loop Run test elems stride and return read throughput MB s double run int size int stride double Mhz double cycles int elems size sizeof int test elems stride warm up the cache cycles fcyc2 test elems stride 0 call test elems stride return size stride cycles Mhz convert cycles to MB s 15 213 F 07 400 xe L2 15 213 F 07 8 Page 2 8k 32k 128k 512k Stride words s9 s5 s7 s1 mem 2k 200 0 7 600 2m Slopes of Spatial Locality init data data MAXELEMS Initialize each element in data to 1 Mhz mhz 0 Estimate the clock frequency for size MAXBYTES size MINBYTES size 1 for stride 1 stride MAXSTRIDE stride printf 1f t run size stride Mhz printf n exit 0 L1 800 8m Working set size in bytes Stride in array elements Clock frequency 1000 s13 The array we ll be traversing 1200 s15 Throughput MB sec mountain c Generate the memory mountain define MINBYTES 1 10 Working set size ranges from 1 KB define MAXBYTES 1 23 up to 8 MB define MAXSTRIDE 16 Strides range from 1 to 16 define MAXELEMS MAXBYTES sizeof int int main int size int stride double Mhz Pentium III 550 MHz 16 KB on chip L1 d cache 16 KB on chip L1 i cache 512 KB off chip unified L2 cache The Memory Mountain s11 Memory Mountain Main Routine int data MAXELEMS 15 213 F 07 6 s3 5 Ridges of Temporal Locality Working set size bytes 15 213 F 07 X86 64 Memory Mountain Pentium Nocona Xeon x86 64 3 2 GHz 12 Kuop on chip L1 trace cache 16 KB on chip L1 d cache 1 MB off chip unified L2 cache Read Throughput MB s 6000 Slopes of Spatial Locality Opteron Memory Mountain 5000 3000 AMD Opteron 2 GHZ L1 2500 L1 2000 4000 3000 Read throughput MB s L2 4k 1000 L2 1000 Ridges of Temporal Locality 2000 1500 4k 500 128k 128k 4m s29 s21 s17 Stride words s25 10 s9 s5 s1 15 213 F 07 s13 s25 Mem 0 128m s29 s17 s21 s9 s13 Mem Stride w ords 9 Working Set Size bytes 4m s5 s1 0 Working set bytes 128m Ridges of Temporal Locality A Slope of Spatial Locality Slice through the memory mountain with stride 1 Slice through memory mountain with size 256KB illuminates read throughputs of different caches and memory 15 213 F 07 shows cache block size 800 1200 main memory region L2 cache region L1 cache region 700 read throughput MB s 800 600 400 11 600 500 one access per cache line 400 300 200 working set size bytes s1 1k 2k 4k 8k 16k 32k 64k 128k 256k 512k 2m 0 1024k 100 0 4m 200 8m read througput MB s 1000 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 stride words 15 213 F 07 12 Page 3 15 213 F 07 Matrix Multiplication Example Miss Rate Analysis for Matrix Multiply Major Cache Effects to Consider Assume Total cache size e g use blocking Block size z Exploit spatial locality Description Multiply N x N matrices O N3 total operations Accesses Line size 32B big enough for four 64 bit words Matrix dimension N is very large Cache is not even big enough to hold multiple rows z Exploit temporal locality and keep the working set small z Approximate 1 N as 0 0 Variable sum ijk ijk held in register for i 0 i n i for i 0 i n i for for j 0 j 0 j n j n j j sum sum 0 0 0 0 for for k 0 k 0 k n k n k k sum sum a i k a i k b k j b k j c i j c i j sum sum Analysis Method Look at access pattern of inner loop k i j j k A i C B z N reads per source element z N values summed per destination but may be able to hold in register 13 15 213 F 07 15 213 F 07 14 Layout of C Arrays in Memory review Matrix Multiplication ijk C arrays allocated in rowrow major order each row in contiguous memory locations ijk ijk for for i 0 i 0 i n i n i i for for j 0 j 0 j n j n j j sum sum 0 0 0 0 for for k 0 k 0 k n k n k k sum sum a i k a i k b k j b k j c i j c i j sum sum Stepping through columns in one row for i 0 i N i sum a 0 i accesses …


View Full Document

CMU CS 15213 - Cache Performance

Documents in this Course
lecture

lecture

14 pages

lecture

lecture

46 pages

Caches

Caches

9 pages

lecture

lecture

39 pages

Lecture

Lecture

36 pages

Lecture

Lecture

45 pages

Lecture

Lecture

56 pages

lecture

lecture

11 pages

lecture

lecture

9 pages

Lecture

Lecture

36 pages

Lecture

Lecture

37 pages

Exam

Exam

16 pages

Lecture

Lecture

10 pages

Lecture

Lecture

43 pages

Lecture

Lecture

8 pages

Lecture

Lecture

8 pages

Lecture

Lecture

36 pages

Lecture

Lecture

43 pages

Lecture

Lecture

12 pages

Lecture

Lecture

37 pages

Lecture

Lecture

6 pages

Lecture

Lecture

40 pages

coding

coding

2 pages

Exam

Exam

17 pages

Exam

Exam

14 pages

Lecture

Lecture

29 pages

Lecture

Lecture

34 pages

Exam

Exam

11 pages

Lecture

Lecture

9 pages

Lecture

Lecture

37 pages

Lecture

Lecture

36 pages

lecture

lecture

46 pages

Lecture

Lecture

33 pages

Lecture

Lecture

57 pages

Lecture

Lecture

32 pages

Lecture

Lecture

46 pages

Lecture

Lecture

40 pages

Lecture

Lecture

11 pages

Lecture

Lecture

6 pages

Lecture

Lecture

43 pages

Lecture

Lecture

12 pages

Lecture

Lecture

18 pages

Exam

Exam

10 pages

Lecture

Lecture

45 pages

Lecture

Lecture

37 pages

Exam

Exam

24 pages

class09

class09

21 pages

class22

class22

37 pages

class20

class20

30 pages

class27

class27

33 pages

class25

class25

21 pages

class04

class04

31 pages

Lecture

Lecture

59 pages

class01a

class01a

14 pages

class12

class12

45 pages

class29

class29

33 pages

Lecture

Lecture

39 pages

Lecture

Lecture

6 pages

class03

class03

34 pages

lecture

lecture

42 pages

Lecture

Lecture

40 pages

Lecture

Lecture

47 pages

Exam

Exam

19 pages

R06-B

R06-B

25 pages

class17

class17

37 pages

class25

class25

31 pages

Lecture

Lecture

15 pages

final-f06

final-f06

17 pages

Lecture

Lecture

9 pages

lecture

lecture

9 pages

Exam

Exam

15 pages

Lecture

Lecture

22 pages

class11

class11

45 pages

lecture

lecture

50 pages

Linking

Linking

37 pages

Lecture

Lecture

64 pages

Integers

Integers

40 pages

Exam

Exam

11 pages

Lecture

Lecture

37 pages

Lecture

Lecture

44 pages

Lecture

Lecture

37 pages

Lecture

Lecture

9 pages

Lecture

Lecture

37 pages

Lecture

Lecture

45 pages

Final

Final

25 pages

lecture

lecture

9 pages

Lecture

Lecture

30 pages

Lecture

Lecture

16 pages

Final

Final

17 pages

Lecture

Lecture

8 pages

Exam

Exam

11 pages

Lecture

Lecture

47 pages

Lecture

Lecture

9 pages

lecture

lecture

39 pages

Exam

Exam

11 pages

lecture

lecture

41 pages

lecture

lecture

37 pages

Lecture

Lecture

59 pages

Lecture

Lecture

45 pages

Exam 1

Exam 1

18 pages

Lecture

Lecture

41 pages

Lecture

Lecture

32 pages

Lecture

Lecture

30 pages

Lecture

Lecture

9 pages

Lecture

Lecture

9 pages

Lecture

Lecture

15 pages

Lecture

Lecture

11 pages

Lecture

Lecture

9 pages

Lecture

Lecture

34 pages

Lecture

Lecture

40 pages

Lecture

Lecture

4 pages

Lecture

Lecture

46 pages

Lecture

Lecture

8 pages

Lecture

Lecture

65 pages

Lecture

Lecture

38 pages

Lecture

Lecture

35 pages

Lecture

Lecture

8 pages

Lecture

Lecture

34 pages

Lecture

Lecture

8 pages

Exam

Exam

13 pages

Lecture

Lecture

43 pages

Lecture

Lecture

9 pages

Lecture

Lecture

12 pages

Lecture

Lecture

9 pages

Lecture

Lecture

34 pages

Lecture

Lecture

43 pages

Lecture

Lecture

7 pages

Lecture

Lecture

45 pages

Lecture

Lecture

24 pages

Lecture

Lecture

47 pages

Lecture

Lecture

12 pages

Lecture

Lecture

20 pages

Lecture

Lecture

9 pages

Exam

Exam

11 pages

Lecture

Lecture

52 pages

Lecture

Lecture

20 pages

Exam

Exam

11 pages

Lecture

Lecture

35 pages

Lecture

Lecture

47 pages

Lecture

Lecture

18 pages

Lecture

Lecture

30 pages

Lecture

Lecture

59 pages

Lecture

Lecture

37 pages

Lecture

Lecture

22 pages

Lecture

Lecture

35 pages

Exam

Exam

23 pages

Lecture

Lecture

9 pages

Lecture

Lecture

22 pages

class12

class12

32 pages

Lecture

Lecture

8 pages

Lecture

Lecture

39 pages

Lecture

Lecture

44 pages

Lecture

Lecture

38 pages

Lecture

Lecture

69 pages

Lecture

Lecture

41 pages

Lecture

Lecture

12 pages

Lecture

Lecture

52 pages

Lecture

Lecture

59 pages

Lecture

Lecture

39 pages

Lecture

Lecture

83 pages

Lecture

Lecture

59 pages

class01b

class01b

17 pages

Exam

Exam

21 pages

class07

class07

47 pages

Lecture

Lecture

11 pages

Odyssey

Odyssey

18 pages

multicore

multicore

66 pages

Lecture

Lecture

6 pages

lecture

lecture

41 pages

lecture

lecture

55 pages

lecture

lecture

52 pages

lecture

lecture

33 pages

lecture

lecture

46 pages

lecture

lecture

55 pages

lecture

lecture

17 pages

lecture

lecture

49 pages

Exam

Exam

17 pages

lecture

lecture

56 pages

Exam 2

Exam 2

16 pages

Exam 2

Exam 2

16 pages

Notes

Notes

37 pages

Lecture

Lecture

40 pages

Lecture

Lecture

36 pages

Lecture

Lecture

43 pages

Lecture

Lecture

25 pages

Exam

Exam

13 pages

Lecture

Lecture

32 pages

Lecture

Lecture

12 pages

Lecture

Lecture

58 pages

Lecture

Lecture

29 pages

Lecture

Lecture

59 pages

Lecture

Lecture

41 pages

Lecture

Lecture

50 pages

Exam

Exam

17 pages

Lecture

Lecture

29 pages

Lecture

Lecture

44 pages

Lecture

Lecture

41 pages

Lecture

Lecture

52 pages

Lecture

Lecture

40 pages

Lecture

Lecture

33 pages

lecture

lecture

10 pages

Lecture

Lecture

27 pages

Lecture

Lecture

29 pages

Lecture

Lecture

39 pages

Lecture

Lecture

9 pages

Lecture

Lecture

29 pages

Lecture

Lecture

8 pages

Lecture

Lecture

43 pages

Lecture

Lecture

43 pages

Lecture

Lecture

75 pages

Lecture

Lecture

55 pages

Exam

Exam

12 pages

Lecture

Lecture

43 pages

Lecture

Lecture

35 pages

lecture

lecture

36 pages

Exam

Exam

33 pages

lecture

lecture

56 pages

lecture

lecture

64 pages

lecture

lecture

8 pages

Exam

Exam

14 pages

Lecture

Lecture

43 pages

Lecture

Lecture

36 pages

lecture

lecture

56 pages

lecture

lecture

75 pages

lecture

lecture

36 pages

Lecture

Lecture

50 pages

Lecture

Lecture

45 pages

Lecture

Lecture

13 pages

Exam

Exam

23 pages

Lecture

Lecture

10 pages

Lecture

Lecture

48 pages

Lecture

Lecture

83 pages

lecture

lecture

57 pages

Lecture

Lecture

33 pages

Lecture

Lecture

39 pages

Lecture

Lecture

33 pages

lecture

lecture

54 pages

Lecture

Lecture

30 pages

Exam

Exam

13 pages

Lecture

Lecture

36 pages

Lecture

Lecture

40 pages

Exam

Exam

17 pages

Lecture

Lecture

9 pages

Exam

Exam

15 pages

Lecture

Lecture

44 pages

Lecture

Lecture

34 pages

Lecture

Lecture

24 pages

Lecture

Lecture

29 pages

class12

class12

43 pages

lecture

lecture

43 pages

class22

class22

22 pages

R06-B

R06-B

25 pages

class01b

class01b

19 pages

lecture

lecture

29 pages

lab1

lab1

8 pages

Caches

Caches

36 pages

lecture

lecture

55 pages

Lecture,

Lecture,

37 pages

Integers

Integers

40 pages

Linking

Linking

38 pages

lecture

lecture

45 pages

Lecture

Lecture

61 pages

Linking

Linking

33 pages

lecture

lecture

40 pages

lecture

lecture

40 pages

Lecture

Lecture

32 pages

lecture

lecture

48 pages

lecture

lecture

44 pages

Exam

Exam

11 pages

Lecture

Lecture

31 pages

Lecture

Lecture

46 pages

Lecture

Lecture

40 pages

Lecture

Lecture

40 pages

Exam

Exam

12 pages

Lecture

Lecture

42 pages

Lecture

Lecture

36 pages

Lecture

Lecture

45 pages

Lecture

Lecture

41 pages

Lecture

Lecture

13 pages

Lecture

Lecture

35 pages

Lecture

Lecture

20 pages

Final

Final

19 pages

Lecture

Lecture

33 pages

Lecture

Lecture

50 pages

Lecture

Lecture

33 pages

Lecture

Lecture

27 pages

Lecture

Lecture

6 pages

Exam

Exam

15 pages

Lecture

Lecture

24 pages

Lecture

Lecture

23 pages

Lecture

Lecture

43 pages

Lecture

Lecture

32 pages

Lecture

Lecture

52 pages

Lecture

Lecture

37 pages

Lecture

Lecture

36 pages

Lecture

Lecture

34 pages

Lecture

Lecture

40 pages

Lecture

Lecture

15 pages

lecture

lecture

21 pages

Lecture

Lecture

58 pages

Lecture

Lecture

49 pages

Lecture

Lecture

36 pages

Lecture

Lecture

11 pages

Lecture

Lecture

12 pages

Lecture

Lecture

58 pages

Lecture

Lecture

33 pages

Exam

Exam

15 pages

Lecture

Lecture

35 pages

Lecture

Lecture

10 pages

Lecture

Lecture

25 pages

Lecture

Lecture

31 pages

Lecture

Lecture

24 pages

Lecture

Lecture

34 pages

Lecture

Lecture

50 pages

lecture

lecture

35 pages

Lecture

Lecture

11 pages

Lecture

Lecture

39 pages

Lecture

Lecture

45 pages

Lecture

Lecture

41 pages

exam1-f05

exam1-f05

11 pages

Lecture

Lecture

4 pages

Lecture

Lecture

17 pages

Exam

Exam

17 pages

malloc()

malloc()

12 pages

Lecture

Lecture

57 pages

Lecture

Lecture

30 pages

Lecture

Lecture

30 pages

Lecture

Lecture

47 pages

Lecture

Lecture

33 pages

Exam

Exam

12 pages

Lecture

Lecture

43 pages

Lectures

Lectures

33 pages

Lecture

Lecture

36 pages

lecture

lecture

33 pages

Exam

Exam

14 pages

Lecture

Lecture

43 pages

Lecture

Lecture

25 pages

Load more
Download Cache Performance
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Cache Performance and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Cache Performance and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?