Berkeley COMPSCI 152 - Lecture 14 – Cache II


CS 152 Computer Architecture and Engineering
Lecture 14 – Cache II
2006-10-17
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/
TAs: Udam Saini and Jue Sun
And also, final project introduction
CS 152 L14: Cache II, UC Regents Fall 2006 © UCB

Last Time: Locality encourages caching
[Figure: memory address (one dot per access) versus time; labeled regions: Spatial Locality, Temporal Locality, Bad]
Donald J. Hatfield, Jeanette Gerald: Program Restructuring for Virtual Memory. IBM Systems Journal 10(3): 168-192 (1971)

Today ... Caches Reloaded
Cache misses and performance: how do we size the cache?
Practical cache design: a state machine and a controller.
The cache-DRAM interface
Write buffers and caches
And also, final project introduction

Recall: Color-coding main memory
Main memory is divided into 32-byte blocks, numbered 0, 1, 2, ..., 2^27 - 1.
Blocks of a certain color may only appear in one line of the cache.
32-bit Memory Address:
  bits 31-7   Which block?   25 bits
  bits 6-5    Color          2 bits (the cache index)
  bits 4-0    Byte #         5 bits

Recall: A Direct Mapped Cache
Address fields: Cache Tag (25 bits, bits 31-7), Index (bits 6-5), Byte Select (bits 4-0).
[Figure: each cache line holds a Valid Bit, a Cache Tag, and Cache Data (Byte 31 ... Byte 1, Byte 0); the stored tag is compared (=) with the address tag to produce Hit; example values shown: 0x01 and 0x00]
Return bytes of "hit" cache line.
PowerPC 970: 64K direct-mapped Level-1 I-cache

Recall: Set Associative Cache
Address fields: Cache Tag (26 bits), Index (2 bits), Byte Select (4 bits).
Cache block halved (to 16 bytes) to keep # cached bits constant.
[Figure: left and right set members, each with Valid, Cache Tags, and Cache Data (16-byte Cache Blocks); two comparators (=) produce Hit Left and Hit Right; example value shown: 0x01]
Return bytes of "hit" set member.
"N-way" set associative: N is number of blocks for each color.
PowerPC 970: 32K 2-way set associative L1 D-cache

Cache Misses & Performance

Recall: Performance Equation
Seconds/Program = (Instructions/Program) x (Cycles/Instruction) x (Seconds/Cycle)
Cycles/Instruction is the Machine CPI. Earlier, computed from ... Assumes a constant memory access time.
True CPI depends on the Average Memory Access Time (AMAT) for Inst & Data.
AMAT = Hit Time + (Miss Rate x Miss Penalty)
Goal: Reduce AMAT
Beware! Improving one term may hurt other terms, and increase AMAT!
True CPI = Ideal CPI + Memory Stall Cycles per instruction. See Section 7.3, COD/3e for details.

One type of cache miss: Conflict Miss
N blocks of same color in use at once, but cache can only hold M < N of them.
Solution: Increase M (Associativity).
[Figure: Miss Rate versus Cache Size (KB), with curves up to fully-associative; annotation: miss rate improvement equivalent to doubling cache size]
Other Solutions:
Increase number of cache lines (# blocks in cache). Q. Why does this help?
Add a small "victim cache" that holds blocks recently removed from the cache. Q. Why does this help?
AMAT = Hit Time + (Miss Rate x Miss Penalty)
If hit time increases, AMAT may go up!

Other causes of cache misses ...
Capacity Misses: Cache cannot contain all blocks accessed by the program.
Solution: Increase size of the cache.
Compulsory Misses: First access of a block by a program. Mostly unavoidable.
Solution: Prefetch blocks (via hardware, software).
[Figures: Miss rates (absolute) versus Cache Size (KB); Miss rates (relative) versus Cache Size (KB)]
Also "Coherency Misses": other processes update memory.

Thinking about cache miss types ...
Q. What kind of misses happen in a fully associative cache of infinite size?
A. Compulsory misses. Must bring each block into cache.
Q. In addition, what kind of misses happen in a finite-sized fully associative cache?
A. Capacity misses. Program may use more blocks than can fit in cache.
Q. In addition, what kind of misses happen in a set-associative or direct-mapped cache?
A. Conflict misses.
(All questions assume the replacement policy used is considered "optimal".)

Practical Cache Design

Cache Design: Datapath + Control
[Figure: Tags and Blocks arrays, each with Addr, Din, and Dout ports, connected To CPU and To Lower Level Memory; a State Machine drives the Control signals]
Datapath for performance, control for correctness.
Most design errors come from incorrect specification of state machine behavior!
Red text will highlight state machine requirements ...

Recall: State Machine Design ...
[Figure: three-state machine with outputs R Y G = 1 0 0, R Y G = 0 0 1, and R Y G = 0 1 0; each transition taken when Change == 1; Rst == 1 resets the machine]
Cache controller state machines are like this, but with more states, and perhaps several connected machines ...

Issue #1: Control for CPU interface ....
[Figure: small, fast cache between the CPU and large, slow memory; signals From CPU and To CPU]
For reads, your state machine must:
(1) sense REQ
(2) latch Addr
(3) create Wait
(4) put Data Out on the bus.
An example interface ... there are other possibilities.

Issue #2: Cache Block Replacement
After a cache read miss, if there are no empty cache blocks, which block should be removed from the cache?
A randomly chosen block? Easy to implement, how well does it work?
The Least Recently Used (LRU) block? Appealing, but hard to implement.

Miss Rate for 2-way Set Associative Cache
Size      Random    LRU
16 KB     5.7%      5.2%
64 KB     2.0%      1.9%
256 KB    1.17%     1.15%

Also, try other LRU approx.
Part of your state machine decides which block to replace.

Issue #3: High performance block fetch
[Figure: DRAM array with a 12-bit row address input and a 1 of 4096 decoder; 4096 rows; 2048 columns, each column 4 bits deep; 33,554,432 usable bits (tester found good bits in bigger array); 8192 bits delivered by sense amps; select requested bits, send off the chip]
With proper memory layout, one row access delivers entire cache block to the sense amp. Two state
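
The address split recalled in the direct-mapped and set-associative slides can be sketched in a few lines. This is only an illustration: the function name `split_address` and the sample address 0xA3 are assumptions made for the example and do not come from the lecture. The field widths are the ones the slides give (25/2/5 bits for the direct-mapped cache, 26/2/4 bits for the 2-way set-associative cache).

```python
def split_address(addr, index_bits=2, offset_bits=5):
    """Split a 32-bit address into (tag, index, byte_select).

    Defaults follow the direct-mapped layout in the slides:
    25-bit tag, 2-bit index, 5-bit byte select.
    The function name and defaults are hypothetical, chosen for this sketch.
    """
    addr &= 0xFFFFFFFF                                   # keep 32 bits
    byte_select = addr & ((1 << offset_bits) - 1)        # low bits pick the byte
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)  # next bits pick the line
    tag = addr >> (offset_bits + index_bits)             # remaining high bits
    return tag, index, byte_select


# Direct-mapped layout: 32-byte blocks, 4 lines.
print(split_address(0xA3))                               # (1, 1, 3)

# 2-way set-associative layout: 16-byte blocks, so one more tag bit.
print(split_address(0xA3, index_bits=2, offset_bits=4))  # (2, 2, 3)
```

Halving the block size moves one bit from the byte select into the tag, which is why the tag grows from 25 to 26 bits while the index stays at 2 bits.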

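
The AMAT formula and its "Beware!" warning can be made concrete with a small calculation. The numbers below (a 1-cycle hit, 5% miss rate, 40-cycle miss penalty, and a hypothetical more-associative design with a slower hit) are assumptions invented for illustration, not measurements from the lecture.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time = Hit Time + (Miss Rate x Miss Penalty)."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical baseline: 1-cycle hit, 5% miss rate, 40-cycle miss penalty.
baseline = amat(hit_time=1.0, miss_rate=0.05, miss_penalty=40.0)

# Hypothetical variant: higher associativity lowers the miss rate to 4%,
# but the extra comparison logic stretches the hit time to 1.5 cycles.
more_associative = amat(hit_time=1.5, miss_rate=0.04, miss_penalty=40.0)

print(f"baseline AMAT:         {baseline:.2f} cycles")
print(f"more associative AMAT: {more_associative:.2f} cycles")
print("variant is slower" if more_associative > baseline else "variant is faster")
```

Under these assumed numbers the lower miss rate does not pay for the longer hit time, which is the trade-off the slide warns about.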

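
The replacement question in Issue #2 can be explored with a toy simulator. This is a minimal sketch under stated assumptions: the function `count_misses`, the block-number trace, and the modulo set mapping are hypothetical choices for the example, and it makes no attempt to reproduce the miss rates in the table.

```python
import random


def count_misses(trace, num_sets, policy="lru", seed=0):
    """Count misses in a 2-way set-associative cache for a trace of block numbers.

    policy is "lru" or "random". The set is chosen as block % num_sets,
    an assumption made for this sketch.
    """
    rng = random.Random(seed)               # seeded so runs are repeatable
    sets = [[] for _ in range(num_sets)]    # each list: least to most recently used
    misses = 0
    for block in trace:
        ways = sets[block % num_sets]
        if block in ways:
            if policy == "lru":
                ways.remove(block)          # move to most-recently-used position
                ways.append(block)
            continue
        misses += 1
        if len(ways) == 2:                  # set is full: pick a victim
            victim = 0 if policy == "lru" else rng.randrange(2)
            ways.pop(victim)
        ways.append(block)
    return misses


# Blocks 0, 2, and 4 all map to set 0 when num_sets is 2.
trace = [0, 2, 0, 4, 0, 2]
print("LRU misses:   ", count_misses(trace, num_sets=2, policy="lru"))
print("random misses:", count_misses(trace, num_sets=2, policy="random"))
```

The LRU branch needs ordering state per set that must be updated on every hit, while the random branch needs none, which mirrors the slide's point that LRU is appealing but harder to implement.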