Berkeley COMPSCI 152 - Lecture 14 – Cache II

CS 152 Computer Architecture and Engineering
Lecture 14 – Cache II
2006-10-17
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/
TAs: Udam Saini and Jue Sun
And also, final project introduction
CS 152 L14: Cache II, UC Regents Fall 2006 © UCB

Last Time: Locality encourages caching
[Figure: memory address versus time, one dot per access, with regions labeled Spatial Locality, Temporal Locality, and Bad.]
Source: Donald J. Hatfield, Jeanette Gerald: Program Restructuring for Virtual Memory. IBM Systems Journal 10(3): 168-192 (1971)

Today ... Caches Reloaded
- Cache misses and performance: how do we size the cache?
- Practical cache design: a state machine and a controller.
- The cache-DRAM interface
- Write buffers and caches
- And also, final project introduction

Recall: Color-coding main memory
[Figure: main memory drawn as 32-byte blocks, numbered 0 through 2^27 - 1, with colors repeating every four blocks.]
Blocks of a certain color may only appear in one line of the cache.
32-bit memory address fields:
- Which block? 25 bits (bits 31-7)
- Color: 2 bits (bits 6-5). This is the cache index.
- Byte #: 5 bits (bits 4-0)

Recall: A Direct Mapped Cache
Address fields: Cache Tag (25 bits, bits 31-7), Index (bits 6-5), Byte Select (bits 4-0).
[Figure: the index selects one line of the Cache Tags and Cache Data arrays. Each line holds a valid bit, a tag, and Byte 31 ... Byte 1, Byte 0. A comparator (=) checks the stored tag against the address tag to produce Hit. Example field values shown: 0x01 and 0x00.]
Return bytes of the "hit" cache line.
PowerPC 970: 64K direct-mapped Level-1 I-cache

Recall: Set Associative Cache
Address fields: Cache Tag (26 bits), Index (2 bits), Byte Select (4 bits).
Cache block halved (16 bytes) to keep the number of cached bits constant.
[Figure: two copies of the Valid, Cache Tags, and Cache Data arrays side by side, each with its own comparator, producing Hit Left and Hit Right. Example field value shown: 0x01.]
Return bytes of the "hit" set member.
"N-way" set associative: N is the number of blocks for each color.
PowerPC 970: 32K 2-way set associative L1 D-cache

Cache Misses & Performance

Recall: Performance Equation
\[ \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]
[Table: Machine CPI by instruction class (Multiply, Other ALU, Load, Store, Branch); extracted values: 2, 2, 2, 1, 5.]
Earlier, CPI was computed from this table, which assumes a constant memory access time.
True CPI depends on the Average Memory Access Time (AMAT) for instructions and data.
AMAT = Hit Time + (Miss Rate x Miss Penalty)
Goal: Reduce AMAT
Beware! Improving one term may hurt other terms, and increase AMAT!
True CPI = Ideal CPI + Memory Stall Cycles. See Section 7.3, COD/3e for details.

One type of cache miss: Conflict Miss
N blocks of the same color are in use at once, but the cache can only hold M < N of them.
Solution: Increase M (Associativity)
[Figure: miss rate versus cache size (KB) for increasing associativity, up to fully-associative. Annotation: miss rate improvement equivalent to doubling cache size.]
AMAT = Hit Time + (Miss Rate x Miss Penalty)
If hit time increases, AMAT may go up!
Other Solutions
- Increase the number of cache lines (# blocks in cache). Q. Why does this help?
- Add a small "victim cache" that holds blocks recently removed from the cache. Q. Why does this help?

Other causes of cache misses ...
Capacity Misses: the cache cannot contain all blocks accessed by the program.
Solution: Increase the size of the cache.
Compulsory Misses: first access of a block by a program. Mostly unavoidable.
Solution: Prefetch blocks (via hardware, software).
Also "Coherency Misses": other processes update memory.
[Figures: miss rates (absolute) versus cache size (KB); miss rates (relative) versus cache size (KB).]

Thinking about cache miss types ...
Q. What kind of misses happen in a fully associative cache of infinite size?
A. Compulsory misses. Must bring each block into cache.
Q. In addition, what kind of misses happen in a finite-sized fully associative cache?
A. Capacity misses. Program may use more blocks than can fit in cache.
Q. In addition, what kind of misses happen in a set-associative or direct-mapped cache?
A. Conflict misses.
(All questions assume the replacement policy used is considered "optimal".)

Practical Cache Design

Cache Design: Datapath + Control
[Figure: Tags and Blocks arrays, each with Addr, Din, and Dout ports connected to the CPU and to lower level memory; a State Machine drives the Control signals.]
Datapath for performance, control for correctness.
Most design errors come from incorrect specification of state machine behavior!
Red text will highlight state machine requirements ...

Recall: State Machine Design ...
[Figure: three-state machine with outputs R Y G = 1 0 0, R Y G = 0 0 1, and R Y G = 0 1 0; each transition fires on Change == 1; Rst == 1 resets the machine.]
Cache controller state machines are like this, but with more states, and perhaps several connected machines ...

Issue #1: Control for CPU interface ....
[Figure: upper level memory (small, fast) holding Blk X, exchanging data to and from the processor; lower level memory (large, slow) holding Blk Y.]
For reads, your state machine must:
(1) sense REQ
(2) latch Addr
(3) create Wait
(4) put Data Out on the bus.
An example interface ... there are other possibilities.

Issue #2: Cache Block Replacement
After a cache read miss, if there are no empty cache blocks, which block should be removed from the cache?
- A randomly chosen block? Easy to implement, how well does it work?
- The Least Recently Used (LRU) block? Appealing, but hard to implement.

Miss Rate for 2-way Set Associative Cache
Size      Random    LRU
16 KB     5.7%      5.2%
64 KB     2.0%      1.9%
256 KB    1.17%     1.15%

Also, try other LRU approximations.
Part of your state machine decides which block to replace.

Issue #3: High performance block fetch
[Figure: DRAM array with 4096 rows, selected by a 1-of-4096 decoder from a 12-bit row address input, and 2048 columns, each column 4 bits deep: 33,554,432 usable bits (tester found good bits in bigger array). 8192 bits are delivered by the sense amps; the requested bits are selected and sent off the chip.]
With proper memory layout, one row access delivers an entire cache block to the sense amps.
Two state machine challenges:
(1) Bring in the word requested by the CPU with lowest latency
(2) Bring in the rest of the cache block ASAP

Issue #3 (continued): DRAM Burst Reads
[Datasheet excerpt: 256Mb: x4, x8, x16 SDRAM, Micron Technology, Inc., 256M SDRAM_G.p65, Rev. G; Pub. 9/03, ©2003. Surviving text: "valid, where x equals the CAS latency minu..."]
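The direct-mapped recap and the AMAT equation can be tied together in a small simulation. The following is a minimal sketch, not the lecture's own code: it assumes the 25/2/5 address split from the recap slide (so only four cache lines), tracks tags only, and the names `split_address`, `DirectMappedCache`, and `amat` are hypothetical. Any hit time and miss penalty passed to `amat` are illustrative numbers chosen by the caller, not figures from the slides.

```python
BYTE_BITS = 5    # byte select: 32-byte blocks, as on the recap slide
INDEX_BITS = 2   # cache index ("color"): 4 lines, as on the recap slide


def split_address(addr):
    """Split a 32-bit address into (tag, index, byte) fields."""
    byte = addr & ((1 << BYTE_BITS) - 1)
    index = (addr >> BYTE_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BYTE_BITS + INDEX_BITS)
    return tag, index, byte


class DirectMappedCache:
    """Toy direct-mapped cache that stores tags only (no data)."""

    def __init__(self):
        # One tag slot per line; None plays the role of a cleared valid bit.
        self.tags = [None] * (1 << INDEX_BITS)
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Return True on a hit; on a miss, install the block and return False."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag   # the only candidate line is overwritten
        self.misses += 1
        return False

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = Hit Time + (Miss Rate x Miss Penalty), all times in cycles."""
    return hit_time + miss_rate * miss_penalty
```

Two addresses with the same index but different tags, such as 0x000 and 0x080, evict each other on every access in this model, which is the conflict-miss behavior described on the Conflict Miss slide. Feeding the measured `miss_rate()` into `amat` with assumed timing values shows how a higher miss rate raises the average access time.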
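The replacement question from Issue #2 (random versus LRU) can be explored with a software model. This is a rough sketch under stated assumptions: the class name `SetAssociativeCache` and its parameters are hypothetical, block numbers stand in for addresses, and LRU order is kept in a Python list, which sidesteps the hardware cost that makes true LRU hard to implement. It is not expected to reproduce the miss rates in the slide's table, which come from measurements not described in the text.

```python
import random


class SetAssociativeCache:
    """Toy N-way set-associative cache that tracks block tags only."""

    def __init__(self, num_sets, ways, policy="lru", seed=0):
        self.num_sets = num_sets
        self.ways = ways
        self.policy = policy              # "lru" or "random"
        self.rng = random.Random(seed)    # seeded so runs are repeatable
        # Each set lists its tags from least to most recently used.
        self.sets = [[] for _ in range(num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, block):
        """Access a block number; return True on a hit, False on a miss."""
        self.accesses += 1
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            # Hit: move the tag to the most-recently-used position.
            lines.remove(tag)
            lines.append(tag)
            return True
        self.misses += 1
        if len(lines) == self.ways:
            # Set is full: pick a victim according to the policy.
            if self.policy == "lru":
                victim = 0                # front of the list is least recent
            else:
                victim = self.rng.randrange(self.ways)
            lines.pop(victim)
        lines.append(tag)
        return False
```

One pattern worth trying is a cyclic walk over three blocks that map to the same 2-way set (for example blocks 0, 1, 2, 0, 1, 2 with `num_sets=1`). Under LRU every access in that walk misses, because the block about to be reused is always the one evicted, while random replacement may keep some of them by chance.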