CS 152 Computer Architecture and Engineering
Lecture 7 - Memory Hierarchy-II

Krste Asanovic
Electrical Engineering and Computer Sciences
University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152
2/14/2008, CS152 Spring '08

Last time in Lecture 6
• Dynamic RAM (DRAM) is the main form of main memory storage in use today
– Holds values on small capacitors, needs refreshing (hence "dynamic")
– Slow multi-step access: precharge, read row, read column
• Static RAM (SRAM) is faster but more expensive
– Used to build on-chip memory for caches
• Caches exploit two forms of predictability in memory reference streams
– Temporal locality: the same location is likely to be accessed again soon
– Spatial locality: neighboring locations are likely to be accessed soon
• A cache holds a small set of values in fast memory (SRAM) close to the processor
– Need a search scheme to find values in the cache, and a replacement policy to make space for newly accessed locations

Relative Memory Cell Sizes
[Figure: relative cell sizes of DRAM on a memory chip vs. on-chip SRAM in a logic chip. Foss, "Implementing Application-Specific Memory", ISSCC 1996]

Placement Policy
[Figure: a 32-block memory and an 8-block cache. Memory block 12 can be placed:
• Fully associative: anywhere in the cache
• (2-way) set associative: anywhere in set 0 (12 mod 4)
• Direct mapped: only into block 4 (12 mod 8)]

Direct-Mapped Cache
[Figure: the address is split into tag (t bits), index (k bits), and block offset (b bits). The index selects one of 2^k cache lines; the stored tag and valid bit are compared against the address tag to generate HIT, and the block offset selects the data word or byte within the block.]

2-Way Set-Associative Cache
[Figure: two tag/data arrays are indexed in parallel by the k-bit index; both stored tags are compared against the address tag, and a hit in either way selects the data word or byte.]

Fully Associative Cache
[Figure: there is no index field; the address tag is compared in parallel against the tag of every valid block, and the block offset selects the data word or byte from the matching block.]

Replacement Policy
In an associative cache, which block from a set should be evicted when the set becomes full?
• Random
• Least Recently Used (LRU)
– LRU cache state must be updated on every access
– True implementation only feasible for small sets (2-way)
– Pseudo-LRU binary tree often used for 4- to 8-way
• First In, First Out (FIFO), a.k.a. Round-Robin
– Used in highly associative caches
• Not Least Recently Used (NLRU)
– FIFO with an exception for the most recently used block or blocks
This is a second-order effect. Why? Replacement only happens on misses.

Block Size and Spatial Locality
A block is the unit of transfer between the cache and memory.
[Figure: example 4-word block (b = 2): Word0 Word1 Word2 Word3. The CPU address is split into a block address (32 - b bits, containing the tag) and a b-bit offset; 2^b = block size, a.k.a. line size (in bytes).]
Larger block size has distinct hardware advantages:
• Less tag overhead
• Exploit fast burst transfers from DRAM
• Exploit fast burst transfers over wide busses
What are the disadvantages of increasing block size?
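To make the tag/index/offset split in the direct-mapped cache slide concrete, here is a minimal C sketch of a direct-mapped lookup. The geometry (a 64-line cache with 32-byte blocks, so b = 5 and k = 6), the function names, and the test addresses are illustrative choices, not taken from the lecture.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative geometry: 2^k = 64 lines of 2^b = 32-byte blocks,
       so b = 5 offset bits, k = 6 index bits, t = 32 - k - b tag bits. */
    #define OFFSET_BITS 5
    #define INDEX_BITS  6
    #define NUM_LINES   (1u << INDEX_BITS)

    typedef struct {
        bool     valid;
        uint32_t tag;
        /* data block omitted for brevity */
    } line_t;

    static line_t cache[NUM_LINES];

    /* Returns true on a hit; on a miss, installs the new tag (refill data not shown). */
    bool access(uint32_t addr)
    {
        uint32_t index = (addr >> OFFSET_BITS) & (NUM_LINES - 1);
        uint32_t tag   = addr >> (OFFSET_BITS + INDEX_BITS);

        if (cache[index].valid && cache[index].tag == tag)
            return true;                 /* tag match: HIT */

        cache[index].valid = true;       /* miss: refill this line */
        cache[index].tag   = tag;
        return false;
    }

    int main(void)
    {
        printf("%d\n", access(0x1234)); /* 0: compulsory miss */
        printf("%d\n", access(0x1238)); /* 1: same 32-byte block, spatial locality */
        return 0;
    }

The only change for a set-associative cache is that the index selects a set of lines instead of a single line, and all tags in that set are compared in parallel.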
CPU-Cache Interaction (5-stage pipeline)
[Figure: the 5-stage pipeline with a primary instruction cache feeding IR in the fetch stage and a primary data cache accessed in the memory stage (MD1/MD2); cache refill data comes from lower levels of the memory hierarchy, and the entire CPU is stalled on a data cache miss.]
What about an instruction miss, or writes to the i-stream?

Improving Cache Performance
Average memory access time = Hit time + Miss rate x Miss penalty
To improve performance:
• Reduce the hit time
• Reduce the miss rate
• Reduce the miss penalty
What is the simplest design strategy?

Causes for Cache Misses
• Compulsory: first reference to a block, a.k.a. cold start misses
– Misses that would occur even with an infinite cache
• Capacity: the cache is too small to hold all data needed by the program
– Misses that would occur even under a perfect replacement policy
• Conflict: misses that occur because of collisions due to the block-placement strategy
– Misses that would not occur with full associativity

Effect of Cache Parameters on Performance
• Larger cache size
• Higher associativity
• Larger block size

Write Policy Choices
• Cache hit:
– Write through: write both cache and memory
» Generally higher traffic, but simplifies cache coherence
– Write back: write the cache only (memory is written only when the entry is evicted)
» A dirty bit per block can further reduce the traffic
• Cache miss:
– No write allocate: only write to main memory
– Write allocate (a.k.a. fetch on write): fetch the block into the cache
• Common combinations:
– Write through and no write allocate
– Write back with write allocate

Write Performance
[Figure: direct-mapped cache write path; same tag/index/offset split and tag comparison as the read path, with a write enable (WE) on the data array.]

Reducing Write Hit Time
Problem: writes take two cycles in the memory stage, one cycle for the tag check plus one cycle for the data write if it hits.
Solutions:
• Design a data RAM that can perform a read and a write in one cycle, and restore the old value after a tag miss
• Fully-associative (CAM tag) caches: the word line is only enabled on a hit
• Pipelined writes: hold the write data for a store in a single buffer ahead of the cache, and write the cache data during the next store's tag check

Pipelining Cache Writes
[Figure: tag and data arrays; the address and store data arrive from the CPU, with a delayed write address/data buffer ahead of the data array; tag comparators produce Hit? and load data is returned to the CPU.]
Data from a store hit is written into the data portion of the cache during the tag access of the subsequent store.

CS152 Administrivia
• Krste: no office hours Monday 2/18 (President's Day holiday)
– Email for an alternate time
• Henry: office hours, 511 Soda
– None on Monday due to the holiday
– 2:00-3:00 PM Fridays
• In-class quiz dates
– Q1: Tuesday, February 19 (ISAs, microcode, simple pipelining)
» Material covered: Lectures 1-5, PS1, Lab 1
• We're stuck in this room for the semester (nothing else open)

Write Buffer to Reduce Read Miss Penalty
[Figure: CPU and register file with a data cache and a write buffer in front of a unified L2 cache; evicted dirty …]
The processor is not stalled on writes, and read misses can go ahead of writes to main memory.
Problem: the write buffer may hold the updated value of a location needed by a read miss.
Simple scheme: on a read miss, wait for the write buffer to go empty.
Faster scheme: check the write buffer addresses against the read miss address; if there is no match, allow the read miss to go ahead of the writes; otherwise, return the value in the write buffer.
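The faster scheme on the write-buffer slide amounts to an associative search of the buffered write addresses on every read miss. Below is a rough C sketch under invented assumptions (a four-entry buffer, block-aligned addresses, made-up names); it is only meant to show the address check, not any real controller.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define WB_ENTRIES 4   /* illustrative write-buffer depth */

    typedef struct {
        bool     valid;
        uint32_t addr;     /* address of the queued write */
        uint32_t data;
    } wb_entry_t;

    static wb_entry_t write_buffer[WB_ENTRIES];

    /* On a read miss: if any queued write matches the miss address, forward
       its value; otherwise the miss may go ahead of the buffered writes. */
    bool read_miss_check(uint32_t miss_addr, uint32_t *data_out)
    {
        for (int i = 0; i < WB_ENTRIES; i++) {
            if (write_buffer[i].valid && write_buffer[i].addr == miss_addr) {
                *data_out = write_buffer[i].data;   /* forward from buffer */
                return true;
            }
        }
        return false;   /* no match: read miss bypasses the queued writes */
    }

    int main(void)
    {
        /* Queue one write, then simulate a read miss to the same address. */
        write_buffer[0] = (wb_entry_t){ .valid = true, .addr = 0x100, .data = 42 };
        uint32_t v;
        if (read_miss_check(0x100, &v))
            printf("forwarded %u from write buffer\n", v);
        return 0;
    }

In hardware the search is done by comparators on all buffer entries in parallel rather than by a loop.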

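Tying back to the "Improving Cache Performance" slide, here is a quick worked use of the average-memory-access-time formula. The hit time, miss rate, and miss penalty are invented numbers for illustration only.

    #include <stdio.h>

    int main(void)
    {
        /* AMAT = hit time + miss rate * miss penalty (illustrative numbers) */
        double hit_time     = 1.0;    /* cycles */
        double miss_rate    = 0.05;   /* 5% of accesses miss */
        double miss_penalty = 40.0;   /* cycles to refill from the next level */

        double amat = hit_time + miss_rate * miss_penalty;
        printf("AMAT = %.1f cycles\n", amat);   /* 1 + 0.05 * 40 = 3.0 */
        return 0;
    }

The formula makes the three improvement levers on that slide explicit: any of the three terms can be attacked independently.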

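Referring back to the "Replacement Policy" slide, the pseudo-LRU binary tree mentioned there can be kept in three bits per 4-way set. The encoding below is one plausible sketch, not necessarily what any particular design uses.

    #include <stdio.h>
    #include <stdint.h>

    /* Tree pseudo-LRU for one 4-way set, using 3 bits:
       bit 0: root, 0 = victim in the left pair (ways 0/1), 1 = right pair (2/3)
       bit 1: within the left pair, 0 = way 0 is the victim, 1 = way 1
       bit 2: within the right pair, 0 = way 2 is the victim, 1 = way 3 */
    static uint8_t plru;   /* state for a single set, initially 0 */

    /* On an access to 'way', flip the tree bits on its path to point away from it. */
    void plru_touch(int way)
    {
        if (way < 2) {
            plru |= 1u;                                      /* root points right */
            plru = (way == 0) ? (plru | 2u) : (plru & ~2u);  /* point at other left way */
        } else {
            plru &= ~1u;                                     /* root points left */
            plru = (way == 2) ? (plru | 4u) : (plru & ~4u);  /* point at other right way */
        }
    }

    /* Follow the tree bits to pick a victim way. */
    int plru_victim(void)
    {
        if ((plru & 1u) == 0)
            return (plru & 2u) ? 1 : 0;
        else
            return (plru & 4u) ? 3 : 2;
    }

    int main(void)
    {
        plru_touch(0); plru_touch(2); plru_touch(1); plru_touch(3);
        printf("victim = %d\n", plru_victim());   /* 0: the least recently touched way */
        return 0;
    }

Each access updates only the bits on the path to the touched way, which is why the tree approximation scales to 4- to 8-way sets more easily than true LRU state.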