Berkeley COMPSCI 152 - Memory Hierarchy - II

CS 152 Computer Architecture and Engineering
Lecture 7 - Memory Hierarchy II
Krste Asanovic
Electrical Engineering and Computer Sciences, University of California at Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.eecs.berkeley.edu/~cs152
2/14/2008, CS152-Spring'08

Outline: Last time in Lecture 6; Relative Memory Cell Sizes; Placement Policy; Direct-Mapped Cache; 2-Way Set-Associative Cache; Fully Associative Cache; Replacement Policy; Block Size and Spatial Locality; CPU-Cache Interaction (5-stage pipeline); Improving Cache Performance; Causes for Cache Misses; Effect of Cache Parameters on Performance; Write Policy Choices; Write Performance; Reducing Write Hit Time; Pipelining Cache Writes; CS152 Administrivia; Write Buffer to Reduce Read Miss Penalty; Serial-versus-Parallel Cache and Memory Access; Block-level Optimizations; Set-Associative RAM-Tag Cache; Acknowledgements

Last time in Lecture 6
• Dynamic RAM (DRAM) is the main form of main memory storage in use today
  - Holds values on small capacitors that need refreshing (hence "dynamic")
  - Slow multi-step access: precharge, read row, read column
• Static RAM (SRAM) is faster but more expensive
  - Used to build on-chip memory for caches
• Caches exploit two forms of predictability in memory reference streams
  - Temporal locality: the same location is likely to be accessed again soon
  - Spatial locality: neighboring locations are likely to be accessed soon
• A cache holds a small set of values in fast memory (SRAM) close to the processor
  - Needs a search scheme to find values in the cache, and a replacement policy to make space for newly accessed locations

Relative Memory Cell Sizes
[Figure: die photos comparing DRAM on a memory chip with on-chip SRAM in a logic chip]
[Foss, "Implementing Application-Specific Memory", ISSCC 1996]

Placement Policy
Where can memory block 12 be placed in an 8-block cache?
• Fully associative: anywhere
• 2-way set-associative (4 sets): anywhere in set 0 (12 mod 4)
• Direct-mapped: only into block 4 (12 mod 8)

Direct-Mapped Cache
[Figure: the address is split into tag (t bits), index (k bits), and block offset (b bits); the index selects one of 2^k lines, the stored tag is compared with the address tag, and a match with the valid (V) bit set signals HIT and selects the data word or byte]

2-Way Set-Associative Cache
[Figure: two tag/data arrays are indexed in parallel; two comparators check the address tag against both ways, and either match signals HIT and steers the data word or byte]

Fully Associative Cache
[Figure: the address has only tag (t bits) and block offset (b bits), no index; every line's tag is compared against the address tag in parallel]

Replacement Policy
In an associative cache, which block from a set should be evicted when the set becomes full?
• Random
• Least Recently Used (LRU)
  - LRU cache state must be updated on every access
  - True implementation only feasible for small sets (2-way)
  - Pseudo-LRU binary tree often used for 4-8 way
• First In, First Out (FIFO), a.k.a. Round-Robin
  - Used in highly associative caches
• Not Least Recently Used (NLRU)
  - FIFO with an exception for the most recently used block or blocks
This is a second-order effect. Why? Replacement only happens on misses.

Block Size and Spatial Locality
The block is the unit of transfer between the cache and memory. The CPU address is split into a block address (32-b bits) and an offset (b bits), where 2^b is the block size, a.k.a. line size; e.g. a 4-word block with the offset counted in words has b = 2.
Larger block size has distinct hardware advantages:
• less tag overhead
• exploits fast burst transfers from DRAM
• exploits fast burst transfers over wide buses
What are the disadvantages of increasing block size? Fewer blocks => more conflicts; can waste bandwidth, since unused parts of a large block are still transferred.
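The address split and the placement rules above can be sketched in a few lines of code. This is a minimal illustration, not from the lecture; the function names are hypothetical, and the example numbers follow the slides' 8-block placement example:

```python
def split_address(addr, k, b):
    """Split an address into (tag, index, offset) fields, given k index
    bits and b block-offset bits, as in the direct-mapped cache figure."""
    offset = addr & ((1 << b) - 1)
    index = (addr >> b) & ((1 << k) - 1)
    tag = addr >> (k + b)
    return tag, index, offset

def placement(block_number, num_blocks, ways):
    """Set index that a memory block maps to in a cache with num_blocks
    frames and the given associativity. ways == 1 is direct-mapped;
    ways == num_blocks is fully associative (one set, block goes anywhere)."""
    num_sets = num_blocks // ways
    return block_number % num_sets

# The slides' example: where can memory block 12 go in an 8-block cache?
print(placement(12, 8, 1))  # direct-mapped: 12 mod 8 = 4
print(placement(12, 8, 2))  # 2-way set-associative: 12 mod 4 = 0
print(placement(12, 8, 8))  # fully associative: a single set, index 0
```

Note that higher associativity shrinks the index (fewer sets, so fewer k bits) and grows the tag, which is why the fully associative figure has no index field at all.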
CPU-Cache Interaction (5-stage pipeline)
[Figure: classic 5-stage pipeline with a primary instruction cache feeding the IR in the fetch stage and a primary data cache accessed in the memory stage; each cache produces a hit? signal, refill data arrives from lower levels of the memory hierarchy, and the data cache connects to the memory controller]
Stall the entire CPU on a data cache miss. What about an instruction miss, or writes to the instruction stream?

Improving Cache Performance
Average memory access time = Hit time + Miss rate x Miss penalty
To improve performance:
• reduce the hit time
• reduce the miss rate
• reduce the miss penalty
What is the simplest design strategy? Build the biggest cache that doesn't increase hit time past 1-2 cycles (approx. 8-32 KB in modern technology). [Design issues are more complex with out-of-order superscalar processors.]

Causes for Cache Misses
• Compulsory: first reference to a block, a.k.a. cold-start misses
  - misses that would occur even with an infinite cache
• Capacity: the cache is too small to hold all data needed by the program
  - misses that would occur even under a perfect replacement policy
• Conflict: misses that occur because of collisions due to the block-placement strategy
  - misses that would not occur with full associativity

Effect of Cache Parameters on Performance
• Larger cache size
  + reduces capacity and conflict misses
  - hit time will increase
• Higher associativity
  + reduces conflict misses
  - may increase hit time
• Larger block size
  + reduces compulsory and capacity (reload) misses
  - increases conflict misses and miss penalty

Write Policy Choices
• Cache hit:
  - write-through: write both cache and memory
    » generally higher traffic, but simplifies cache coherence
  - write-back: write the cache only (memory is written only when the entry is evicted)
    » a dirty bit per block can further reduce the traffic
• Cache miss:
  - no-write-allocate: only write to main memory
  - write-allocate (a.k.a. fetch-on-write): fetch the block into the cache
• Common combinations:
  - write-through with no-write-allocate
  - write-back with write-allocate

Write Performance
[Figure: the same tag/index/offset datapath as the direct-mapped read, with a write-enable (WE) on the data RAM gated by the tag-match HIT signal]

Reducing Write Hit Time
Problem: writes take two cycles in the memory stage, one cycle for the tag check plus one cycle for the data write if hit.
Solutions:
• Design a data RAM that can perform read and write in one cycle, restoring the old value after a tag miss
• Fully associative (CAM-tag) caches: the word line is only enabled on a hit
• Pipelined writes: hold the write data for a store in a single buffer ahead of the cache, writing the cache data in a later cycle
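The average-memory-access-time formula above is easy to evaluate numerically, and it makes the cache-parameter trade-offs concrete. A small sketch; the hit times, miss rates, and miss penalty below are hypothetical numbers for illustration, not figures from the lecture:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical comparison: a larger cache lowers the miss rate but may
# add a cycle of hit time; AMAT shows which effect wins for these numbers.
small = amat(hit_time=1, miss_rate=0.05, miss_penalty=20)  # 1 + 0.05*20 = 2.0
big   = amat(hit_time=2, miss_rate=0.02, miss_penalty=20)  # 2 + 0.02*20 = 2.4
print(small, big)  # for these numbers, the smaller, faster cache wins
```

This is why the "simplest design strategy" on the slide caps the cache at whatever size keeps hit time to 1-2 cycles: past that point the extra hit latency can outweigh the miss-rate savings.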

