UCR CS 162 - Lecture 11: Memory Hierarchy—Ways to Reduce Misses

Outline:
• Review: Who Cares About the Memory Hierarchy?
• The Goal: Illusion of Large, Fast, Cheap Memory
• Recap: Memory Hierarchy Pyramid
• Memory Hierarchy: Terminology
• Current Memory Hierarchy
• Memory Hierarchy: Why Does It Work? Locality!
• Memory Hierarchy Technology
• Introduction to Caches
• Caches
• Cache Organization
• Simplest Cache: Direct Mapped
• Issues with Direct-Mapped
• 64KB Cache with 4-Word (16-Byte) Blocks
• Direct-Mapped Cache Contd.
• Another Extreme: Fully Associative
• Fully Associative Cache
• Compromise: N-way Set Associative Cache
• Example: 2-Way Set Associative Cache
• Set Associative Cache Contd.
• Addressing the Cache
• Alpha 21264 Cache Organization
• Block Replacement Policy
• Review: Four Questions for Memory Hierarchy Designers

(Slides credited DAP Spr.'98 ©UCB.)

Review: Who Cares About the Memory Hierarchy?
[Figure: processor performance grows ~60%/yr ("Moore's Law") while DRAM performance grows ~7%/yr, plotted 1980-2000; the processor-memory performance gap grows ~50%/yr.]
• Thus far in the course we have considered the processor only: CPU cost/performance, ISA, pipelined execution. Now: the CPU-DRAM gap.
• 1980: no cache in microprocessors; 1995: two-level cache on chip (1989: first Intel microprocessor with an on-chip cache).

The Goal: Illusion of Large, Fast, Cheap Memory
• Fact: large memories are slow; fast memories are small.
• How do we create a memory that is large, cheap, and fast (most of the time)? A hierarchy of levels:
– Smaller and faster memory technologies close to the processor.
– Fast access time in the highest level of the hierarchy.
– Cheap, slow memory furthest from the processor.
• The aim of memory hierarchy design is an access time close to that of the highest level and a size equal to that of the lowest level.

Recap: Memory Hierarchy Pyramid
[Figure: pyramid with the processor (CPU) at the top and Levels 1, 2, 3, ..., n below it, connected by a bus datapath. Memory size grows at each level down; increasing distance from the CPU means increasing access time (memory latency) and decreasing cost per MB.]

Memory Hierarchy: Terminology
• Hit: the data appears in the upper level (block X). Hit rate: the fraction of memory accesses found in the upper level.
• Miss: the data must be retrieved from a block in the lower level (block Y). Miss rate = 1 - (hit rate).
• Hit time: time to access the upper level, which consists of the time to determine hit/miss plus the memory access time.
• Miss penalty: time to replace a block in the upper level plus the time to deliver the block to the processor.
• Note: hit time << miss penalty.

Current Memory Hierarchy
(The processor holds control, datapath, and registers; below it sit the L1 cache, L2 cache, main memory, and secondary memory.)
Level:        Regs     L1 cache  L2 cache  Main memory  Secondary memory
Speed (ns):   0.5      2         6         100          10,000,000
Size (MB):    0.0005   0.05      1-4       100-1000     100,000
Cost ($/MB):  --       $100      $30       $1           $0.05
Technology:   Regs     SRAM      SRAM      DRAM         Disk

Memory Hierarchy: Why Does It Work? Locality!
• Temporal locality (locality in time): keep the most recently accessed data items closer to the processor.
• Spatial locality (locality in space): move blocks consisting of contiguous words to the upper levels.
[Figure: blocks X and Y transferred between upper- and lower-level memory on the way to/from the processor; probability of reference plotted over the address space 0 to 2^n - 1.]

Memory Hierarchy Technology
• Random access ("random" is good): access time is the same for all locations.
– DRAM (Dynamic Random Access Memory): high density, low power, cheap, slow. Dynamic: must be "refreshed" regularly.
– SRAM (Static Random Access Memory): low density, high power, expensive, fast. Static: content lasts "forever" (until power is lost).
• "Not-so-random" access technology: access time varies from location to location and from time to time (examples: disk, CD-ROM).
• Sequential access technology: access time linear in location (e.g., tape).
• We will concentrate on random access technology: main memory uses DRAM, and caches use SRAM.

Introduction to Caches
• A cache:
– is a small, very fast memory (SRAM, expensive);
– contains copies of the most recently accessed memory locations (data and instructions): temporal locality;
– is fully managed by hardware (unlike virtual memory);
– organizes storage in blocks of contiguous memory locations: spatial locality;
– uses the cache block as the unit of transfer to/from main memory (or L2).
• General structure: n blocks per cache, organized in s sets, with b bytes per block; total cache size n*b bytes.

Caches
• For each block:
– an address tag: a unique identifier;
– state bits: (in)valid, modified;
– the data: b bytes.
• Basic cache operation: every memory access is first presented to the cache.
– Hit: the word being accessed is in the cache; it is returned to the CPU.
– Miss: the word is not in the cache. A whole block is fetched from memory (or L2), an "old" block is evicted (kicked out; which one?), the new block is stored in the cache, and the requested word is sent to the CPU.

Cache Organization
(1) How do you know if something is in the cache? (2) If it is in the cache, how do you find it?
• The answers to (1) and (2) depend on the type, or organization, of the cache.
• In a direct-mapped cache, each memory address is associated with exactly one possible block within the cache, so we only need to look in a single location for the data, if it exists in the cache.

Simplest Cache: Direct Mapped
[Figure: a 16-block main memory mapped onto a 4-block direct-mapped cache; memory block addresses 0010, 0110, 1010, and 1110 (binary) all map to cache index 10.]
• The index determines the block in the cache: index = (block address) mod (# cache blocks).
• If the number of cache blocks is a power of 2, the cache index is just the lower n bits of the memory block address, where n = log2(# blocks).
• A memory block address therefore splits into a tag and an index.

Issues with Direct-Mapped
• If the block size is greater than 1, the rightmost bits of the address are really the byte offset within the indexed block:
  tttttttttttttttt iiiiiiiiii oooo
  tag (to check for the correct block) | index (to select the block) | byte offset (within the block)

64KB Cache with 4-Word (16-Byte) Blocks
[Figure: a 32-bit address (bit positions 31 ... 0) split into a 16-bit tag (bits 31-16), a 12-bit index (bits 15-4) selecting one of 4K entries, a 2-bit block offset (bits 3-2) driving a mux that selects one of four 32-bit words from the 128-bit data field, and a 2-bit byte offset (bits 1-0). Each entry holds a valid bit, a 16-bit tag, and 128 bits of data; comparing the stored tag against the address tag produces the hit signal.]

Direct-Mapped Cache Contd.
• The direct-mapped cache is simple to design and its access time is fast (why?).
• Good for an L1 (on-chip) cache.
• Problem: conflict misses lead to a low hit ratio. Conflict misses are misses caused by accessing different memory locations that are mapped to the same cache block.
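The tag/index/offset split described above can be sketched in a few lines. This is a minimal illustration (not from the slides), parameterized by default for the 64KB cache with 16-byte blocks, i.e. 4K entries, a 4-bit offset, a 12-bit index, and a 16-bit tag:

```python
def split_address(addr, block_size=16, num_blocks=4096):
    """Split a memory address into (tag, index, offset) fields for a
    direct-mapped cache. Defaults match the 64KB / 16-byte-block example:
    offset = 4 bits, index = 12 bits, tag = remaining upper bits."""
    offset_bits = (block_size - 1).bit_length()   # log2(16) = 4
    index_bits = (num_blocks - 1).bit_length()    # log2(4096) = 12
    offset = addr & (block_size - 1)              # byte within the block
    index = (addr >> offset_bits) & (num_blocks - 1)  # which cache entry
    tag = addr >> (offset_bits + index_bits)      # identifies the block
    return tag, index, offset

tag, index, offset = split_address(0x12345678)
# tag = 0x1234 (bits 31-16), index = 0x567 (bits 15-4), offset = 0x8 (bits 3-0)
```

Both `block_size` and `num_blocks` must be powers of 2 for the bit-masking to be valid, which is exactly why the slides note that power-of-2 cache sizes make the index just the lower bits of the address.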

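The terminology slide defines hit time, miss rate, and miss penalty separately. The standard way to combine them (not stated in this preview, but conventional in the Hennessy-Patterson treatment this lecture follows) is the average memory access time:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: every access pays the hit time, and
    the missing fraction additionally pays the miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical numbers loosely based on the hierarchy table: a 2 ns L1
# hit time, a 5% miss rate, and a 100 ns main-memory miss penalty.
average_ns = amat(2, 0.05, 100)   # 2 + 0.05 * 100 = 7.0 ns
```

This makes the slide's note concrete: because hit time << miss penalty, even a small miss rate dominates the average, which is why the rest of the lecture focuses on ways to reduce misses.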

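The conflict misses described in the last slide are easy to reproduce with a toy model. The sketch below (hypothetical, not from the slides) tracks only tags for the 4-block direct-mapped cache from the earlier figure; two addresses four blocks apart evict each other on every access:

```python
class DirectMappedCache:
    """Toy direct-mapped cache: tracks tags only, no data."""

    def __init__(self, num_blocks=4, block_size=16):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.tags = [None] * num_blocks   # one tag slot per cache block

    def access(self, addr):
        """Return True on a hit, False on a miss (filling the block)."""
        block_addr = addr // self.block_size
        index = block_addr % self.num_blocks   # index = addr mod # blocks
        tag = block_addr // self.num_blocks
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag                 # evict whatever was there
        return False

cache = DirectMappedCache()
# Addresses 0 and 64 are exactly 4 blocks apart, so both map to index 0:
# they keep evicting each other, and every access is a conflict miss.
hits = [cache.access(a) for a in [0, 64, 0, 64]]   # all four miss
```

The same access pattern would hit on the repeats in a 2-way set-associative cache (the "compromise" listed in the outline), since both blocks could then reside in the set at once.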