U of I CS 232 - A simple cache design

Contents

A simple cache design
Four important questions
Where should we put data in the cache?
Least-significant bits
How can we find data in the cache?
Adding tags
Figuring out what's in the cache
One more detail: the valid bit
What happens on a cache hit
What happens on a cache miss
Loading a block into the cache
What if the cache fills up?
Memory System Performance
Average memory access time
Performance example
Spatial locality
Block addresses
Cache mapping
Data placement within a block
Locating data in the cache
Summary

A simple cache design

Caches are divided into blocks, which may be of various sizes.
- The number of blocks in a cache is usually a power of 2.
- For now we'll say that each block contains one byte. This won't take advantage of spatial locality, but we'll do that next time.

Here is an example cache with eight blocks, each holding one byte.

[Figure: an eight-block cache; an index column (000 through 111) selects a row, and each row holds 8 bits of data.]

Four important questions

1. When we copy a block of data from main memory to the cache, where exactly should we put it?
2. How can we tell if a word is already in the cache, or if it has to be fetched from main memory first?
3. Eventually, the small cache memory might fill up. To load a new block from main RAM, we'd have to replace one of the existing blocks in the cache... which one?
4. How can write operations be handled by the memory system?

Questions 1 and 2 are related: we have to know where the data is placed if we ever hope to find it again later!

Where should we put data in the cache?

A direct-mapped cache is the simplest approach: each main memory address maps to exactly one cache block.

For example, consider a 16-byte main memory and a 4-byte cache (four 1-byte blocks), as in the figure below. Memory locations 0, 4, 8 and 12 all map to cache block 0. Addresses 1, 5, 9 and 13 map to cache block 1, and so on. How can we compute this mapping?

[Figure: a 16-byte main memory (addresses 0-15) next to a 4-block cache (indices 0-3), with each memory address mapped to one cache index.]

Least-significant bits

If the cache contains 2^k blocks, then the data at memory address i would go to cache block index i mod 2^k.

An equivalent way to find the placement of a memory address in the cache is to look at the least significant k bits of the address. With our four-byte cache we would inspect the two least significant bits of our memory addresses. Again, you can see that address 14 (1110 in binary) maps to cache block 2 (10 in binary). Taking the least k bits of a binary value is the same as computing that value mod 2^k.

[Figure: the same 16-byte memory, now with 4-bit addresses (0000-1111), and the 4-block cache (indices 00-11), showing that the two low-order address bits select the cache index.]
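To make the mod/low-bits equivalence concrete, here is a minimal C sketch (not from the slides) that computes the cache block for every address of the 16-byte example memory both ways; the variable names and loop bounds are chosen just for this illustration.

#include <stdio.h>

int main(void)
{
    /* The example cache: 2^k blocks with k = 2, i.e. four 1-byte blocks. */
    const unsigned k = 2;
    const unsigned num_blocks = 1u << k;

    for (unsigned addr = 0; addr < 16; addr++) {
        unsigned by_mod  = addr % num_blocks;        /* block index as i mod 2^k      */
        unsigned by_mask = addr & (num_blocks - 1);  /* block index as the low k bits */
        printf("address %2u -> block %u (mod), block %u (mask)\n",
               addr, by_mod, by_mask);
    }
    /* Both methods agree; e.g. address 14 (1110) lands in block 2 (10). */
    return 0;
}

The mask form only works because the number of blocks is a power of two, which matches the assumption stated on the first slide.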
How can we find data in the cache?

The second question was how to determine whether or not the data we're interested in is already stored in the cache. If we want to read memory address i, we can use the mod trick to determine which cache block would contain i. But other addresses might also map to the same cache block. How can we distinguish between them? For instance, cache block 2 could contain data from addresses 2, 6, 10 or 14.

[Figure: the 4-block cache and 16-byte memory again, highlighting the four addresses that share cache index 10.]

Adding tags

We need to add tags to the cache, which supply the rest of the address bits to let us distinguish between different memory locations that map to the same cache block.

[Figure: the 4-block cache with a Tag column added alongside Index and Data; the example tags are 00, 11, 01 and 01.]

Figuring out what's in the cache

Now we can tell exactly which addresses of main memory are stored in the cache, by concatenating the cache block tags with the block indices.

  Index   Tag   Main memory address in cache block (tag + index)
  00      00    00 + 00 = 0000
  01      11    11 + 01 = 1101
  10      01    01 + 10 = 0110
  11      01    01 + 11 = 0111

One more detail: the valid bit

When the system starts up, the cache is empty and contains no valid data. We should account for this by adding a valid bit for each cache block.
- When the system is initialized, all the valid bits are set to 0.
- When data is loaded into a particular cache block, the corresponding valid bit is set to 1.

So the cache contains more than just copies of the data in memory; it also has bits to help us find data within the cache and verify its validity.

[Figure: the same cache with a Valid bit column (here 1, 0, 0, 1); only the blocks at indices 00 and 11 hold valid data, so only addresses 0000 and 0111 are actually present in the cache.]

What happens on a cache hit

When the CPU tries to read from memory, the address will be sent to a cache controller.
- The lowest k bits of the address will index a block in the cache.
- If the block is valid and the tag matches the upper (m - k) bits of the m-bit address, then that data will be sent to the CPU.

Here is a diagram of a 32-bit memory address and a 2^10-byte cache.

[Figure: a 32-bit address split into a 22-bit tag and a 10-bit index; the index selects one of the 1024 cache entries (Valid, Tag, Data), the stored tag is compared against the address tag, and on a match the data is sent to the CPU (Hit).]

What happens on a cache miss

The delays that we've been assuming for memories (e.g., 2 ns) really assume cache hits.
- If our CPU implementations accessed main memory directly, their cycle times would have to be much larger.
- Instead we assume that most memory accesses will be cache hits, which allows us to use a shorter cycle time.

However, a much slower main memory access is needed on a cache miss. The simplest thing to do is to stall the pipeline until the data from main memory can be fetched (and also copied into the cache).

Loading a block into the cache

After data is read from main memory, putting a copy of that data into the cache is straightforward.
- The lowest k bits of the address specify a cache block.
- The upper (m - k) address bits are stored in the block's tag field.
- The data from main memory is stored in the block's data field.
- The valid bit is set to 1.

[Figure: the same 32-bit address and 1024-entry cache, now showing the index selecting an entry and the tag, data and valid bit being written into it.]
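Putting the index, tag, valid bit, hit check and fill steps together, the following is a rough C sketch of the 1024-block, one-byte-block cache used in the diagrams above. All of the names here (cache_line, cache_read, mem_read_byte, main_memory) and the shrunken stand-in memory are invented for illustration; this shows one way the described behavior could look in software, not code from the course.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Example parameters from the notes: m = 32 address bits, k = 10 index bits,
   giving 2^10 one-byte blocks and a (32 - 10) = 22-bit tag. */
#define INDEX_BITS 10
#define NUM_BLOCKS (1u << INDEX_BITS)

typedef struct {
    bool     valid;  /* 0 after initialization, 1 once the block holds real data */
    uint32_t tag;    /* the upper (m - k) address bits                            */
    uint8_t  data;   /* a one-byte block                                          */
} cache_line;

/* Static storage is zero-initialized, so every valid bit starts at 0:
   the "empty cache" state described above. */
static cache_line cache[NUM_BLOCKS];

/* Stand-in for a (much slower) main memory; shrunk to 64 KB for this sketch. */
static uint8_t main_memory[1u << 16];
static uint8_t mem_read_byte(uint32_t addr)
{
    return main_memory[addr & ((1u << 16) - 1)];
}

/* Read one byte through the cache: report a hit, or fill the block on a miss. */
static uint8_t cache_read(uint32_t addr, bool *hit)
{
    uint32_t index = addr & (NUM_BLOCKS - 1);  /* lowest k bits of the address */
    uint32_t tag   = addr >> INDEX_BITS;       /* upper (m - k) bits            */
    cache_line *line = &cache[index];

    if (line->valid && line->tag == tag) {     /* valid block with matching tag  */
        *hit = true;
        return line->data;                     /* cache hit: data goes to the CPU */
    }

    /* Cache miss: fetch from main memory, overwrite whatever block was stored
       at this index, record the new tag, and set the valid bit. */
    *hit = false;
    line->data  = mem_read_byte(addr);
    line->tag   = tag;
    line->valid = true;
    return line->data;
}

int main(void)
{
    bool hit;
    uint8_t b = cache_read(0x1234, &hit);      /* first access: miss, block filled */
    printf("first access:  %s, data = %u\n", hit ? "hit" : "miss", (unsigned)b);
    b = cache_read(0x1234, &hit);              /* same address again: hit          */
    printf("second access: %s, data = %u\n", hit ? "hit" : "miss", (unsigned)b);
    return 0;
}

Note that the miss path doubles as the replacement policy for a direct-mapped cache: each address can live in exactly one block, so the newly fetched data simply overwrites whatever that block held, which is exactly the behavior the next section relies on.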
What if the cache fills up?

Our third question was what to do if we run out of space in our cache, or if we need to reuse a block for a different memory address.

We answered this question implicitly on the last page!
- A miss causes a new block to be loaded into the cache, automatically overwriting any previously stored data.
- This is a least recently used replacement policy, which assumes that older data is less likely to be requested than newer data.

Memory System Performance

To examine the performance of a memory system, we need to focus on a few important factors.
- How long does it take to send data from the cache to the CPU?
- How long does it take to copy data from memory into the cache?
- How often do we have to access main memory?

There are names for all of these variables.
- The hit time is how long it takes to send data from the cache to the CPU.

