Gordon CPS 311 - MEMORY HIERARCHIES

CS311 Lecture: Memory Hierarchies
October 26, 2009

Objectives:

1. To introduce cache memory
2. To introduce logical-physical address mapping
3. To introduce virtual memory

Materials

1. Memory System Demo
2. Files for demo: CAMCache1ByteLine.parameters, CAMCache8ByteLine.parameters, DMCache.parameters, SACache.parameters, WBCache.parameters, NonVirtualNoTLB.parameters, NonVirtualWithTLB.system, Virtual.parameters, Full.parameters

I. Introduction

   A. In the previous lecture, we looked at the basic building blocks of memory systems - the individual devices: chips, disks, tapes, etc. We now focus on complete memory systems.

   B. Since every instruction executed by the CPU requires at least one memory access (to fetch the instruction) and often more, the performance of memory has a strong impact on system performance. In fact, the overall system cannot perform better than its memory system.

      1. Note that we are distinguishing between the CPU and the memory system on a functional basis. The CPU accesses memory both when fetching an instruction, and as a result of executing instructions that reference memory (such as lw and sw on MIPS).

      2. We will consider the memory system to be a logical unit separate from the CPU, even though it is often the case that, for performance reasons, some portions of it physically reside on the same chip as the CPU.

   C. At one point in time, the speed of the dominant technology used for memory was well matched to the speed of the CPU. This, however, is not the case today (and has not been the case - in at least some portions of the computer system landscape - for decades).

      1. Consider the situation as it exists today.

         a) The dominant technology used for building main memories is dynamic RAM chips. However, DRAM chips have a typical access time of 60-80 nanoseconds, and a cycle time about twice that.
            Thus, if all instructions were contained in such memory, the rate at which instructions could be fetched from memory would be less than 20 million per second.

         b) But today’s CPUs are capable of executing instructions at rates in excess of 1 billion per second - a 50 to 1 (or worse) speed mismatch! If a main memory based on DRAM were the only option, there would have been no reason to build CPUs faster than those we had in the early 1990s!

      2. With present technologies, it does turn out to be possible to build very fast memories (that can deliver instructions as fast as the CPU can execute them), but only of very limited size.

         For example: static RAM on the same chip as the CPU can function at the same speed as other CPU components. However, static RAM consumes a lot of power, since one side of the flip-flop for each bit is always on, and this generates a lot of heat; therefore, the amount of static RAM that can be put on the CPU chip is relatively small (typically < 100 KB).

      3. On the other hand, it is also possible to build very large memories, but using technologies that are quite slow compared to that of the CPU.

         For example, hard disks can store 100s of GB of data for minimal cost. But hard disk is slow - a typical access time is about 10 ms, which is 10^7 times as long as a clock cycle on a 1 GHz CPU!

      4. This speed versus capacity and cost tradeoff has been true throughout most of the history of computer technology, even though specific devices have varied. [ If it ever became possible to produce memory that was both very fast and very large, this lecture topic would go away! ]

   D. We will see that memory systems are seldom composed of just one type of memory; instead, they are HIERARCHICAL systems composed of a mixture of technologies aimed at achieving a good tradeoff between speed, capacity, cost, and physical size.

      1. The goal is to produce an overall system that exhibits the speed of the fastest technology [ at least for most accesses! ] and the capacity of the largest technology.
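
The speed-mismatch figures in section C can be checked with a few lines of arithmetic. What follows is a minimal sketch in Python that uses only the numbers quoted in these notes (60-80 ns DRAM access time, a 1 GHz CPU executing about 1 billion instructions per second, and a 10 ms disk access). It assumes the DRAM cycle time is exactly twice the access time, where the notes say "about twice". All function and constant names are hypothetical, chosen only for illustration.

```python
# Illustrative arithmetic only; every name here is hypothetical.
NS_PER_SECOND = 1_000_000_000

CPU_INSTRUCTIONS_PER_SECOND = 1_000_000_000   # "in excess of 1 billion per second"
CLOCK_PERIOD_NS = 1                           # one clock cycle on a 1 GHz CPU
DISK_ACCESS_NS = 10_000_000                   # 10 ms expressed in nanoseconds


def dram_cycle_ns(access_ns):
    """Assumption: cycle time is exactly twice the access time."""
    return 2 * access_ns


def fetches_per_second(access_ns):
    """Instruction fetches per second if every fetch costs one full DRAM cycle."""
    return NS_PER_SECOND // dram_cycle_ns(access_ns)


def speed_mismatch(access_ns):
    """Instructions the CPU could execute in the time of one DRAM cycle."""
    return CPU_INSTRUCTIONS_PER_SECOND * dram_cycle_ns(access_ns) // NS_PER_SECOND


def disk_access_in_clock_cycles():
    """How many CPU clock cycles fit into a single disk access."""
    return DISK_ACCESS_NS // CLOCK_PERIOD_NS


if __name__ == "__main__":
    for access_ns in (60, 80):
        print(access_ns, fetches_per_second(access_ns), speed_mismatch(access_ns))
    print(disk_access_in_clock_cycles())
```

Under these assumptions the fetch rate comes out between roughly 6 and 8 million per second, which is under the 20 million bound, and the mismatch is 120 to 1 or worse, consistent with "50 to 1 (or worse)". The disk figure is 10,000,000 clock cycles per access, i.e. 10^7.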
      2. This goal is achievable because, at any point in time, a program is typically only accessing a very small subset of the total memory it uses - a principle known as LOCALITY.

         a) TEMPORAL LOCALITY: most of a program’s references to memory are to locations it has accessed recently.

            This is because a program spends most of its time executing various loops. While executing a loop, the program keeps executing a relatively small group of instructions over and over, and typically accesses a small set of data items over and over as well.

         b) SPATIAL LOCALITY: if a program references a location in memory, it is likely to reference other locations that are physically near it.

            This arises because instructions occur sequentially in memory, and data structures such as objects and arrays occupy sequential locations in memory.

      3. Thus, the instructions and data the program is currently using are typically small enough to be kept in the fastest kind of memory.

   E. A typical computer system may have three basic kinds of memory, which constitute different levels in the hierarchy:

            Cache memory                          (small but fast)
            Main memory (typically called RAM)
            Paging file                           (very large but slow)

      1. Different technologies are used for each level.

         a) Main memory is typically implemented using DRAM.

         b) Cache memory is typically implemented using SRAM - sometimes on the same chip as the CPU; sometimes on separate chips. Today’s systems often have two or even three levels of cache (called L1, L2, and sometimes L3 cache). (However, we will develop some of our examples using just a single level of cache for simplicity.)

         c) The paging file resides on disk.

      2. Only one of these levels is actually necessary - main memory.

         a) Historically, at one time personal computers needed only this.

         b) Today, embedded systems often have only this.

         c) In fact, from an interface standpoint, the memory system is made to look to the CPU as if it were entirely main memory.
            (Cache speeds things up, and virtual memory makes the memory system appear larger than main memory physically is - but the overall system looks like RAM to the CPU.)

      3. Each item has a logical address, whose size is dictated by the ISA of the CPU. (For example, if the CPU is a 32-bit CPU, then the logical address of an item is a 32-bit number.)

            [Figure: CPU and Memory System, connected by Logical Address and Data]

         a) If the memory system consisted only of one level, then the physical address of an item in main memory would be the same as its logical address. (And a logical address which did not correspond to any physical address would be an error - e.g. if a certain system had only 1 GB of main memory, then any address greater than 0x3fffffff
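
The one-level case described in (a) can be sketched in a few lines of Python. This is a minimal illustration only, assuming a 32-bit logical address and 1 GB of main memory that starts at physical address 0. The names (translate_one_level, AddressFault, and the two constants) are hypothetical and are not taken from the Memory System Demo.

```python
ADDRESS_BITS = 32                 # logical address width of a 32-bit CPU
MAIN_MEMORY_BYTES = 1 << 30       # assumed: 1 GB of main memory (2**30 bytes)


class AddressFault(Exception):
    """Raised when a logical address has no corresponding physical address."""


def translate_one_level(logical_address):
    """Map a logical address to a physical address in a one-level memory system."""
    if not 0 <= logical_address < (1 << ADDRESS_BITS):
        raise ValueError("not a valid 32-bit logical address")
    if logical_address >= MAIN_MEMORY_BYTES:
        # No physical location exists for this address.
        raise AddressFault(hex(logical_address))
    # With only one level, the physical address equals the logical address.
    return logical_address


if __name__ == "__main__":
    print(hex(translate_one_level(0x3FFFFFFF)))   # last byte of the 1 GB
    try:
        translate_one_level(0x40000000)           # first address past 1 GB
    except AddressFault as fault:
        print("no physical location for", fault)
```

With 1 GB installed at address 0, 0x3fffffff is the highest logical address that maps to a physical location. Later levels of the hierarchy replace the identity mapping in the last line of the function with a real translation step.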

