Memory Hierarchy
Professor Jennifer Rexford
http://www.cs.princeton.edu/~jrex

Lecture outline:
• Motivation for Memory Hierarchy
• Simple Three-Level Hierarchy
• Widening Processor/Memory Gap
• An Example Memory Hierarchy
• Locality of Reference
• Locality Makes Caching Effective
• Caching in a Memory Hierarchy
• Cache Block Sizes
• Cache Hit and Miss
• Three Kinds of Cache Misses
• Cache Replacement
• Who Manages the Cache?
• Manual Allocation: Segmentation
• Automatic Allocation: Virtual Memory
• Making Good Use of Memory and Disk
• Virtual Address for a Process
• Virtual Memory for a Process
• Page Table to Manage the Cache
• “Miss” Triggers Page Fault Exception
• OS Handles the Page Fault
• VM as a Tool for Memory Protection
• Sharing Physical Memory
• Process-ID and Page Table Entries
• Page Tables in OS Memory...
• Measuring the Memory Usage
• VM as a Tool for Memory Management
• Conclusion

Slide 2: Goals of Today’s Lecture
• Memory hierarchy
  – From fast/expensive/small to slow/cheap/big memory technology
  – Registers, on-chip cache, off-chip cache, main memory, disk, tape
• Locality of reference
  – Spatial and temporal locality, of program data and instructions
  – Caching to store a small number of recently used memory blocks
• Virtual memory
  – Separation of virtual addresses from physical memory locations
  – Main memory as a cache of virtual pages from the disk
  – Memory protection from misbehaving user processes

Slide 3: Motivation for Memory Hierarchy
• Faster storage technologies are more costly
  – They cost more money per byte
  – They have lower storage capacity
  – They require more power and generate more heat
• The gap between processor and memory speed is widening
  – Processors have been getting faster and faster
  – Main memory speed is not improving as dramatically
• Well-written programs tend to exhibit good locality
  – Across time: repeatedly referencing the same variables
  – Across space: often accessing other variables located nearby

We want the speed of fast storage at the cost and capacity of slow storage.
Key idea: memory hierarchy!

Slide 4: Simple Three-Level Hierarchy
• Registers
  – Usually reside directly on the processor chip
  – Essentially no latency; referenced directly in instructions
  – Low capacity (e.g., 32-512 bytes)
• Main memory
  – Around 100 times slower than a processor clock cycle
  – Constant access time for any memory location
  – Modest capacity (e.g., 512 MB-2 GB)
• Disk
  – Around 100,000 times slower than main memory
  – Faster when accessing many bytes in a row
  – High capacity (e.g., 200 GB)

Slide 5: Widening Processor/Memory Gap
• The speed gap grew from 1986 to 2000
  – CPU speed improved ~55% per year
  – Main memory speed improved only ~10% per year
• Main memory is a major performance bottleneck
  – Many programs stall waiting for reads and writes to finish
• Resulting changes in the memory hierarchy
  – More registers (8 integer registers in the x86 vs. 128 in the Itanium)
  – Caches added between registers and main memory (on-chip level-1 cache and off-chip level-2 cache)

Slide 6: An Example Memory Hierarchy
(The original slide is a pyramid: smaller, faster, costlier-per-byte storage devices at the top; larger, slower, cheaper-per-byte devices at the bottom.)
• L0: CPU registers hold words retrieved from the L1 cache
• L1: on-chip L1 cache (SRAM) holds cache lines retrieved from the L2 cache
• L2: off-chip L2 cache (SRAM) holds cache lines retrieved from main memory
• L3: main memory (DRAM) holds disk blocks retrieved from local disks
• L4: local secondary storage (local disks) holds files retrieved from disks on remote network servers
• L5: remote secondary storage (tapes, distributed file systems, Web servers)

Slide 7: Locality of Reference
• Two kinds of locality
  – Temporal locality: recently referenced items are likely to be referenced in the near future
  – Spatial locality: items with nearby addresses tend to be referenced close together in time
• Locality example (the loop below)
  – Program data
    – Temporal: the variable sum
    – Spatial: element a[i+1] accessed soon after a[i]
  – Instructions
    – Temporal: cycle through the for-loop repeatedly
    – Spatial: reference instructions in sequence

  sum = 0;
  for (i = 0; i < n; i++)
      sum += a[i];
  return sum;
Slide 8: Locality Makes Caching Effective
• Cache
  – A smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device
• Caching and the memory hierarchy
  – The storage device at level k is a cache for level k+1
  – Registers as a cache of the L1/L2 cache and main memory
  – Main memory as a cache for the disk
  – Disk as a cache of files from remote storage
• Locality of access is the key
  – Most accesses are satisfied by the first few (faster) levels
  – Very few accesses go down to the last few (slower) levels

Slide 9: Caching in a Memory Hierarchy
(The original slide is a figure: the smaller, faster, more expensive device at level k caches a subset of the blocks from level k+1; the larger, slower, cheaper device at level k+1 is partitioned into blocks, and data is copied between levels in block-sized transfer units.)

Slide 10: Cache Block Sizes
• Fixed vs. variable size
  – Fixed-size blocks are easier to manage (the common case)
  – Variable-size blocks make more efficient use of storage
• Block size
  – Depends on access times of the level-(k+1) device
  – Larger block sizes further down in the hierarchy
  – E.g., disk seek times are slow, so disk pages are larger
• Examples
  – CPU registers: 4-byte words
  – L1/L2 cache: 32-byte blocks
  – Main memory: 4 KB pages
  – Disk: entire files

Slide 11: Cache Hit and Miss
• Cache hit
  – The program accesses a block available in the cache
  – Satisfy the access directly from the cache
  – E.g., a request for block 10, which is already cached
• Cache miss
  – The program accesses a block not available in the cache
  – Bring the item into the cache
  – E.g., a request for block 13, which is not cached
• Where to place the item?
• Which item to evict?

Slide 12: Three Kinds of Cache Misses
• Cold (compulsory) miss
  – Occurs because the block has not been accessed before
  – E.g., the first time a segment of code is executed
  – E.g., the first time a particular array is referenced
• Capacity miss
  – The set of active cache blocks (the “working set”) is larger than the cache
  – E.g., manipulating a 1200-byte array within a 1000-byte cache
• Conflict miss
  – Some caches limit the
    locations where a block can be stored
  – E.g., block i must be placed in cache location (i mod 4)
  – Conflicts occur when multiple blocks map to the same location(s)
  – E.g., referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time

Slide 13: Cache Replacement
• Evicting a block from the cache
  – A new block must be brought into the cache
  – Must ...