Unformatted text preview:

Systems I Locality and Caching Topics Locality of reference Cache principles Multi level caches Locality Principle of Locality Programs tend to reuse data and instructions near those they have used recently or that were recently referenced themselves Temporal locality Recently referenced items are likely to be referenced in the near future Spatial locality Items with nearby addresses tend to be referenced close together in time Locality Example sum 0 for i 0 i n i sum a i return sum Data Reference array elements in succession stride 1 reference pattern Spatial locality Reference sum each iteration Temporal locality Instructions Reference instructions in sequence Spatial locality Cycle through loop repeatedly Temporal locality 2 Locality Example Claim Being able to look at code and get a qualitative sense of its locality is a key skill for a professional programmer Question Does this function have good locality int sumarrayrows int a M N int i j sum 0 for i 0 i M i for j 0 j N j sum a i j return sum 3 Locality Example Question Does this function have good locality int sumarraycols int a M N int i j sum 0 for j 0 j N j for i 0 i M i sum a i j return sum 4 Locality Example Question Can you permute the loops so that the function scans the 3 d array a with a stride 1 reference pattern and thus has good spatial locality int sumarray3d int a M N N int i j k sum 0 for i 0 i M i for j 0 j N j for k 0 k N k sum a k i j return sum 5 Memory Hierarchies Some fundamental and enduring properties of hardware and software Fast storage technologies cost more per byte and have less capacity The gap between CPU and main memory speed is widening Well written programs tend to exhibit good locality These fundamental properties complement each other beautifully They suggest an approach for organizing memory and storage systems known as a memory hierarchy 6 An Example Memory Hierarchy Smaller faster and costlier per byte storage devices L0 registers L1 on chip L1 cache SRAM L2 L3 Larger slower and cheaper per byte storage devices L5 CPU registers hold words retrieved from L1 cache L4 off chip L2 cache SRAM L1 cache holds cache lines retrieved from the L2 cache memory L2 cache holds cache lines retrieved from main memory main memory DRAM Main memory holds disk blocks retrieved from local disks local secondary storage local disks Local disks hold files retrieved from disks on remote network servers remote secondary storage distributed file systems Web servers 7 Caches Cache A smaller faster storage device that acts as a staging area for a subset of the data in a larger slower device Fundamental idea of a memory hierarchy For each k the faster smaller device at level k serves as a cache for the larger slower device at level k 1 Why do memory hierarchies work Programs tend to access the data at level k more often than they access the data at level k 1 Thus the storage at level k 1 can be slower and thus larger and cheaper per bit Net effect A large pool of memory that costs as much as the cheap storage near the bottom but that serves data to programs at the rate of the fast storage near the top Use combination of small fast memory and big slow memory to give illusion of big fast memory 8 Caching in a Memory Hierarchy Level k 8 4 9 10 4 Level k 1 14 10 3 Smaller faster more expensive device at level k caches a subset of the blocks from level k 1 Data is copied between levels in block sized transfer units 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Larger slower cheaper storage device at level k 1 is partitioned into blocks 9 General Caching Concepts 14 12 Level k Request 12 14 0 1 2 3 Cache hit 4 12 9 14 3 12 4 Level k 1 Program needs object d which is stored in some block b Program finds b in the cache at level k E g block 14 Cache miss Request 12 0 1 2 3 4 4 5 6 7 8 9 10 11 12 13 14 15 b is not at level k so level k cache must fetch it from level k 1 E g block 12 If level k cache is full then some current block must be replaced evicted Which one is the victim Placement policy where can the new block go E g b mod 4 Replacement policy which block should be evicted E g LRU 10 General Caching Concepts Types of cache misses Cold compulsary miss Cold misses occur because the cache is empty Conflict miss Most caches limit blocks at level k 1 to a small subset sometimes a singleton of the block positions at level k E g Block i at level k 1 must be placed in block i mod 4 at level k 1 Conflict misses occur when the level k cache is large enough but multiple data objects all map to the same level k block E g Referencing blocks 0 8 0 8 0 8 would miss every time Capacity miss Occurs when the set of active cache blocks working set is larger than the cache 11 Examples of Caching in the Hierarchy Cache Type What Cached Where Cached Registers 4 byte word CPU registers 0 Compiler TLB Address translations 32 byte block 32 byte block 4 KB page On Chip TLB 0 Hardware On Chip L1 Off Chip L2 Main memory Parts of files Main memory 1 Hardware 10 Hardware 100 Hardware OS 100 OS L1 cache L2 cache Virtual Memory Buffer cache Network buffer Parts of files cache Browser Web pages cache Web cache Web pages Local disk Local disk Remote server disks Latency cycles Managed By 10 000 000 AFS NFS client 10 000 000 Web browser 1 000 000 000 Web proxy server 12 Cache Memories Cache memories are small fast SRAM based memories managed automatically in hardware Hold frequently accessed blocks of main memory CPU looks first for data in L1 then in L2 then in main memory Typical bus structure CPU chip register file L1 cache cache bus L2 cache ALU system bus memory bus bus interface I O bridge main memory 13 Inserting an L1 Cache Between the CPU and Main Memory The tiny very fast CPU register file has room for four 4 byte words The transfer unit between the CPU register file and the cache is a 4 byte block line 0 The small fast L1 cache has room for two 4 word blocks line 1 The transfer unit between the cache and main memory is a 4 word block 16 bytes block 10 abcd block 21 pqrs block 30 The big slow main memory has room for many 4 word blocks wxyz 14 Multi Level Caches Options separate data and instruction caches or a unified …


View Full Document

UT CS 429H - Systems I Locality and Caching

Loading Unlocking...
Login

Join to view Systems I Locality and Caching and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Systems I Locality and Caching and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?