UW CSE 332 - Memory Hierarchy - D439552

Home> Schools> University of Washington> Computer Science and Engineering (CSE) > CSE 332> Memory Hierarchy

DOC PREVIEW

UW CSE 332 - Memory Hierarchy

School name University of Washington

Course Cse 332- Data Abstractions

Pages 10

This preview shows page 1-2-3 out of 10 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CSE332: Data AbstractionsLecture 8: Memory HierarchyTyler RobisonSummer 20101Now what?2 We have a data structure for the dictionary ADT that has worst-case O(log n) behavior One of several interesting/fantastic balanced-tree approaches We are about to learn another balanced-tree approach: B Trees First, to motivate why B trees are better for really large dictionaries (say, over 1GB = 230bytes), need to understand some memory-hierarchy basics Don’t always assume “every memory access has an unimportant O(1) cost” Learn more in CSE351/333/471 (and CSE378), focus here on relevance to data structures and efficiencyA typical hierarchy3“Every desktop/laptop/server is different” but here is a plausible configuration these daysCPUDisk: 1TB = 240Main memory: 2GB = 231L2 Cache: 2MB = 221L1 Cache: 128KB = 217instructions (e.g., addition): 230/secget data in L1: 229/sec = 2 instructionsget data in L2: 225/sec = 30 inst get data in main memory:222/sec = 250 instget data from “new place” on disk:27/sec =8,000,000 inst“streamed”: 218/secMorals4It is much faster to do: Than:5 million arithmetic ops 1 disk access2500 L2 cache accesses 1 disk access400 main memory accesses 1 disk accessWhy are computers built this way? Physical realities (speed of light, closeness to CPU) Cost (price per byte of different technologies) Disks get much bigger not much faster Spinning at 7200 RPM accounts for much of the slowness and unlikely to spin faster in the future Speedup at higher levels makes lower levels relativelyslower Later in the course: more than 1 CPU!“Fuggedaboutit”, usually5The hardware automatically moves data into the caches from main memory for you Replacing items already there So algorithms much faster if “data fits in cache” (often does)Disk accesses are done by software (e.g., ask operating system to open a file or database to access some data)So most code “just runs” but sometimes it’s worth designing algorithms / data structures with knowledge of memory hierarchy And when you do, you often need to know one more thing…Block/line size6 Moving data up the memory hierarchy is slow because of latency (think distance-to-travel) Since we’re making the trip anyway, may as well carpool Get a block of data in the same time it would take to get a byte What to send? How about nearby memory: It’s easy (close by) And likely to be asked for soon (spatial locality) Side note: Once in cache, may as well keep it around for awhile; accessed once, a value is more likely to be accessed again in the near future (more likely than some random other value): temporal localityBlock/line size7 The amount of data moved from disk into memory is called the “block” size or the “(disk) page” size Not under program control The amount of data moved from memory into cache is called the “line” size As in “cache line” Not under program control Not under our control, but good to be aware ofConnection to data structures8 An array benefits more than a linked list from block moves Language (e.g., Java) implementation can put the linked list nodes anywhere, whereas array is typically contiguous memory Arrays benefit more from spatial locality Note: “array” doesn’t mean “good” Sufficiently large array won’t fit in one block Binary heaps “make big jumps” to percolate (different block)BSTs?9 Since looking things up in balanced binary search trees is O(log n), even for n = 239 (512GB) we don’t have to worry about minutes or hours Still, number of disk accesses matters AVL tree could have height of, say, 55 Which, based on our proof, is a lot of nodes Most of the nodes will be on disk: the tree is shallow, but it is still many gigabytes big so the tree cannot fit in memory Even if memory holds the first 25 nodes on our path, we still need 30 disk accessesNote about numbers; moral10 All the numbers in this lecture are “ballpark” “back of the envelope” figures Even if they are off by, say, a factor of 5, the moral is the same: If your data structure is mostly on disk, you want to minimize disk accesses A better data structure in this setting would exploit the block size to avoid disk

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3 out of 10 pages.

UW CSE 332 - Memory Hierarchy

Sign up for free to view:

Please select your school