UMD CMSC 411 - Lecture 18 Storage Systems 2

CMSC 411 Computer Systems Architecture – Lecture 18: Storage Systems 2
(12/1/2009; some slides from Patterson, Sussman, and others)

I/O performance measures
• diversity: which I/O devices can connect to the system?
• capacity: how many I/O devices can connect to the system?
• bandwidth: throughput, or how much data can be moved per unit time
• latency: response time, the interval between a request and its completion
• High throughput usually means slow response time!

Throughput vs. latency
• See Fig. 6.8 and Fig. 6.9

Improving performance (cont.)
• Adding another server can decrease response time, if the workload is held constant
– but keeping the work balanced between servers is difficult
• To design a responsive system, you must understand what the "typical" user wants to do with it
• Each transaction consists of three parts:
– entry time: the time for the user to make the request
– system response time: the latency
– think time: the time between the system response and the next entry
• Key observation: a faster system produces a lower think time – see Fig. 6.10

Modeling computer performance
• The usual way to model computer performance is queuing theory (mathematics again)
• Unfortunately, even queuing theory does not provide a very good model, so more complicated mathematics is now being applied (e.g., stochastic differential equations)
• But H&P only consider queuing models
– and we don't even have time to go into that now (maybe later)

Data Management Issues
• Two concerns we'll talk about:
– stale data
– DMA design
• The book has short discussions of several more, including:
– asynchronous I/O through the OS
– file systems: the server manages blocks and maintains metadata, vs. the disks doing it while the server uses a file system protocol such as NFS (also called NAS – network-attached storage)

Stale Data
• We may have copies of data in:
– cache
– memory
– disk
• We need to make sure we always use the most recent version
– for use in the CPU
– for output
• There are two approaches to the problem, both with disadvantages

Stale Data (cont.)
• Approach 1: attach the I/O bus to the cache
• Advantage: no stale-data problem, since the CPU and I/O devices all see the copy in the cache
• Disadvantages:
– All I/O data must go through the cache, even if the CPU doesn't need it, so performance is reduced
– The CPU and the I/O bus must take turns accessing the cache, so arbitration hardware is required

Stale Data (cont.)
• Approach 2: attach the I/O bus to the memory (Fig. 7.15 of H&P 3rd ed.)
• Advantage: I/O does not slow down the CPU
• Disadvantages:
– The I/O system may see stale data, unless we use write-through
– The CPU might see stale data if the I/O system modifies memory after the cache copied it
– Extra hardware is required to check whether I/O data is currently held in the cache

DMA design
• Direct memory access hardware needs to use either:
– virtual addresses, or
– physical addresses
• Using physical addresses:
– If the data is longer than a page, several addresses need to be passed
– The data may be relocated by the operating system, changing the physical address
• Virtual addresses give a cleaner design

Designing an I/O System
• Price, performance, and capacity issues
• Need to choose:
– which I/O devices to connect
– how to connect them
• Example: the CPU is seldom the limiting factor for I/O performance
• Suppose the CPU can handle 10,000 I/O operations per second (IOPS)
• And suppose the average I/O size is 16 KB

I/O Systems
• The other links in the I/O chain are:
– the I/O controller: suppose it adds 1 ms of overhead per I/O operation
– the I/O bus: suppose it can transfer 20 MB/sec = 20 KB/ms
– the disk: suppose it rotates at 7200 RPM, with an 8 ms average seek time and a 6 MB/sec transfer rate

I/O System Performance
• Consider the disk time first:
– 7200 RPM = 7200/(60×10³) = 0.12 revolutions per ms
– 6 MB/sec = 6 KB/ms
– So the average disk time is seek + rotational latency + transfer = 8 ms + 0.5/0.12 ms + 16/6 ms = 14.9 ms
• So the average time per transfer is:
– I/O controller time + bus time + disk time = 1 ms + 16/20 ms + 14.9 ms = 16.7 ms
• So with one controller, one bus, and one disk, we can do at most 1/(16.7×10⁻³) = 60 IOPS
• If this is not good enough, we should analyze whether it is better to add more controllers, more buses, or more disks
• Another, more complex, performance analysis appears in Section 6.7, for the Internet Archive cluster

Storage Example: Internet Archive
• Goal: make a historical record of the Internet
– The Internet Archive began in 1996
– The Wayback Machine interface performs "time travel" to see what the website at a URL looked like in the past
• It contains over a petabyte (10¹⁵ bytes) and is growing by 20 terabytes (10¹² bytes) of new data per month
• In addition to storing the historical record, the same hardware is used to crawl the Web every few months to get snapshots of the Internet

Internet Archive Cluster
• 1U storage node: PetaBox GB2000 from Capricorn Technologies
– Contains four 500 GB Parallel ATA (PATA) disk drives, 512 MB of DDR266 DRAM, one 10/100/1000 Ethernet interface, and a 1 GHz C3 processor from VIA (80x86)
– Each node dissipates ≈ 80 watts
• 40 GB2000s fit in a standard VME rack, for 80 TB of raw storage capacity
• The 40 nodes are connected with a 48-port 10/100 or 10/100/1000 Ethernet switch
• 1 petabyte = 12 racks

Estimated Cost
• VIA processor, 512 MB of DDR266 DRAM, ATA disk
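The single-disk arithmetic in the "I/O System Performance" slide above is easy to re-derive in a few lines. This sketch plugs in the example's figures (1 ms controller overhead, a 20 MB/s bus, a 7200 RPM disk with 8 ms average seek and 6 MB/s transfer, 16 KB average I/O); function and parameter names are mine, not from the slides.

```python
# Re-derive the slides' average single-disk I/O time and the resulting IOPS.
def avg_io_time_ms(io_kb=16.0, controller_ms=1.0, bus_kb_per_ms=20.0,
                   rpm=7200, seek_ms=8.0, disk_kb_per_ms=6.0):
    revs_per_ms = rpm / (60 * 1000)      # 7200 RPM = 0.12 revolutions per ms
    rotation_ms = 0.5 / revs_per_ms      # wait half a revolution on average
    disk_ms = seek_ms + rotation_ms + io_kb / disk_kb_per_ms
    return controller_ms + io_kb / bus_kb_per_ms + disk_ms

t = avg_io_time_ms()
print(round(t, 1))    # 16.6 ms (the slides round the disk term to 14.9 ms and quote 16.7 ms)
print(int(1000 / t))  # 60 IOPS max with one controller, one bus, one disk
```

Note that 60 IOPS is far below the CPU's 10,000 IOPS, which is the slide's point: the disk, not the CPU, is the bottleneck here.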
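The "high throughput usually means slow response time" observation can be made concrete with the simplest queuing-theory model, an M/M/1 queue. This is my illustration, not something derived in the slides (which defer queuing models to H&P): mean response time is T = 1/(μ − λ), so latency explodes as the arrival rate λ approaches the service rate μ.

```python
# Hedged sketch: M/M/1 mean response time as utilization rises.
def mm1_response_time_ms(arrivals_per_ms, services_per_ms):
    """Mean time in system for an M/M/1 queue (requires load < 100%)."""
    assert arrivals_per_ms < services_per_ms, "unstable at >= 100% utilization"
    return 1.0 / (services_per_ms - arrivals_per_ms)

mu = 0.06  # service rate: the example disk's ~60 IOPS = 0.06 completions/ms
for load in (0.2, 0.5, 0.9, 0.99):
    t = mm1_response_time_ms(load * mu, mu)
    print(f"{load:.0%} utilization -> mean response {t:.0f} ms")
```

Pushing the disk from 50% to 99% utilization roughly doubles throughput but multiplies mean latency fiftyfold, which is why throughput and response-time goals pull system designs in opposite directions.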
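The Internet Archive cluster figures on the slide (four 500 GB drives per node, 40 nodes per rack, 12 racks per petabyte) can be sanity-checked with decimal units, which is how the slide counts bytes:

```python
# Sanity-check the PetaBox rack arithmetic from the slide (decimal units).
GB = 10**9
drives_per_node, drive_gb = 4, 500
nodes_per_rack, racks = 40, 12

node_bytes = drives_per_node * drive_gb * GB   # 2 TB per node
rack_bytes = nodes_per_rack * node_bytes       # 80 TB of raw storage per rack
total_bytes = racks * rack_bytes               # 960 TB, rounded to "1 PB" on the slide

print(rack_bytes // 10**12)   # 80 (TB per rack)
print(total_bytes / 10**15)   # 0.96 (PB across 12 racks)
```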

