UMD CMSC 411 - Lecture 3 Reliability and Performance

CMSC 411 Computer Systems Architecture
Lecture 3: Reliability and Performance
Alan [email protected]
CMSC 411 - 3 (from Patterson)

Administrivia
• Homework problems for Unit 1 posted
  – due next Thursday, 2/12
• Read Appendix A – Basic Pipelining
• Appendix B useful as a MIPS ISA reference
  – worth reading, but we’ll only touch on parts of it in lecture as needed

Outline
• Moore’s Law and Technology Trends
• Reliability and MTTF
• Performance
• MIPS – An ISA for Pipelining
  – 5 stage pipelining
  – Structural and Data Hazards
  – Forwarding
  – Branch Schemes
  – Exceptions and Interrupts
  – Conclusion

Moore’s Law: 2X transistors / “year”
• “Cramming More Components onto Integrated Circuits”
  – Gordon Moore, Electronics, 1965
• # of transistors / cost-effective integrated circuit doubles every N months (12 ≤ N ≤ 24)

Tracking Technology Performance Trends
• Drill down into 4 technologies:
  – Disks
  – Memory
  – Network
  – Processors
• Compare ~1980 Archaic (Nostalgic) vs. ~2000 Modern (Newfangled)
  – Performance Milestones in each technology
• Compare for Bandwidth vs. Latency improvements in performance over time
• Bandwidth: number of events per unit time
  – E.g., Mbits / second over network, Mbytes / second from disk
• Latency: elapsed time for a single event
  – E.g., one-way network delay in microseconds, average disk access time in milliseconds

Disks: Archaic (Nostalgic) v. Modern (Newfangled)
CDC Wren I, 1983
• 3600 RPM
• 0.03 GBytes capacity
• Tracks/Inch: 800
• Bits/Inch: 9550
• Three 5.25” platters
• Bandwidth: 0.6 MBytes/sec
• Latency: 48.3 ms
• Cache: none
Seagate 373453, 2003
• 15000 RPM (4X)
• 73.4 GBytes (2500X)
• Tracks/Inch: 64000 (80X)
• Bits/Inch: 533,000 (60X)
• Four 2.5” platters (in 3.5” form factor)
• Bandwidth: 86 MBytes/sec (140X)
• Latency: 5.7 ms (8X)
• Cache: 8 MBytes

Latency Lags Bandwidth (for last ~20 years)
• Performance Milestones
• Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)
[Figure: log-log plot of Relative BW Improvement vs. Relative Latency Improvement, with a reference line where Latency improvement = Bandwidth improvement. Latency = simple operation w/o contention; BW = best-case.]

Memory: Archaic (Nostalgic) v. Modern (Newfangled)
1980 DRAM (asynchronous)
• 0.06 Mbits/chip
• 64,000 xtors, 35 mm²
• 16-bit data bus per module, 16 pins/chip
• 13 Mbytes/sec
• Latency: 225 ns
• (no block transfer)
2000 Double Data Rate Synchr. (clocked) DRAM
• 256.00 Mbits/chip (4000X)
• 256,000,000 xtors, 204 mm²
• 64-bit data bus per DIMM, 66 pins/chip (4X)
• 1600 Mbytes/sec (120X)
• Latency: 52 ns (4X)
• Block transfers (page mode)

Latency Lags Bandwidth (last ~20 years)
• Performance Milestones
• Memory Module: 16bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x)

LANs: Archaic (Nostalgic) v. Modern (Newfangled)
Ethernet 802.3
• Year of Standard: 1978
• 10 Mbits/s link speed
• Latency: 3000 μsec
• Shared media
• Coaxial cable
Ethernet 802.3ae
• Year of Standard: 2003
• 10,000 Mbits/s (1000X) link speed
• Latency: 190 μsec (15X)
• Switched media
• Category 5 copper wire
[Figure: cable cross-sections. Coaxial Cable: plastic covering, braided outer conductor, insulator, copper core. Twisted Pair: copper, 1mm thick, twisted to avoid antenna effect; "Cat 5" is 4 twisted pairs in bundle.]

Latency Lags Bandwidth (last ~20 years)
• Performance Milestones
• Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x, 1000x)

CPUs: Archaic (Nostalgic) v. Modern (Newfangled)
1982 Intel 80286
• 12.5 MHz
• 2 MIPS (peak)
• Latency 320 ns
• 134,000 xtors, 47 mm²
• 16-bit data bus, 68 pins
• Microcode interpreter, separate FPU chip
• (no caches)
2001 Intel Pentium 4
• 1500 MHz (120X)
• 4500 MIPS (peak) (2250X)
• Latency 15 ns (20X)
• 42,000,000 xtors, 217 mm²
• 64-bit data bus, 423 pins
• 3-way superscalar, Dynamic translate to RISC, Superpipelined (22 stage), Out-of-Order execution
• On-chip 8KB Data caches, 96KB Instr. Trace cache, 256KB L2 cache

Latency Lags Bandwidth (last ~20 years)
• Performance Milestones
• Processor: ‘286, ‘386, ‘486, Pentium, Pentium Pro, Pentium 4 (21x, 2250x)
• Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x, 1000x)
• Memory Module: 16bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x)
• Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x)
• CPU high, Memory low (“Memory Wall”)

Rule of Thumb for Latency Lagging BW
• In the time that bandwidth doubles, latency improves by no more than a factor of 1.2 to 1.4 (and capacity improves faster than bandwidth)
• Stated alternatively: Bandwidth improves by more than the square of the improvement in Latency

6 Reasons
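As a rough check on the rule of thumb for latency lagging bandwidth, the sketch below takes the (latency improvement, bandwidth improvement) pairs quoted in the Performance Milestones bullets and computes how much latency improves per doubling of bandwidth. The function and variable names are illustrative assumptions, not part of the lecture; the per-doubling figure assumes both quantities improve at a steady exponential rate over the period.

```python
import math

# (latency improvement, bandwidth improvement) pairs as quoted in the
# "Performance Milestones" bullets of the slides.
MILESTONES = {
    "Disk": (8, 143),
    "Memory": (4, 120),
    "Network": (16, 1000),
    "Processor": (21, 2250),
}


def latency_gain_per_bw_doubling(latency_x, bandwidth_x):
    """Latency improvement factor achieved each time bandwidth doubles.

    Assumes steady exponential improvement: if bandwidth doubled d times
    overall (d = log2(bandwidth_x)), latency improved by latency_x ** (1/d)
    per doubling.
    """
    doublings = math.log2(bandwidth_x)
    return latency_x ** (1.0 / doublings)


def bandwidth_exceeds_latency_squared(latency_x, bandwidth_x):
    """Alternative statement: bandwidth gain > (latency gain) squared."""
    return bandwidth_x > latency_x ** 2


if __name__ == "__main__":
    for name, (lat, bw) in MILESTONES.items():
        per_doubling = latency_gain_per_bw_doubling(lat, bw)
        squared_ok = bandwidth_exceeds_latency_squared(lat, bw)
        print(f"{name:10s} latency x{lat:<3d} bandwidth x{bw:<5d} "
              f"latency gain per BW doubling: {per_doubling:.2f}  "
              f"BW > latency^2: {squared_ok}")
```

With the slide figures, all four technologies land between about 1.2 and 1.4 per bandwidth doubling, and in each case the bandwidth gain exceeds the square of the latency gain, which is what the two phrasings of the rule claim.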

