MIT 6.826 - Performance

6.826—Principles of Computer Systems, 2006. Handout 10. Performance

10. Performance

Overview

This is not a course about performance analysis or about writing efficient programs, although it often touches on these topics. Both are much too large to be covered, even superficially, in a single lecture devoted to performance. There are many books on performance analysis[1] and a few on efficient programs[2]. Our goal in this handout is more modest: to explain how to take a system apart and understand its performance well enough for most practical purposes. The analysis is necessarily rather rough and ready, but nearly always a rough analysis is adequate; often it's the best you can do, and certainly it's much better than what you usually see, which is no analysis at all. Note that performance analysis is not the same as performance measurement, which is more common.

What is performance? The critical measures are bandwidth and latency. We neglect other aspects that are sometimes important: availability (discussed later when we deal with replication), connectivity (discussed later when we deal with switched networks), and storage capacity.

When should you work on performance? When it's needed. Time spent speeding up parts of a program that are fast enough is time wasted, at least from any practical point of view. Also, the march of technology, also known as Moore's law, means that 18 months from March 2006 a computer will cost the same but be twice as fast[3] and have twice as much RAM and four times as much disk storage; in five years it will be ten times as fast and have 100 times as much disk storage. So it doesn't help to make your system twice as fast if it takes two years to do it; it's better to just wait. Of course it still might pay if you get the improvement on new machines as well, or if a 4x speedup is needed.

How can you get performance? There are techniques for making things faster: better algorithms, fast paths for common cases, and concurrency.
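The fast-path idea can be sketched concretely. The following is an illustrative example, not from the handout: a cache consulted before an expensive computation, so that the common case (a repeated request) takes one cheap lookup while only the rare case pays full price.

```python
# Illustrative fast-path sketch (not from the handout): handle the common
# case with a cheap cache lookup, and fall back to the slow path only on
# a miss. The function names and workload are made up for illustration.

cache = {}

def slow_square(n):
    # Stand-in for an expensive computation (deliberately naive).
    total = 0
    for _ in range(n):
        total += n
    return total

def square(n):
    # Fast path: one dictionary probe covers the common case.
    if n in cache:
        return cache[n]
    # Slow path: compute and remember the result for next time.
    result = slow_square(n)
    cache[n] = result
    return result

print(square(1000))  # first call takes the slow path
print(square(1000))  # second call takes the fast path
```

The point is the shape, not the cache itself: measure which case is common, make that case as short as possible, and let the rest be slow.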
And there is a methodology for figuring out where the time is going: analyze and measure the system to find the bottlenecks and the critical parameters that determine its performance, and keep doing so both as you improve it and when it's in service. As a rule, a rough back-of-the-envelope analysis is all you need. Putting in a lot of detail will be a lot of work, take a lot of time, and obscure the important points.

[1] Try R. Jain, The Art of Computer Systems Performance Analysis, Wiley, 1991, 720 pp.
[2] The best one I know is J. Bentley, Writing Efficient Programs, Prentice-Hall, 1982, 170 pp.
[3] A new phenomenon as of 2006 is that the extra speed is likely to come mostly in the form of concurrency, that is, several processors on the chip, rather than a single processor that is twice as fast. This is because the improvements in internal processor architecture that have made it possible to use internal concurrency to speed up a processor that still behaves as though it is executing instructions sequentially are nearly played out.

What is performance: bandwidth and latency

Bandwidth and latency are usually the important metrics. Bandwidth tells you how much work gets done per second (or per year), and latency tells you how long something takes from start to finish: to send a message, process a transaction, or referee a paper. In some contexts it's customary to call these things by different names: throughput and response time, or capacity and delay. The ideas are exactly the same.

Here are some examples of communication bandwidth and latency on a single link. Note that all the numbers are in bytes/sec; it's traditional to quote bandwidths for some interconnects in bits/sec, so be wary of numbers you read.
Medium           Link                 Bandwidth   Latency   Width
Pentium 4 chip   on-chip bus          30 GB/s     .4 ns     64
PC board         Rambus bus           1.6 GB/s    75 ns     16
                 PCI I/O bus          533 MB/s    200 ns    32
Wires            Serial ATA (SATA)    300 MB/s    200 ns    1
                 SCSI                 40 MB/s     500 ns    32
LAN              Gigabit Ethernet     125 MB/s    100+ µs   1
                 Fast Ethernet        12.5 MB/s   100+ µs   1
                 Ethernet             1.25 MB/s   100+ µs   1

Here are examples of communication bandwidth and latency through a switch that interconnects multiple links.

Medium           Switch               Bandwidth   Latency   Links
Pentium 4 chip   register file        180 GB/s    .4 ns     6
Wires            Cray T3E             122 GB/s    1 µs      2K
LAN              Ethernet switch      4 GB/s      4–100 µs  32
Copper pair      Central office       80 MB/s     125 µs    50K

Finally, here are some examples of other kinds of work, different from simple communication.

Medium                              Bandwidth      Latency
Disk                                40 MB/s        10 ms
RPC on Giganet with VIA             30 calls/ms    30 µs
RPC                                 3 calls/ms     1 ms
Airline reservation transactions    10000 trans/s  1 sec
Published papers                    20 papers/yr   2 years

Specs for performance

How can we put performance into our specs? In other words, how can we specify the amount of real time or other resources that an operation consumes? For resources like disk space that are controlled by the system, it's quite easy. Add a variable spaceInUse that records the amount of disk space in use, and to specify that an operation consumes no more than max space, write

  << VAR used: Space | used <= max => spaceInUse := spaceInUse + used >>

This is usually what you want, rather than saying exactly how much space is consumed, which would restrict the code too much.

Doing the same thing for real time is a bit trickier, since we don't usually think of the advance of real time as being under the control of the system. The spec, however, has to put a limit on how much time can pass before an operation is complete. Suppose we have a procedure P. We can specify TimedP that takes no more than maxPLatency to complete as follows.
The variable now records the current time, and deadlines records a set of latest completion times for operations in progress. The thread Clock advances now, but not past a deadline. An operation like TimedP sets a deadline before it starts to run and clears it when it is done.

  VAR now       : Time
      deadlines : SET Time

  THREAD Clock() = DO now < deadlines.min => now + := 1 [] SKIP OD

  PROC TimedP() = VAR t : Time |
    << now < t /\ t < now + maxPLatency /\ ~ t IN deadlines =>
       deadlines := deadlines + {t} >>;
    P();
    << deadlines := deadlines - {t}; RET >>

This may seem like an odd way of doing things, but it does allow exactly the
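The spec above is written in the course's Spec language. As a rough executable sketch (my own illustration, not from the handout), the same deadline discipline can be simulated in Python: the clock is free to tick as long as it would not pass the earliest outstanding deadline, and a timed operation registers a deadline no more than maxPLatency ahead before running.

```python
# Rough Python simulation (an illustration, not from the handout) of the
# timed-operation spec: a clock that never advances past an outstanding
# deadline, and an operation that sets a deadline before it runs.

now = 0
deadlines = set()
MAX_P_LATENCY = 5  # illustrative bound, playing the role of maxPLatency

def clock_step():
    """Advance now by one tick, but never past the earliest deadline."""
    global now
    if not deadlines or now < min(deadlines):
        now += 1
        return True
    return False  # blocked: some operation must complete first

def timed_p(body):
    """Run body; a deadline within MAX_P_LATENCY bounds its completion."""
    t = now + MAX_P_LATENCY   # choose a deadline within the allowed bound
    deadlines.add(t)          # corresponds to deadlines := deadlines + {t}
    body()                    # the actual work of P
    deadlines.discard(t)      # corresponds to deadlines := deadlines - {t}

timed_p(lambda: None)         # run a trivial P; its deadline is cleared
for _ in range(3):
    clock_step()              # with no deadlines pending, the clock is free
print(now)
```

Note the inversion the spec relies on: rather than measuring how long P takes, the model simply forbids time from passing a deadline, so any behavior of the spec automatically satisfies the latency bound.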

