MIT 6 826 - Lecture Notes - D2811890

Home> Schools> Massachusetts Institute of Technology> Electrical Engineering and Computer Science (6) > 6 826> Lecture Notes

MIT 6 826 - Lecture Notes

School name Massachusetts Institute of Technology

Course 6 826- Principles of Computer Systems

Pages 15

Download Save

Unformatted text preview:

6.826—Principles of Computer Systems 2002 Handout 10. Performance 1 10. Performance Overview This is not a course about performance analysis or about writing efficient programs, although it often touches on these topics. Both are much too large to be covered, even superficially, in a single lecture devoted to performance. There are many books on performance analysis1 and a few on efficient programs2. Our goal in this handout is more modest: to explain how to take a system apart and understand its performance well enough for most practical purposes. The analysis is necessarily rather rough and ready, but nearly always a rough analysis is adequate, often it’s the best you can do, and certainly it’s much better than what you usually see, which is no analysis at all. Note that performance analysis is not the same as performance measurement, which is more common. What is performance? The critical measures are bandwidth and latency. We neglect other aspects that are sometimes important: availability (discussed later when we deal with replication), connectivity (discussed later when we deal with switched networks), and storage capacity When should you work on performance? When it’s needed. Time spent speeding up parts of a program that are fast enough is time wasted, at least from any practical point of view. Also, the march of technology, also known as Moore’s law, means that in 18 months from March 2002 a computer will cost the same but be twice as fast and have twice as much RAM and four times as much disk storage; in five years it will be ten times as fast and have 100 times as much disk storage. So it doesn’t help to make your system twice as fast if it takes two years to do it; it’s better to just wait. Of course it still might pay if you get the improvement on new machines as well, and if a 4 x speedup is needed. How can you get performance? There are techniques for making things faster: better algorithms, fast paths for common cases, and concurrency. And there is methodology for figuring out where the time is going: analyze and measure the system to find the bottlenecks and the critical parameters that determine its performance, and keep doing so both as you improve it and when it’s in service. As a rule, a rough back-of-the-envelope analysis is all you need. Putting in a lot of detail will be a lot of work, take a lot of time, and obscure the important points. What is performance: bandwidth and latency Bandwidth and latency are usually the important metrics. Bandwidth tells you how much work gets done per second (or per year), and latency tells you how long something takes from start to finish: to send a message, process a transaction, or referee a paper. In some contexts it’s customary to call these things by different names: throughput and response time, or capacity and delay. The ideas are exactly the same. 1 Try R. Jain, The Art of Computer Systems Performance Analysis, Wiley, 1991, 720 pp. 2 The best one I know is J. Bentley, Writing Efficient Programs, Prentice-Hall, 1982, 170 pp. 6.826—Principles of Computer Systems 2002 Handout 10. Performance 2 Here are some examples of communication bandwidth and latency on a single link. Medium Link Bandwidth Latency Width Alpha EV7 chip on-chip bus 10 GB/s .8 ns 64 PC board PCI I/O bus 266 MB/s 250 ns 32 Wires Fibrechannel 125 MB/s 200 ns 1 SCSI 40 MB/s 500 ns 32 LAN Gigabit Ethernet 125 MB/s 100 + µs 1 Fast Ethernet 12.5 MB/s 100 + µs 1 Ethernet 1.25 MB/s 100 + µs 1 Here are examples of communication bandwidth and latency through a switch that interconnects multiple links. Medium Switch Bandwidth Latency Links Alpha chip register file 60 GB/s .8 ns 6 Wires Cray T3E 122 GB/s 1 µs 2K LAN ATM switch 10 GB/s 10 µs 52 Ethernet switch 40 MB/s 100–1200 µs 32 Copper pair Central office 80 MB/s 125 µs 50K Finally, here are some examples of other kinds of work, different from simple communication. Medium Bandwidth Latency Disk 40 MB/s 10 ms RPC on Giganet with VIA 30 calls/ms 30 µs RPC 3 calls/ms 1 ms Airline reservation transactions 10000 trans/s 1 sec Published papers 20 papers/yr 2 years Specs for performance How can we put performance into our specs? In other words, how can we specify the amount of real time or other resources that an operation consumes? For resources like disk space that are controlled by the system, it’s quite easy. Add a variable spaceInUse that records the amount of disk space in use, and to specify that an operation consumes no more than max space, write << VAR used: Space | used <= max => spaceInUse := spaceInUse + used >> This is usually what you want, rather than saying exactly how much space is consumed, which would restrict the code too much. Doing the same thing for real time is a bit trickier, since we don’t usually think of the advance of real time as being under the control of the system. The spec, however, has to put a limit on how much time can pass before an operation is complete. Suppose we have a procedure P. We can specify TimedP that takes no more than maxPLatency to complete as follows. The variable now6.826—Principles of Computer Systems 2002 Handout 10. Performance 3 records the current time, and deadlines records a set of latest completion times for operations in progress. The thread Clock advances now, but not past a deadline. An operation like TimedP sets a deadline before it starts to run and clears it when it is done. VAR now : Time deadlines: SET Time THREAD Clock() = DO now < deadlines.min => now + := 1 [] SKIP OD PROC TimedP() = VAR t : Time << now < t /\ t < now + maxPLatency /\ ~ t IN deadlines => deadlines := deadlines + {t} >>; P(); << deadlines := deadlines - {t}; RET >> This may seem like an odd way of doing things, but it does allow exactly the sequences of transitions that we want. The alternative is to construct P so that it completes within maxPLatency, but there’s no straightforward way to do this. Often we would like to write a probabilistic performance spec; for example, service time is drawn from a normal distribution with given mean and variance. There’s no way to do this directly in Spec, because the underlying model of non-deterministic state machines has no notion of probability. What we can do is to keep track of actual service times and declare a failure if they get too far from the desired form. Then you can interpret the spec to say:

View Full Document


School:
Email:
New Password:
Confirm Password:

MIT 6 826 - Lecture Notes

Sign up for free to view:

Please select your school