Pitt CS 0447 - Assessing and Understanding Performance

This preview shows page 1-2-14-15-29-30 out of 30 pages.

Slide titles: CS/COE0447 Computer Organization & Assembly Language; Program Performance; Clock, Clock Cycle Time; Slide 4; Example 1; From Wikipedia; Amdahl’s Law (cont); Example 2; Answer; Why Performance Evaluation?; Defining Performance; Response Time vs. Throughput; Some Definitions; Slide 14; Regarding Time; Clock; Measuring Time; Measuring Time w/ Clocks; Measuring Time w/ Clocks, cont’d; Workload; Benchmarks; Summarizing Performance; Summarizing Performance, cont’d; SPEC Benchmark; Amdahl’s Law (in terms of time); Amdahl’s Law - example; Fallacies and Pitfalls; To Summarize…; To Summarize…, cont’d; Slide 30

Slide 1: CS/COE0447 Computer Organization & Assembly Language
CHAPTER 4
Assessing and Understanding Performance

Slide 2: Program Performance
•Program performance is measured in terms of time!
•Program execution time depends on
–The number of instructions executed to complete a job
–How many clock cycles are needed to execute a single instruction
–The length of the clock cycle (clock cycle time)

Slide 3: Clock, Clock Cycle Time
•Circuits in computers are “clocked”
•At each rising (or falling) clock edge, some specified actions are carried out, usually completing before the next rising (or falling) edge
•Instructions typically require more than one cycle to execute
[Figure: a function block (made of circuits) driven by a clock, with the clock cycle time marked]

Slide 4: Program Performance
•time = (# of clock cycles) × (clock cycle time)
•# of clock cycles = (# of instructions executed) × (average cycles per instruction)
•time = (# of instructions executed) × (average clock cycles per instruction) × (clock cycle time)
•In terms of units:
–time = cycle × s/cycle
–cycle = instruction × cycle/instruction (ave)
–SO: time (s) = instruction × cycle/instruction (ave) × s/cycle

Slide 5: Example 1
•You have a machine with a CPU running at 1 GHz. The same company releases a 2 GHz CPU that is 100% compatible with the existing 1 GHz CPU, and you are considering upgrading. What is the expected performance improvement from doing so? Assume that 40% of a program's instructions are memory-access instructions and that each memory access takes 10 ns on average. All other instructions take exactly one cycle to execute.
•Answer: in class

Slide 6: From Wikipedia
•Amdahl's law, named after computer architect Gene Amdahl, is used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup using multiple processors.

Slide 7: Amdahl’s Law (cont)
•The law is concerned with the speedup achievable from an improvement to a computation that affects a proportion P of that computation, where the improvement has a speedup of S.
•Amdahl's law states that the overall speedup from applying the improvement will be: 1 / ((1-P) + P/S)
•Our example: P = .6 and S = 2
•1 / ((1-.6) + (.6/2)) = 1.43
•This is the maximum speedup possible

Slide 8: Example 2
•If a computer issues 30 network requests per second and each request is on average 64 KB, will a 100 Mbit Ethernet link be sufficient? (printer, accessing files, …)
•KB = 10^3 bytes
•Byte = 8 bits
•Mbit = 10^6 bits
•A 100 Mbit Ethernet: 10^8 bit/s “bitrate”

Slide 9: Answer
•Ethernet: 10^8 bit/s
•KB = kilobyte; kilo = 10^3; byte = 8 bits
•30 request/s × 64 KB/request × (10^3 × 8) bit/KB (the units cancel to leave bit/s)
•30 × 64 × 8 × 10^3 = 3 × 6.4 × 8 × 10^5 < 10^8 (or use a calculator to compute it exactly)
•So, yes, it is sufficient

Slide 10: Why Performance Evaluation?
[Figure: DESIGN, EVALUATION]

Slide 11: Defining Performance
•What do you mean when you say a computer has better performance than another?
•We need a “metric” for comparison
–One metric may not fully characterize a system
•A number of metrics may be relevant
–Important metrics for computer systems
•Response time (a.k.a. execution time)
•Throughput

Slide 12: Response Time vs. Throughput
•Which has higher performance?
–Time to deliver 1 passenger
–Time to deliver 400 passengers
•Time for 1 job is called
–Response time or execution time
•Jobs per day is called
–Throughput or bandwidth

Plane              DC to Paris   Top Speed   Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph     470          286,700
BAC/Sud Concorde   3 hours       1350 mph    132          178,200

Slide 13: Some Definitions
•Throughput is in units of things per second
–Bigger is better
•If we are primarily concerned with response time
–Performance = 1 / execution time
–Bigger is better, which means shorter execution time
•“Machine A is N times faster than B”
–N = performance (A) / performance (B) = execution time (B) / execution time (A)

Slide 14: Response Time vs. Throughput
•Time of Concorde vs. Boeing 747?
–Concorde is (6.5 hours / 3 hours) times faster
–2.2 times faster
•Throughput of Boeing 747 vs. Concorde
–286,700 pmph / 178,200 pmph
–1.6 times higher
•Boeing 747 is 1.6 times (or 60%) higher in terms of throughput
•Concorde is 2.2 times (or 120%) faster in terms of flying time (response time)
•We will focus primarily on execution time for a single job for the remaining discussions

Slide 15: Regarding Time
•Straightforward definition of time
–Total time to complete a task, including disk accesses, memory accesses, other I/O activities, operating system overheads, …
–Terms for this: “real time”, “response time”, “elapsed time”
•Alternative: time spent by the CPU only on your program (since multiple processes may run at the same time)
–“CPU execution time” or “CPU time”
–Often divided into system CPU time (OS) and user CPU time (user program)

Slide 16: Clock

Slide 17: Measuring Time
•In terms of seconds
•CPU time: computers are constructed using digital circuitry driven by a “clock”
–Runs at a constant rate
–Determines when events take place
•Clock cycle time = length of a clock cycle (clock period) = 1 / clock rate
–1 ns if 1 GHz clock
–0.5 ns if 2 GHz clock
–0.25 ns if 4 GHz clock

Slide 18: Measuring Time w/ Clocks
•CPU execution time for a program
–= clock cycles for a program × clock cycle time
–= clock cycles for a program / clock rate

Slide 19: Measuring Time w/ Clocks, cont’d
•Total clock cycles for a program
–= instructions for a program (= instruction count) × average clock cycles per instruction (CPI)
•Time = (# of instr.) × CPI × (clock cycle time)
•Looking at the units:
–s = inst × cycle/inst × s/cycle

Slide 20: Workload
•A set of programs run on a computer is a workload
–Actual collection of applications
–Synthetic programs (for experimentation)
•To evaluate two computer systems, a user would simply compare the execution time of the workload on the two computers

Slide 21: Benchmarks
•A set of applications relevant for performance
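
The formulas on the slides above can be tried out in a few lines of Python. This is a minimal sketch rather than part of the original slides: the function names (`cpu_time`, `amdahl_speedup`, `required_bitrate`, `times_faster`) are hypothetical names chosen for illustration, and the instruction count and CPI passed to `cpu_time` are made-up inputs. The other calls reuse the numbers from the Amdahl's law slide, Example 2, and the Concorde vs. Boeing 747 comparison.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time (s) = instruction count x CPI x clock cycle time,
    where clock cycle time = 1 / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def amdahl_speedup(p, s):
    """Overall speedup when a proportion p of the computation
    is sped up by a factor s: 1 / ((1 - p) + p / s)."""
    return 1.0 / ((1.0 - p) + p / s)


def required_bitrate(requests_per_s, kb_per_request):
    """Bit/s needed, using KB = 10^3 bytes and byte = 8 bits."""
    return requests_per_s * kb_per_request * 10**3 * 8


def times_faster(execution_time_b, execution_time_a):
    """'A is N times faster than B': N = time(B) / time(A)."""
    return execution_time_b / execution_time_a


if __name__ == "__main__":
    # Hypothetical program: 10^9 instructions, CPI of 1, 1 GHz clock.
    print(cpu_time(10**9, 1.0, 10**9))            # 1.0 (seconds)

    # Amdahl's law slide: P = .6, S = 2.
    print(round(amdahl_speedup(0.6, 2), 2))       # 1.43

    # Example 2: 30 requests/s at 64 KB each vs. a 10^8 bit/s link.
    needed = required_bitrate(30, 64)
    print(needed, needed < 10**8)                 # 15360000 True

    # Concorde (3 hours) vs. Boeing 747 (6.5 hours), response time.
    print(round(times_faster(6.5, 3), 1))         # 2.2
```

Keeping the clock rate, rather than the cycle time, as the parameter of `cpu_time` matches the "clock cycles for a program / clock rate" form of the equation and avoids working with very small floating-point cycle times.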

