CMU CS 15213 - Performance Evaluation I - D3050685

Home> Schools> Carnegie Mellon University> Computer Science (CS) > CS 15213> Performance Evaluation I

DOC PREVIEW

CMU CS 15213 - Performance Evaluation I

School name Carnegie Mellon University

Course Cs 15213- Introduction to Computer Systems

Pages 27

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 27 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

15 213 Performance Evaluation I November 5 1998 Topics Performance measures metrics Timing Profiling class22 ppt Performance expressed as a time Absolute time measures metrics difference between start and finish of an operation synonyms running time elapsed time response time latency completion time execution time most straightforward performance measure Relative normalized time measures running time normalized to some reference time e g time reference time Guiding principle Choose performance measures that track running time class22 ppt 2 CS 213 F 98 Performance expressed as a rate Rates are performance measures expressed in units of work per unit time Examples millions of instructions sec MIPS millions of fp instructions sec Mflops sec Mflop 10 6 flops millions of bytes sec MBytes sec MByte 2 20 bytes millions of bits sec Mbits sec Mbit 10 6 bits images sec samples sec transactions sec TPS class22 ppt 3 CS 213 F 98 Performance expressed as a rate cont Key idea Report rates that track execution time Example Suppose we are measuring a program that convolves a stream of images from a video camera Bad performance measure MFLOPS number of floating point operations depends on the particular convolution algorithm n 2 matix vector product vs nlogn fast Fourier transform An FFT with a bad MFLOPS rate may run faster than a matrix vector product with a good MFLOPS rate Good performance measure images sec a program that runs faster will convolve more images per second class22 ppt 4 CS 213 F 98 Timing mechanisms Clocks returns elapsed time since epoch e g Jan 1 1970 Unix getclock command coarse grained e g us resolution on Alpha long int secs ns struct timespec start stop getclock TIMEOFDAY start P getclock TIMEOFDAY stop secs stop tv sec start tv sec ns stop tv nsec start tv nsec printf ld ns n secs 1e9 ns class22 ppt 5 CS 213 F 98 Timing mechanisms cont Interval count down timers set timer to some initial value timer counts down to zero then sends Unix signal course grained e g us resolution on Alphas void init etime first it value tv sec 86400 setitimer ITIMER VIRTUAL first NULL Using the interval timer class22 ppt double get etime struct itimerval curr getitimer ITIMER VIRTUAL curr return double first it value tv sec curr it value tv sec first it value tv usec curr it value tv usec 1e 6 init etime secs get etime P secs get etime secs printf lf secs n secs 6 CS 213 F 98 Timing mechanisms cont Performance counters counts system events CYCLES IMISS DMISS BRANCHMP very fine grained short time span e g 9 seconds on 450 MHz Alpha unsigned int counterRoutine Alpha cycle counter 0x601fc000u 0x401f0000u 0x6bfa8001u unsigned int counter void void counterRoutine Using the Alpha cycle counter class22 ppt cycles counter P cycles counter cycles printf d cycles n cycles 7 CS 213 F 98 Measurement Discretization errors pitfalls need to measure large enough chunks of work but how large is large enough Unexpected cache effects artificial hits or misses cold start misses due to context swapping class22 ppt 8 CS 213 F 98 The nature of time real wall clock time user time time executing instructing instructions in the user process system time time executing instructing instructions in kernel on behalf of user process real wall clock time We will use the word time to refer to user time class22 ppt 9 CS 213 F 98 Anatomy of a timer Tstart time T1 Tfinish program execution time dt T2 Tn Tk clock interrupt tick interval 2 timer period dt secs tick timer resolution 1 dt ticks sec Assume here that Tk Tk 1 dt class22 ppt 10 CS 213 F 98 Measurement pitfall 1 Discretization error actual program execution time Tstart time T1 dt T2 Tfinish Tn measured time Tn T1 actual time Tn T1 Tfinish Tn Tstart T1 absolute error measured time actual time fstart Tstart T1 dt fraction of interval overreported ffinish Tfinish Tn dt fraction of interval underreported absolute error dt fstart dt ffinish dt fstart ffinish maxclass22 ppt absolute error dt 11 CS 213 F 98 Examples of discretization error actual running time time Actual time near zero measured time dt Absolute measurement error dt class22 ppt 12 CS 213 F 98 Examples of discretization error cont actual running time time Actual time near 2dt measured time dt Absolute measurement error dt class22 ppt 13 CS 213 F 98 Estimating the timer period dt start 0 while start end get etime dt end start printf dt lf n dt Digital Unix Alpha systems dt 1ms class22 ppt 14 CS 213 F 98 Modeling discretization error Key idea need to measure long enough to hide the discretization error Example start get etime for i 0 i n i P tprime get etime start Question how big must tprime be in order to get a good estimate of the running time of the loop class22 ppt 15 CS 213 F 98 Relative error analysis Let t and t be the actual and measured running times of the loop respectively and let dt be the timer period Also let t t be the absolute error and let t t t be the relative error Problem What value of t will result in a relative error less than or equal to Emax Fact 1 t t dt Fact 2 t dt t We want t t t Emax dt t Emax dt Emax t dt Emax t dt dt class22 ppt Emax dt t 1 algebra 2 16 CS 213 F 98 Relative error analysis start get etime for i 0 i n i P tp get etime start t Example Let dt 1 ms and Emax 0 05 i e 5 relative error Then dt Emax dt t 001 05 05 t t 0 070 seconds 70 ms class22 ppt 17 CS 213 F 98 Measurement pitfall 2 Unexpected cache effects Call ordering can introduce unexpected cold start misses measured with Alpha cycle counter ip array1 array2 array3 68 829 cycles ipp array1 array2 array3 23 337 cycles ipp array1 array2 array3 70 513 cycles ip array1 array2 array3 23 203 cycles Context switches can alter cache miss rate 71 002 67 968 68 840 68 571 69 911 class22 ppt 23 617 23 384 23 365 23 492 23 692 ip ipp cycles on unloaded timing server 18 CS 213 F 98 Measurement summary It s difficult to get accurate times discretization error but can t always measure short procedures in loops global state mallocs changes cache behavior It s difficult to get repeatable times cache effects due to ordering and context switches Moral of the story Adopt a healthy skepticism about measurements Always subject measurements to sanity checks class22 ppt 19 CS 213 F 98 Profiling The goal of profiling is to account for the cycles used by a program or system Basic techniques src translation gprof Graham 1982 binary translation Atom DEC 1993 pixie MIPS 1990 direct simulation SimOS Rosenblum 1995 statistical sampling prof

View Full Document

CMU CS 15213 - Performance Evaluation I

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-25-26-27 out of 27 pages.

CMU CS 15213 - Performance Evaluation I

Sign up for free to view:

Please select your school