CSE 303: Concepts and Tools for Software Development
Hal Perkins
Spring 2008
Lecture 28: Profiling (gprof)

CSE303 Spring 2008, Lecture 28

Profilers
A profiler monitors and reports (performance) information about a program execution.
They are useful for “debugging correct programs” by learning where programs consume most time and/or space.
“90/10 rule of programs”: roughly 90% of the running time is spent in about 10% of the code (and often worse for new programs) – a profiler helps you “find the 10”.
But: The tool can be misused and misleading.

What profilers tell you
Different profilers profile different things.
gprof, a profiler for code produced by gcc, is widely available and pretty typical:
• Call counts: # of times each function a calls each function b
  – And the simpler fact: # of times a was called
• Time samples: # of times the program was executing a when “the profiler woke up to check where the program was”.
Neither is quite what you want (as we’ll see later), but they’re semi-easy and semi-quick to do:
• Call counts: Add code to every function call to update a table indexed by function pairs.
• Time samples: Use the processor’s timer; wake up and see where the program is.

Using gprof
• Compile with -pg at the right steps:
  – When you create the .o (for call counts)
  – When you create the executable (for time samples)
• Run the program (creates or overwrites gmon.out)
• Run gprof (on the executable and gmon.out) to get human-readable results.
• Read the results (takes a little getting used to).

Getting useful info
• The information depends on your inputs!
  (Always know what you’re profiling)
• Statistical sampling requires a reasonable number of samples
  – Probably want at the very least a few thousand
  – Can run a program over and over and use gprof -s (learn on your own; write a shell script)
• Make sure performance matters
  – Is 10% faster worth uglier or buggier code?
  – Do you have better things to do (documentation, testing, ...)?

Performance tuning
• Never tune until you know the bottleneck (that’s what gprof is for, but it doesn’t tell you how to tune).
• Rarely overtune to some inputs at the expense of others.
• Always focus on the overall algorithm first.
• Think doubly hard about making non-modular changes.
• Focus on low-level tricks only if you really need to (< 5 times in your career?)
• See if compiler flags (e.g., -O) are enough.
Note: Performance tuning a library is harder because you want to do well for “unknown programs and inputs”.

Misleading Fact #1
Cumulative times are based on call estimation. They can be really, really wrong, but usually aren’t.

    int g = 0;
    void c(int i) {
      if(i) return;               /* c(1) returns immediately */
      for(; i < 100000000; ++i)   /* c(0) runs the whole loop */
        ++g;
    }
    void a() { c(0); }            /* all of the real work */
    void b() { c(1); }            /* almost none of it */
    int main(int argc, char **argv) { a(); b(); return 0; }

Conclusion: You must understand what your profiler measures and what it presents to you.
gprof doesn’t lie (if you read the manual).

Misleading Fact #2
Sampling errors (for time samples) can be caused by too few samples, or by periodic sampling.

    void a() { /* takes 0.09 s */ }
    void b() { /* takes 0.01 s */ }
    int main(int argc, char **argv) {
      int i = 0;
      for(; i < 10000; ++i) {
        a();
        b();
      }
      return 0;
    }

This probably doesn’t happen much, and better profilers can use random intervals to avoid it.
Related fact: Measurement code changes timing (an uncertainty principle).

Poor man’s profiling
The time command is more useful because it adds no measurement overhead, but less useful because you get only whole-program numbers.
• real: roughly “wall-clock”
• user: time spent running the code in the program
• system: time the O/S spent doing things on behalf of the program
Not precise for small numbers.
Misleading Fact #3: gprof does not measure system time?
Effects on real time: Machine load, disk access, I/O
Effects on system time: I/O to screen, file, or /dev/null

Compiler Optimization
Compilers must:
• Trade “compile-time” for “code-quality”
• Trade “amount of code” for “specialization of code”
• Make guesses about how code will be used.
You can affect the trade-off via “optimization flags” – definitely easier but less predictable than modifying your code.
gcc is not a great optimizer:
• No promises; it could slow your program down (unlikely, but it can happen)
Bottom line: Remember to “turn optimizations on” if it matters.

Final Words of Wisdom
“Premature optimization is the root of all evil”
Donald Knuth