Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos226Analysis of Algorithms2Running TimeCharles Babbage (1864)As soon as an Analytic Engine exists, it will necessarilyguide the future course of the science. Whenever anyresult is sought by its aid, the question will arise - By whatcourse of calculation can these results be arrived at by themachine in the shortest time? - Charles BabbageAnalytic Engine (schematic)3OverviewAnalysis of algorithms: framework for comparing algorithms andpredicting performance.Scientific method.! Observe some feature of the universe.! Hypothesize a model that is consistent with observation.! Predict events using the hypothesis.! Verify the predictions by making further observations.! Validate the theory by repeating the previous steps until thehypothesis agrees with the observations.Universe = computer itself.4Case Study: SortingSorting problem:! Given N items, rearrange them in ascending order.! Applications: statistics, databases, data compression, computationalbiology, computer graphics, scientific computing, ...HanleynameHaskellHauserHayesHongHornetHsuHausernameHongHsuHayesHaskellHanleyHornet5Insertion sort.! Brute-force sorting solution.! Move left-to-right through array.! Exchange next element with larger elements to its left, one-by-one.Insertion Sortpublic static void insertionSort(double[] a) { int N = a.length; for (int i = 0; i < N; i++) { for (int j = i; j > 0; j--) { if (less(a[j], a[j-1])) exch(a, j, j-1); else break; } }}6Insertion Sort: ObservationObserve and tabulate running time for various values of N.! Data source: N random numbers between 0 and 1.400 million40,00099 million20,00025 million10,0006.2 million5,000ComparisonsN16 million80,0007Data analysis. Plot # comparisons vs. input size on log-log scale.Regression. Fit line through data points ! a Nb.Hypothesis. # comparisons grows quadratically with input size ! N2/4.Insertion Sort: Experimental Hypothesisslope8Insertion Sort: Prediction and VerificationExperimental hypothesis. # comparisons ! N2/4.Prediction. 400 million comparisons for N = 40,000.Observations.Prediction. 10 billion comparisons for N = 200,000.Observation.9.997 billion200,000ComparisonsN399.7 million40,000401.6 million40,000400.0 million40,000ComparisonsN401.3 million40,000Agrees.Agrees.9Insertion Sort: Theoretical HypothesisExperimental hypothesis.! Measure running times, plot, and fit curve.! Model useful for predicting, but not for explaining.Theoretical hypothesis.! Analyze algorithm to estimate # comparisons as a function of:– number of elements N to sort– average or worst case input! Model useful for predicting and explaining.! Model is independent of a particular machine or compiler.Difference. Theoretical model can apply to machines not yet built.10Insertion Sort: Theoretical HypothesisWorst case. (descending)! Iteration i requires i comparisons.! Total = 0 + 1 + 2 + . . . + N-2 + N-1 = N (N-1) / 2.Average case. (random)! Iteration i requires i/2 comparisons on average.! Total = 0 + 1/2 + 2/2 + . . . + (N-1)/2 = N (N-1) / 4.E F G H I J D C B AA C D F H J E B I Gii11Insertion Sort: Theoretical HypothesisTheoretical hypothesis.Validation. Theory agrees with observations.RandomDescendingAscendingInputAverageWorstBestAnalysisN2 / 4N2 / 2NComparisons1/6 N3/2--Stddev12Insertion Sort: ObservationObserve and tabulate running time for various values of N.! Data source: N random numbers between 0 and 1.! Machine: Apple G5 1.8GHz with 1.5GB memory running OS X.5.6 seconds400 million40,0001.5 seconds99 million20,0000.43 seconds25 million10,0000.13 seconds6.2 million5,00023 secondsTimeComparisonsN16 million80,000145 seconds10 billion200,00013Data analysis. Plot time vs. input size on log-log scale.Regression. Fit line through data points ! a Nb.Hypothesis. Running time grows quadratically with input size.Insertion Sort: Experimental Hypothesis14Timing in JavaWall clock. Measure time between beginning and end of computation.! Manual: Skagen wristwatch.! Automatic: Stopwatch.java library.Stopwatch.tic();. . .double elapsed = StopWatch.toc();public class Stopwatch { private static long start; public static void tic() { start = System.currentTimeMillis(); } public static double toc() { long stop = System.currentTimeMillis(); return (stop - start) / 1000.0; }}15Measuring Running TimeFactors that affect running time.! Machine.! Compiler.! Algorithm.! Input data.More factors.! Caching.! Garbage collection.! Just-in-time compilation.! CPU used by other processes.Bottom line. Often hard to get precise measurements.16SummaryAnalysis of algorithms: framework for comparing algorithms andpredicting performance.Scientific method.! Observe some feature of the universe.! Hypothesize a model that is consistent with observation.! Predict events using the hypothesis.! Verify the predictions by making further observations.! Validate the theory by repeating the previous steps until thehypothesis agrees with the observations.Remaining question. How to formulate a hypothesis?Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos226How To Formulate a Hypothesis18Types of HypothesesWorst case running time. Obtain bound on largest possible running timeof algorithm on input of a given size N.! Generally captures efficiency in practice.! Draconian view, but hard to find effective alternative.Average case running time. Obtain bound on running time of algorithmon random input as a function of input size N.! Hard to accurately model real instances by random distributions.! Algorithm tuned for a certain distribution may perform poorly onother inputs.Amortized running time. Obtain bound on running time of sequence ofN operations as a function of the number of operations.19Estimating the Running TimeTotal running time: sum of cost " frequency for all of the basic ops.! Cost depends on machine, compiler.! Frequency depends on algorithm, input.Cost for sorting.! A = # exchanges.! B = # comparisons.! Cost on a typical machine = 11A + 4B.Frequency of sorting ops.! N = # elements to sort.! Selection sort: A = N-1, B = N(N-1)/2.Donald Knuth1974 Turing Award20An easier alternative.(i) Analyze asymptotic growth as a function of input size N.(ii) For medium N, run and measure time.(iii) For large N, use (i) and (ii) to predict time.Asymptotic growth rates.! Estimate as a function of input
View Full Document