DOC PREVIEW
Stanford CS 106B - Sorting and Efficiency

This preview shows page 1 out of 4 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Eric Roberts Handout #21CS 106B April 20, 2009Sorting and EfficiencySorting• Of all the algorithmic problems that computer scientists havestudied, the one with the broadest practical impact is certainlythe sorting problem, which is the problem of arranging theelements of an array or a vector in order.• The sorting problem comes up, for example, in alphabetizinga telephone directory, arranging library records by cataloguenumber, and organizing a bulk mailing by ZIP code.• There are many algorithms that one can use to sort an array.Because these algorithms vary enormously in their efficiency,it is critical to choose a good algorithm, particularly if theapplication needs to work with large arrays.The Selection Sort Algorithm• Of the many sorting algorithms, the easiest one to describe isselection sort, which appears in the text like this:void Sort(Vector<int> vec) { int n = vec.size(); for (int lh = 0; lh < n; lh++) { int rh = lh; for (int i = lh + 1; i < n; i++) { if (vec[i] < vec[rh]) rh = i; } int temp = vec[lh]; vec[lh] = vec[rh]; vec[rh] = temp; }}• Coding this algorithm as a single function makes sense forefficiency but complicates the analysis. The next two slidesdecompose selection sort into a set of functions that make theoperation easier to follow./* * Function: Sort * -------------- * Sorts a Vector<int> into increasing order. This implementation * uses an algorithm called selection sort, which can be described * in English as follows. With your left hand (lh), point at each * element in the vector in turn, starting at index 0. At each * step in the cycle: * * 1. Find the smallest element in the range between your left * hand and the end of the vector, and point at that element * with your right hand (rh). * * 2. Move that element into its correct position by swapping * the elements indicated by your left and right hands. */void Sort(Vector<int> & vec) { for ( int lh = 0 ; lh < vec.size() ; lh++ ) { int rh = FindSmallest(vec, lh, vec.size() - 1); Swap(vec[lh], vec[rh]); }}Decomposition of the Sort Function/* * Function: FindSmallest * ---------------------- * Returns the index of the smallest value in the vector between * index positions p1 and p2, inclusive. */int FindSmallest(Vector<int> & vec, int p1, int p2) { int smallestIndex = p1; for ( int i = p1 + 1 ; i <= p2 ; i++ ) { if (vec[i] < vec[smallestIndex]) smallestIndex = i; } return smallestIndex;}/* * Function: Swap * -------------- * Exchanges two integer values passed by reference. */void Swap(int & x, int & y) { int temp = x; x = y; y = temp;}Decomposition of the Sort FunctionSimulating Selection Sort0123456789809 503 946 367 987 838 259 236 659 361int main() { Vector<int> vec = CreateTestVector(); Sort(vec); return 0;}vecvoid Sort(Vector<int> & vec) { for ( int lh = 0 ; lh < vec.size() ; lh++ ) { int rh = FindSmallest(vec, lh, vec.size() - 1); Swap(vec[lh], vec[rh]); }}rh veclhEfficiency of Selection Sort• The primary question for today is how one might evaluate theefficiency of an algorithm such as selection sort.• One strategy is to measure the actual time it takes to run forarrays of different sizes. In C++, you can measure elapsedtime by calling the time function, which returns the currenttime in milliseconds. Using this strategy, however, requiressome care:–The time function is often too rough for accurate measurement.It therefore makes sense to measure several runs together andthen divide the total time by the number of repetitions.– Most algorithms show some variability depending on the data.To avoid distortion, you should run several independent trialswith different data and average the results.– Some measurements are likely to be wildly off because thecomputer needs to run some background task. Such data pointsmust be discarded as you work through the analysis.– 2 –Measuring Sort Timings.0021 .0025 .0022 .0026 .0020 .0030 .0022 .0023 .0022 .0025.006 .007 .008 .007 .007 .011 .007 .007 .007 .007.014 .014 .014 .015 .014 .014 .014 .014 .014 .014.028 .024 .025 .026 .023 .025 .025 .026 .025 .027.039 .037 .036 .041 .042 .039 .140 .039 .034 .038.187 .152 .168 .176 .146 .146 .165 .146 .178 .1543.94 3.63 4.06 3.76 4.11 3.51 3.48 3.64 3.31 3.4513.40 12.90 13.80 17.60 12.90 14.10 12.70 81.60 16.00 15.50322.5 355.9 391.7 321.6 388.3 321.3 321.3 398.7 322.1 321.31319. 1388. 1327. 1318. 1331. 1336. 1318. 1335. 1325. 1319.N = 10203040501005001000500010000Trial 1 Trial 2 Trial 3 Trial 4 Trial 5 Trial 6 Trial 7 Trial 8 Trial 9 Trial 10.0024.007.014.025.049.1623.6921.05346.41332.μ.00029.00139.00013.0014.0323.01510.27221.3333.8320.96.011.14081.601388..03914.321326..00036.00251.697.50The following table shows the average timing of the selectionsort algorithm after removing outlying trials that differ by morethan two standard deviations from the mean. The column labeledμ (the Greek letter mu, which is the standard statistical symbolfor the mean) is a reasonably good estimate of running time.Selection Sort Running Times• Many algorithms that operate on vectors have running timesthat are proportional to the size of the array. If you multiplythe number of values by ten, you would expect thosealgorithms to take ten times as long.10 100 1000 10000 .00240.16214.321332.N time• As the running times on the preceding slidemake clear, the situation for selection sort isvery different. The table on the right showsthe average running time when selection sort isapplied to 10, 100, 1000, and 10000 values.• As a rough approximation—particularly as you work withlarger values of N—it appears that every ten-fold increase inthe size of the array means that selection sort takes about 100times as long.Counting Operations• Another way to estimate the running time is to count howmany operations are required to sort an array of size N.• In the selection sort implementation, the section of code thatis executed most frequently (and therefore contributes themost to the running time) is the body of the FindSmallestmethod. The number of operations involved in each call toFindSmallest changes as the algorithm proceeds:N values are considered on the first call to FindSmallest.N - 1 values are considered on the second call.N - 2 values are considered on the third call, and so on.• In mathematical notation, the number of values considered inFindSmallest can be expressed as a summation, which


View Full Document

Stanford CS 106B - Sorting and Efficiency

Download Sorting and Efficiency
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Sorting and Efficiency and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Sorting and Efficiency 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?