
Lecture 7: Quicksort

Frank Pfenning

15-122: Principles of Imperative Computation, Spring 2024

In this lecture we consider two related algorithms for sorting that achieve a much better running time than the selection sort from an earlier lecture: mergesort and quicksort. We develop quicksort and its invariants in detail. As usual, contracts and loop invariants will bridge the gap between the abstract idea of the algorithm and its implementation.

Additional Resources
• Review slides (https://cs.cmu.edu/~15122/handouts/slides/review/07-quicksort.pdf)
• Code for this lecture (https://cs.cmu.edu/~15122/handouts/code/07-quicksort.tgz)

We will revisit many of the computational thinking, algorithm, and programming concepts from the previous lectures. We highlight the following important ones:

Computational Thinking: We revisit the divide-and-conquer technique from the lecture on binary search. We will also see the importance of randomness for the first time.

Algorithms and Data Structures: We examine mergesort and quicksort, both of which use divide-and-conquer, but with different overall strategies. Quicksort is in-place as it does not require allocating temporary memory, but mergesort is not.

Programming: We have occasionally seen recursion in specification functions. In both mergesort and quicksort it will be a central computational technique.

Both mergesort and quicksort are examples of divide-and-conquer. We divide a problem into simpler subproblems that can be solved independently and then combine the solutions. As we have seen for binary search, the ideal divide step breaks a problem into two of roughly equal size, because it means we need to divide only logarithmically many times before we have a basic problem, presumably with an immediate answer. Mergesort achieves this; quicksort does not quite, which presents an interesting trade-off when considering which algorithm to choose for a particular class of applications.

Recall linear search for an element in an array, which has asymptotic complexity O(n). The divide-and-conquer technique of binary search divides the array in half, determines which half our element would have to be in, and then proceeds with only that subarray. An interesting twist here is that we divide, but then we need to conquer only a single new subproblem. So if the length of the array is 2^k and we divide it by two on each step, we need at most k iterations. Since there is only a constant number of operations on each iteration, the overall complexity is O(log n). As a side remark, if we divided the array into 3 equal sections, the complexity would remain O(log n) because 3^k = (2^(log_2 3))^k = 2^((log_2 3)*k), so log_2 n and log_3 n differ only by a constant factor, namely log_2 3.

1 The Quicksort Algorithm

Quicksort uses the technique of divide-and-conquer in a different manner. We proceed as follows:

1. Pick an arbitrary element of the array (the pivot).
2. Divide the array into two segments, the elements that are smaller than the pivot and the elements that are greater, with the pivot in between (the partition phase).
3. Recursively sort the segments to the left and right of the pivot.

In quicksort, dividing the problem into subproblems will take linear time, but putting the results back together is immediate. This kind of trade-off is frequent in algorithm design.
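To make the three steps concrete, here is a minimal sketch in plain C. The course itself develops this algorithm in C0 with contracts and loop invariants; the function names (quicksort, partition, swap), the first-element pivot, and the particular partitioning loop below are illustrative assumptions, not the lecture's code.

#include <stdio.h>

/* Swap A[i] and A[j]. */
void swap(int A[], int i, int j) {
  int tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}

/* Partition A[lo..hi) around the pivot A[lo].  Returns the index where the
 * pivot ends up: smaller elements are to its left, the rest to its right. */
int partition(int A[], int lo, int hi) {
  int pivot = A[lo];
  int left = lo + 1;   /* A[lo+1..left) holds elements < pivot  */
  int right = hi;      /* A[right..hi)  holds elements >= pivot */
  while (left < right) {
    if (A[left] < pivot) {
      left++;
    } else {
      swap(A, left, right - 1);
      right--;
    }
  }
  swap(A, lo, left - 1);  /* move the pivot between the two segments */
  return left - 1;
}

/* Sort A[lo..hi) in place. */
void quicksort(int A[], int lo, int hi) {
  if (hi - lo <= 1) return;        /* 0 or 1 elements: already sorted */
  int p = partition(A, lo, hi);    /* divide: one linear pass         */
  quicksort(A, lo, p);             /* conquer the left segment        */
  quicksort(A, p + 1, hi);         /* conquer the right segment       */
  /* combine: nothing to do, the array was rearranged in place */
}

int main(void) {
  int A[] = {3, 1, 4, 4, 7, 2, 8};
  quicksort(A, 0, 7);
  for (int i = 0; i < 7; i++) printf("%d ", A[i]);  /* prints 1 2 3 4 4 7 8 */
  printf("\n");
  return 0;
}

Note that all of the work happens in partition, before the recursive calls; the combine step is empty because the segments are rearranged in place.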
Let us analyze the asymptotic complexity of the partitioning phase of the algorithm. Say we have the array

   3, 1, 4, 4, 7, 2, 8

and we pick 3 as our pivot. Then we have to compare each element of this unsorted array to the pivot to obtain a partition where 2, 1 are to the left and 4, 7, 8, 4 are to the right of the pivot. We have picked an arbitrary order for the elements in the array segments; all that matters is that all smaller ones are to the left of the pivot and all larger ones are to the right.

Since we have to compare each element to the pivot, but otherwise just collect the elements, it seems that the partition phase of the algorithm should have complexity O(k), where k is the length of the array segment we have to partition.

It should be clear that in the ideal (best) case, the pivot element will magically be the median value among the array values. This just means that half the values will end up in the left partition and half the values will end up in the right partition. So we go from the problem of sorting an array of length n to sorting two arrays of length n/2. Repeating this process, we obtain a recursion tree in which, at each level, the total work is O(n) operations to perform the partitions. In the best case there will be O(log n) levels, leading us to the O(n log n) best-case asymptotic complexity.

How many recursive calls do we have in the worst case, and how long are the array segments? In the worst case, we always pick either the smallest or the largest element in the array, so that one side of the partition will be empty and the other has all elements except for the pivot itself. In the example above, the recursive calls might proceed as follows, where we have surrounded the unsorted part of the array with brackets:

   pivot    array
            [3, 1, 4, 4, 8, 2, 7]
     1       1, [3, 4, 4, 8, 2, 7]
     2       1, 2, [3, 4, 4, 8, 7]
     3       1, 2, 3, [4, 4, 8, 7]
     4       1, 2, 3, 4, [4, 8, 7]
     4       1, 2, 3, 4, 4, [8, 7]
     7       1, 2, 3, 4, 4, 7, [8]

All other recursive calls are with the empty array segment, since we never have any unsorted elements less than the pivot. We see that in the worst case there are n-1 significant recursive calls for an array of size n. The k-th recursive call has to sort a subarray of size n-k, which proceeds by partitioning, requiring O(n-k) comparisons. This means that, overall, for some constant c, we have

   sum_{k=0}^{n-1} c*k  =  c*n*(n-1)/2   in   O(n^2)

comparisons. Here we used the fact that O(p(n)) for a polynomial p(n) is always equal to O(n^k), where k is the leading exponent of the polynomial. This is because the largest exponent of a polynomial will eventually dominate the function, and big-O notation ignores constant coefficients.

So quicksort has quadratic complexity in the worst case (see the counting sketch below). How can we mitigate this? If we could always pick the median among the elements in the subarray we are trying to sort, then half the elements would be less and half the elements would be greater. So in this case there would be only log n recursive calls, where at each layer we have to do a total amount of n comparisons, yielding an asymptotic complexity of O(n log n). Unfortunately, it is not …
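The quadratic worst case can also be observed experimentally. The counting sketch below is again plain C rather than the course's C0, and it swaps in a Lomuto-style partition with the last element as pivot, purely as an illustrative assumption. With that pivot choice, an already-sorted array is a worst-case input: every partition leaves one segment empty, and the number of element-vs-pivot comparisons comes out to exactly n(n-1)/2.

#include <stdio.h>
#include <stdlib.h>

long comparisons = 0;  /* counts element-vs-pivot comparisons */

/* Lomuto-style partition of A[lo..hi) with the last element as pivot. */
int partition(int A[], int lo, int hi) {
  int pivot = A[hi - 1];
  int i = lo;  /* A[lo..i) holds elements < pivot */
  for (int j = lo; j < hi - 1; j++) {
    comparisons++;
    if (A[j] < pivot) {
      int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
      i++;
    }
  }
  int tmp = A[i]; A[i] = A[hi - 1]; A[hi - 1] = tmp;  /* place the pivot */
  return i;
}

void quicksort(int A[], int lo, int hi) {
  if (hi - lo <= 1) return;
  int p = partition(A, lo, hi);
  quicksort(A, lo, p);
  quicksort(A, p + 1, hi);
}

int main(void) {
  int n = 1000;
  int *A = malloc(n * sizeof(int));
  for (int i = 0; i < n; i++) A[i] = i;   /* already sorted: worst case for this pivot choice */
  quicksort(A, 0, n);
  /* Expect n*(n-1)/2 = 499500 comparisons for n = 1000. */
  printf("n = %d, comparisons = %ld\n", n, comparisons);
  free(A);
  return 0;
}

Doubling n to 2000 roughly quadruples the count (to 1999000), which is the experimental signature of the O(n^2) behavior derived above.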

