UMD CMSC 351 - Lecture 14: HeapSort Analysis and Partitioning


Lecture Notes, CMSC 251

Heapify(A, 1, m)    // fix things up

An example of HeapSort is shown in Figure 7.4 on page 148 of CLR. We make $n - 1$ calls to Heapify, each of which takes $O(\log n)$ time. So the total running time is $O((n - 1) \log n) = O(n \log n)$.

Lecture 14: HeapSort Analysis and Partitioning
(Thursday, Mar 12, 1998)
Read: Chapt 7 and 8 in CLR. The algorithm we present for partitioning is different from the text's.

HeapSort Analysis: Last time we presented HeapSort. Recall that the algorithm operated by first building a heap in a bottom-up manner, and then repeatedly extracting the maximum element from the heap and moving it to the end of the array. One clever aspect of the data structure is that it resides inside the array to be sorted.

We argued that the basic heap operation of Heapify runs in $O(\log n)$ time, because the heap has $O(\log n)$ levels, and the element being sifted moves down one level of the tree after a constant amount of work.

Based on this we can see that (1) it takes $O(n \log n)$ time to build a heap, because we need to apply Heapify roughly $n/2$ times (once to each of the internal nodes), and (2) it takes $O(n \log n)$ time to extract all of the maximum elements, since we need to extract roughly $n$ elements and each extraction involves a constant amount of work and one Heapify. Therefore the total running time of HeapSort is $O(n \log n)$.

Is this tight? That is, is the running time $\Theta(n \log n)$? The answer is yes. In fact, later we will see that it is not possible to sort faster than $\Omega(n \log n)$ time, assuming that you use comparisons, which HeapSort does. However, it turns out that the first part of the analysis is not tight. In particular, the BuildHeap procedure that we presented actually runs in $\Theta(n)$ time. In the wider context of the HeapSort algorithm this is not significant, because the running time is dominated by the $\Theta(n \log n)$ extraction phase.

Nonetheless, there are situations where you might not need to sort all of the elements. For example, it is common to extract some unknown number of the smallest elements
until some criterion (depending on the particular application) is met. For this reason it is nice to be able to build the heap quickly, since you may not need to extract all the elements.

BuildHeap Analysis: Let us consider the running time of BuildHeap more carefully. As usual, we will make our lives simpler by making some assumptions about $n$. In this case the most convenient assumption is that $n$ is of the form $n = 2^{h+1} - 1$, where $h$ is the height of the tree. The reason is that a left-complete tree with this number of nodes is a complete tree, that is, its bottommost level is full. This assumption will save us from worrying about floors and ceilings.

With this assumption, level 0 of the tree has 1 node, level 1 has 2 nodes, and so on up to level $h$, which has $2^h$ nodes. All the leaves reside on level $h$.

Recall that when Heapify is called, the running time depends on how far an element might sift down before the process terminates. In the worst case the element might sift down all the way to the leaf level. Let us count the work done level by level.

At the bottommost level there are $2^h$ nodes, but we do not call Heapify on any of these, so the work is 0. At the next-to-bottommost level there are $2^{h-1}$ nodes, and each might sift down 1 level. At the 3rd level from the bottom there are $2^{h-2}$ nodes, and each might sift down 2 levels. In general, at level $j$ from the bottom there are $2^{h-j}$ nodes, and each might sift down $j$ levels.

[Figure 13: Analysis of BuildHeap (total work for BuildHeap).]

So, if we count from bottom to top, level by level, we see that the total time is proportional to

$$T(n) = \sum_{j=0}^{h} j \, 2^{h-j} = \sum_{j=0}^{h} j \, \frac{2^h}{2^j}.$$

If we factor out the $2^h$ term, we have

$$T(n) = 2^h \sum_{j=0}^{h} \frac{j}{2^j}.$$

This is a sum that we have never seen before. We could try to approximate it by an integral, which would involve integration by parts, but it turns out that there is a very cute solution to this particular sum. We'll digress for a moment to work it out. First, write down the infinite
general geometric series, for any constant $|x| < 1$:

$$\sum_{j=0}^{\infty} x^j = \frac{1}{1 - x}.$$

Then take the derivative of both sides with respect to $x$, and multiply by $x$, giving

$$\sum_{j=0}^{\infty} j x^{j-1} = \frac{1}{(1 - x)^2}, \qquad \sum_{j=0}^{\infty} j x^j = \frac{x}{(1 - x)^2},$$

and if we plug in $x = 1/2$, then voila! we have the desired formula:

$$\sum_{j=0}^{\infty} \frac{j}{2^j} = \frac{1/2}{(1 - (1/2))^2} = \frac{1/2}{1/4} = 2.$$

In our case we have a bounded sum, but since the infinite series is bounded, we can use it instead as an easy approximation. Using this we have

$$T(n) = 2^h \sum_{j=0}^{h} \frac{j}{2^j} \le 2^h \sum_{j=0}^{\infty} \frac{j}{2^j} \le 2^h \cdot 2 = 2^{h+1}.$$

Now recall that $n = 2^{h+1} - 1$, so we have $T(n) \le n + 1 \in O(n)$. Clearly the algorithm takes at least $\Omega(n)$ time (since it must access every element of the array at least once), so the total running time for BuildHeap is $\Theta(n)$.

It is worthwhile pausing here a moment. This is the second time we have seen a relatively complex structured algorithm, with doubly nested loops, come out with a running time of $\Theta(n)$. (The other example was the median algorithm, based on the sieve technique. Actually, if you think deeply about this, there is a sense in which a parallel version of BuildHeap can be viewed as operating like a sieve, but maybe this is getting too philosophical.)

Perhaps a more intuitive way to describe what is happening here is to observe an important fact about binary trees: the vast majority of nodes are at the lowest levels of the tree. For example, in a complete binary tree of height $h$ there are $n \approx 2^{h+1}$ nodes in total, and the number of nodes in the bottom 3 levels alone is

$$2^h + 2^{h-1} + 2^{h-2} \approx \frac{n}{2} + \frac{n}{4} + \frac{n}{8} = \frac{7n}{8} = 0.875n.$$

That is, almost 90% of the nodes of a complete binary tree reside in the 3 lowest levels. Thus the lesson to be learned is that when designing algorithms that operate on trees, it is important to be most efficient on …
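
As a quick empirical check of the BuildHeap analysis, here is a minimal Python sketch that builds a max-heap bottom-up and counts how many levels elements actually sift down. It is an illustration under stated assumptions, not the code from the notes: the function names (`heapify`, `build_heap`, `worst_case_bound`, `is_max_heap`) and the counter argument are hypothetical, and the array is 0-based, whereas the notes index from 1. The count is compared against the sum of $j \cdot 2^{h-j}$ over $j = 0, \ldots, h$ for trees with $n = 2^{h+1} - 1$ nodes.

```python
import random


def heapify(a, i, m, counter):
    """Sift a[i] down within the max-heap a[0:m] (0-based indexing).

    counter[0] is incremented once for every level the element moves down.
    """
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < m and a[left] > a[largest]:
            largest = left
        if right < m and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        counter[0] += 1  # the element moved down one level
        i = largest


def build_heap(a):
    """Bottom-up heap construction; returns the total number of levels sifted."""
    counter = [0]
    n = len(a)
    # Internal nodes occupy indices 0 .. n//2 - 1; visit them last to first.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n, counter)
    return counter[0]


def worst_case_bound(h):
    """The sum from the analysis: sum over j = 0..h of j * 2^(h - j)."""
    return sum(j * 2 ** (h - j) for j in range(h + 1))


def is_max_heap(a):
    """True if every element is no larger than its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


if __name__ == "__main__":
    random.seed(0)  # fixed seed only so that repeated runs agree
    for h in range(1, 11):
        n = 2 ** (h + 1) - 1
        data = [random.random() for _ in range(n)]
        steps = build_heap(data)
        # Columns: height, node count, observed sift steps, worst-case sum.
        print(h, n, steps, worst_case_bound(h))
```

In every row the observed step count should stay at or below the worst-case sum, which in turn stays below $n + 1$, in line with the $O(n)$ bound derived above.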

