UW-Madison COMPSCI 787 - Lecture 2- Divide and Conquer Algorithms


CS787: Advanced Algorithms
Lecture 2: Divide & Conquer Algorithms

2.1 Divide and conquer

Divide and conquer algorithms break up a problem into several smaller instances of the same problem, solve these smaller instances, and then combine the solutions into a solution to the original problem. Naturally, for this approach to work, the smaller instances should be simpler than the original problem, and combining their solutions should be easier than solving the original problem. In contrast to greedy algorithms, the correctness of D&C algorithms is usually relatively easy to argue, but the running time analysis is more involved. The analysis of the time complexity of these algorithms usually consists of deriving a recurrence relation for the complexity and then solving it.

We give two examples of algorithms devised using the divide and conquer technique, and analyze their running times. Both of these algorithms deal with unordered lists.

Sorting: We describe mergesort, an efficient recursive sorting algorithm. We also argue that the time complexity of this algorithm, O(n log n), is the best possible worst-case running time a comparison-based sorting algorithm can have.

Finding the k-th largest element: Given an unordered list of elements, we describe how to find the k-th largest element in that list in linear time.

2.1.1 Mergesort

The goal of sorting is to turn an unordered list of n elements into an ordered list. Mergesort is a recursive sorting algorithm. It consists of the following two steps whenever the length n of the list is strictly greater than one. When the length of the list is equal to one, it returns the same list.

1. Split the list in two halves that differ in length by at most one element, and sort each half recursively.
2. Merge the two sorted lists back together into one big sorted list.

The correctness of the algorithm relies on the merge step. This step can be performed in linear time, and we skip the details.

Let us now analyze the running time.
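
As a concrete reference point for the analysis that follows, here is a minimal Python sketch of the two steps of mergesort. The function names `merge` and `mergesort` are illustrative choices, not taken from the lecture, and the merge routine shown is one standard linear-time way to fill in the details the notes skip.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list.

    Runs in time linear in len(left) + len(right).
    """
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from `left` on ties keeps equal elements in their original order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def mergesort(items):
    """Return a new sorted list with the elements of `items`."""
    # A list of length zero or one is already sorted.
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2  # the two halves differ in length by at most one
    return merge(mergesort(items[:mid]), mergesort(items[mid:]))
```

For example, `mergesort([5, 2, 9, 1, 5, 6])` returns `[1, 2, 5, 5, 6, 9]`. Note that slicing copies each half, so the split in this sketch costs linear rather than constant time; the work per level of recursion is still O(n), so the analysis is unaffected.
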
Let T(n) denote the time it takes mergesort to sort a list of n elements. The split step in the algorithm takes constant time and generates two lists of size $\lceil n/2 \rceil$ and $\lfloor n/2 \rfloor$. When we merge the two smaller lists, we end up with a list of n elements, so the merge step takes O(n) time. Therefore, we get a recursive definition of T(n) of the form

$$T(n) = T(\lceil n/2 \rceil) + T(\lfloor n/2 \rfloor) + O(n).$$

One way of solving this recurrence is to determine the total amount of work done at every level of recursion. In the case of mergesort, the total work done at any level is a constant times the total size of all the lists at that level, which is n. Therefore, the total work at any level is O(n). The number of levels, on the other hand, is O(log n), since the list size halves at each level. Therefore, solving this recurrence relation yields

$$T(n) = O(n \log n).$$

Ignoring constants, mergesort is asymptotically the fastest way to sort a list using comparisons. In particular, any sorting algorithm that uses only comparisons to sort a list of integers must take time Ω(n log n) in the worst case to sort a list of size n. (See [1] for a proof of this theorem.)

2.1.2 Selection, or, finding the k-th Largest Element

In this problem, we are once again given an unordered list of elements, and want to find the kth largest element. A simple way of solving this problem is to first sort the list and then read off the kth largest element. This takes time O(n log n). However, presumably finding only the kth largest element should be simpler than sorting the entire list. For example, we could maintain a list of the k largest elements seen so far (for instance, in a heap) and populate this list in time O(n log k). When k is a small constant, this takes only linear time. We will show that we can perform selection in linear time for an arbitrary k using a divide and conquer approach.

To get intuition for how this problem can be solved, suppose that we could find the median of a list in linear time. We claim that we can then use this as a subprocedure in a divide and conquer algorithm to find the kth largest element.
In particular, we use the median to partition the list into two halves. Then we recursively find the desired element in one of the halves (the first half if k ≤ n/2, and the second half otherwise, with k adjusted accordingly). This algorithm takes time cn at the first level of recursion for some constant c, cn/2 at the next level (since we recurse in a list of size n/2), cn/4 at the third level, and so on. The total time taken is

$$cn + \frac{cn}{2} + \frac{cn}{4} + \cdots = 2cn = O(n).$$

Unfortunately, however, finding the median doesn't seem to be much simpler than finding the kth largest element. The key idea here is that in order to apply the recursion, we don't need an exact median; a near-median would do. In particular, suppose we could find an element at every step such that at least 3/10 of the elements in the list are smaller than it and at least 3/10 of the elements are larger than it. Then each recursive call is on a list at most 7/10 the size of the current one, and we could still apply the same divide and conquer approach as above. Assuming each divide step takes linear time, our running time would turn out to be at most

$$cn + \frac{7}{10}cn + \frac{49}{100}cn + \cdots = \frac{10}{3}cn = O(n),$$

where 10/3 ≈ 3.33.

Finally, it turns out that we can find a near-median in linear time by again applying recursion. In particular, we divide the list into groups of 5 elements each, find the median in each group in constant time (since each group is of constant size), and then find the median of these medians recursively. The key point to note is that the final step of finding the median of medians applies to a much smaller list, of size n/5, and so we still get a small enough running time.

This was just a rough description and analysis of the algorithm. A more formal analysis follows. For simplicity of analysis, we assume that all the list sizes we encounter while running the algorithm are divisible by 5.

Algorithm for selection

1. Divide the list into n/5 lists of 5 elements each.
2. Find the median in each sublist of 5 elements.
3. Recursively find the median of all the medians, call it m.
4. Partition the list into elements larger than m (call this sublist L1) and those no larger than m (call this sublist L2).
5. If k ≤ |L1|, return Selection(L1, k).
6. If k ≥ |L1| + 1, return Selection(L2, k − |L1|).

The correctness of the algorithm is easy to argue and we will skip the argument. Let us analyze the running time. Note that we make two recursive calls. The first is to a list of size n/5. The second is to either L1 or L2. How large can these lists be? We argue that these lists can be no larger than 7n/10 in size. This is because there
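
The six steps of the selection algorithm above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the lecture's own code: the name `select` is hypothetical, lists of at most 5 elements are handled by sorting directly (a base case the steps leave implicit), and list sizes need not be divisible by 5. It also departs from step 4 in one respect: L2 is split further into elements equal to m and elements smaller than m, so that the recursion always shrinks even when the list contains repeated values or m happens to be the largest element.

```python
def select(items, k):
    """Return the k-th largest element of `items` (k = 1 gives the maximum)."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")

    # Base case (an assumption of this sketch): sort very short lists directly.
    if len(items) <= 5:
        return sorted(items, reverse=True)[k - 1]

    # Steps 1-2: split into groups of (at most) 5 and take each group's median.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(group)[len(group) // 2] for group in groups]

    # Step 3: recursively find the median of the medians.
    m = select(medians, (len(medians) + 1) // 2)

    # Step 4, with L2 split into "equal to m" and "smaller than m"
    # (a deviation from the steps above, made to guarantee progress).
    larger = [x for x in items if x > m]    # L1
    smaller = [x for x in items if x < m]
    equal_count = len(items) - len(larger) - len(smaller)

    # Steps 5-6: recurse into the part that contains the k-th largest element.
    if k <= len(larger):
        return select(larger, k)
    if k <= len(larger) + equal_count:
        return m
    return select(smaller, k - len(larger) - equal_count)
```

For example, `select([3, 1, 4, 1, 5, 9, 2, 6], 3)` returns `5`, the third largest element. Sorting each group of 5 takes constant time per group, so steps 1, 2, and 4 together take linear time, matching the assumption made in the analysis.
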

