Chapter 5 Divide and Conquer Divide and conquer refers to a class of algorithmic techniques in which one breaks the input into several parts solves the problem in each part recursively and then combines the solutions to these subproblems into an overall solution In many cases it can be a simple and powerful method Analyzing the running time of a divide and conquer algorithm generally involves solving a recurrence relation that bounds the running time recursively in terms of the running time on smaller instances We begin the chapter with a general discussion of recurrence relations illustrating how they arise in the analysis and describing methods for working out upper bounds from them We then illustrate the use of divide and conquer with applications to a number of different domains computing a distance function on different rankings of a set of objects nding the closest pair of points in the plane multiplying two integers and smoothing a noisy signal Divide and conquer will also come up in subsequent chapters since it is a method that often works well when combined with other algorithm design techniques For example in Chapter 6 we will see it combined with dynamic programming to produce a space ef cient solution to a fundamental sequence comparison problem and in Chapter 13 we will see it combined with randomization to yield a simple and ef cient algorithm for computing the median of a set of numbers One thing to note about many settings in which divide and conquer is applied including these is that the natural brute force algorithm may already be polynomial time and the divide and conquer strategy is serving to reduce the running time to a lower polynomial This is in contrast to most of the problems in the previous chapters for example where brute force was exponential and the goal in designing a more sophisticated algorithm was to achieve any kind of polynomial running time For example we discussed in 210 Chapter 5 Divide and Conquer Chapter 2 that the natural brute force algorithm for nding the closest pair among n points in the plane would simply measure all cid 3 n2 distances for a polynomial running time of cid 3 n2 Using divide and conquer we will improve the running time to O n log n At a high level then the overall theme of this chapter is the same as what we ve been seeing earlier that improving on brute force search is a fundamental conceptual hurdle in solving a problem ef ciently and the design of sophisticated algorithms can achieve this The difference is simply that the distinction between brute force search and an improved solution here will not always be the distinction between exponential and polynomial 5 1 A First Recurrence The Mergesort Algorithm To motivate the general approach to analyzing divide and conquer algorithms we begin with the Mergesort Algorithm We discussed the Mergesort Algorithm brie y in Chapter 2 when we surveyed common running times for algorithms Mergesort sorts a given list of numbers by rst dividing them into two equal halves sorting each half separately by recursion and then combining the results of these recursive calls in the form of the two sorted halves using the linear time algorithm for merging sorted lists that we saw in Chapter 2 To analyze the running time of Mergesort we will abstract its behavior into the following template which describes many common divide and conquer algorithms Divide the input into two pieces of equal size solve the two subproblems on these pieces separately by recursion and then combine the two results into an overall solution spending only linear time for the initial division and nal recombining In Mergesort as in any algorithm that ts this style we also need a base case for the recursion typically having it bottom out on inputs of some constant size In the case of Mergesort we will assume that once the input has been reduced to size 2 we stop the recursion and sort the two elements by simply comparing them to each other Consider any algorithm that ts the pattern in and let T n denote its worst case running time on input instances of size n Supposing that n is even the algorithm spends O n time to divide the input into two pieces of size n 2 each it then spends time T n 2 to solve each one since T n 2 is the worst case running time for an input of size n 2 and nally it spends O n time to combine the solutions from the two recursive calls Thus the running time T n satis es the following recurrence relation 5 1 A First Recurrence The Mergesort Algorithm 211 5 1 For some constant c when n 2 and T n 2T n 2 cn T 2 c The structure of 5 1 is typical of what recurrences will look like there s an inequality or equation that bounds T n in terms of an expression involving T k for smaller values k and there is a base case that generally says that T n is equal to a constant when n is a constant Note that one can also write 5 1 more informally as T n 2T n 2 O n suppressing the constant c However it is generally useful to make c explicit when analyzing the recurrence To keep the exposition simpler we will generally assume that parameters like n are even when needed This is somewhat imprecise usage without this assumption the two recursive calls would be on problems of size cid 17 n 2 cid 18 and cid 12 n 2 cid 13 and the recurrence relation would say that T n T cid 17 n 2 cid 18 T cid 12 n 2 cid 13 cn for n 2 Nevertheless for all the recurrences we consider here and for most that arise in practice the asymptotic bounds are not affected by the decision to ignore all the oors and ceilings and it makes the symbolic manipulation much cleaner Now 5 1 does not explicitly provide an asymptotic bound on the growth rate of the function T rather it speci es T n implicitly in terms of its values on smaller inputs To obtain an explicit bound we need to solve the recurrence relation so that T appears only on the left hand side of the inequality not the right hand side as well Recurrence solving is a task that has been incorporated into a number of standard computer algebra systems and the solution to many standard recurrences can now be found by automated means It is still useful however to understand the process of solving recurrences and to recognize which recurrences lead to good running times since the design of an ef cient divide and conquer algorithm is heavily intertwined with an understanding of how a recurrence relation determines a running time Approaches to Solving Recurrences There are two basic ways one can go about
View Full Document