The Branch and Bound AlgorithmThe Branch and Bound (B&B) algorithm is a general method to solve discreteoptimization problems. The key idea is that B&B attempts to prune partsof the exhaustive search tree by using bounds on the value of the optimumsolution. Below we present a formalized description of the method. In orderto understand how it works, let us review first how we can formalize theexhaustive search.Consider the following optimization problem. Let B be the set of all n-dimensional binary vectors and we would like to find the maximum of afunction f (x) over B, that is, we are looking formaxx∈Bf(x).Note: boldface letters will be used to denote vectors, such asx= (x1, . . . , xn), b = (b1, . . . , bn).We can formalize the exhaustive search as a recursive procedure. To presentit, let us introduce a few notations. Let Bk(b) denote the subset of B whichis obtained by fixing the first k coordinates of the binary vectors accordingto the coordinates of a given binary vector b:Bk(b) = {x∈ B | x1= b1, . . . , xk= bk}.That is, the first coordinate is fixed at b1, the second at b2, ... the kthis fixedat bkand the rest of the coordinates ofxis arbitrary (0 or 1). If k = 0,then it means we did not fix anything, so B0(b) = B. If k = n, then allcoordinates are fixed at the corresponding bivalue, so then the set Bn(b)contains only a single vector, b itself: Bn(b) = {b}.Let us introduce now the following family of functions:Fk(b) = maxx∈Bk(b)f(x).The meaning of Fk(b) is that this is the maximum value of f (x) if the maxi-mization is done over the restricted set Bk(b), that is, the first k coordinatesof x are fixed at the values determined by b and only the rest can change.1What is the advantage of the Fk(b) functions? The reason for we haveintroduced them is that we can easily construct a recursive algorithm tocompute them. Once we have this subroutine that computes Fk(b) for any kand b, then we can solve the original problem by simply calling the subroutinewith k = 0, since, by definition we haveF0(b) = maxx∈B0(b)f(x) = maxx∈Bf(x)for any b (so we can use, for example, b = 0 in this call).Let us see now how we can compute the Fk(b) values recursively. We presentan algorithm in pseudo-code below first for the exhaustive search.Exhaustive Search1. function Fk(b)2. if k = n then return f (b)3. else4. begin5. bk+1:= 0; u := Fk+1(b)6. bk+1:= 1; v := Fk+1(b)7. return max(u, v)8. endExplanation: In line 2 we handle the case when all coordinates are fixed,i.e., we have arrived at a leaf of the search tree. In this case Bn(b) = {b},so the maximum over this set can only be f (b). If k < n then in lines 5and 6 we compute recursively the function for two possible cases (branch):when the next coordinate is fixed at 0 (line 5) and when it is fixed at 1 (line6). The maximum of the two gives the value of Fk(b) in line 7, since this isthe maximum over the set when the (k + 1)thcoordinate is not fixed, onlythe first k. What guarantees that the recursion ends after a finite number ofsteps? This is guaranteed by the fact that the recursive call is always madefor a larger value of k and when the largest value k = n is reached then thereis no more recursive call.2Let us now turn to the Branch and Bound algorithm. Suppose that we havean upper bound function Uk(b) available, such thatFk(b) ≤ Uk(b)holds for any k and b. It is assumed that an independent subroutine isavailable that can compute Uk(b) for any parameter value.Let us introduce the variable maxlow that will carry the largest lower boundon the global optimum that has been achieved so far. We do not need a sepa-rate subroutine for computing a lower bound, because whenever we computea value f(x) for a particular x, it is automatically a lower bound on the max-imum. This is so, simply because the function at any x is either maximumor less, that is,f(x) ≤ maxy∈Bf(y)holds for any x.Now we can present the B&B algorithm by modifying the exhaustive searchin the following manner. For simplicity, let us assume that the function f (x)is nonnegative. Then the variable maxl ow can be initially set to 0.Branch and Bound1. function Fk(b)2. if k = n then3. begin4. if f (b) > maxlow then maxlow := f (b)5. return f (b)6. end7. else8. if Uk(b) > maxlow then9. begin10. bk+1:= 0; u := Fk+1(b)11. bk+1:= 1; v := Fk+1(b)12. return max(u, v)13. end14. else return maxlowExplanation. Observe that if we remove lines 4, 8 and 14 then we exactly3get back the exhaustive search (ignoring the unnecessary begin-end pairs).Thus, let us take a closer look how these lines change the exhaustive search.In the case k = n the only change is that before returning f(b) we updatethe value of maxlow in line 4: if the new function value is larger then theold value of maxlow, then this larger value will be the new best lower boundon the optimum, otherwise the old one remains.The most essential difference is in line 8, this is the bounding. Here the idea isthat we only make the recursive calls if the upper bound Uk(b) is larger thanthe current value of maxlow. Why we can do this without losing anything?Because if the condition in line 8 is not satisfied, thenmaxlow ≥ Uk(b) ≥ Fk(b)must hold. This means that from this parameter combination (k, b) wecannot hope improvement on the best value found so far, since the maximumin this subtree of the search tree (i.e., Fk(b)) is already known to be not morethan maxlow, so it makes no sense to explore this subtree. Thus, in this casewe do not make any recursive call and return maxlow in line 14 (just toreturn something).Note that for the above bounding step we need to compute the value of Uk(b),but we assumed that for this an independent subroutine is available. Thesavings may be essential if this subroutine is much faster then the exhaustiverecursive algorithm. This is typically the case in the successful applicationsof B&B.4Comments:• If we drop the assumption f (x) ≥ 0, the algorithm still remains thesame, just maxlow should be initialized to a value for which maxl ow ≤f(x) holds for all x.• If the original maximization problem allows only certain binary vectorsx, then we first extend the function to all binary vectors by assigning asmall enough function value to those vectors which are excluded in theoriginal maximization problem. (“Small enough” means here a valuethat is surely smaller than any function value on those vectors that arenot excluded in the original problem).• If we also want to find the maximizing vector x,
View Full Document