CS599: Approximation Algorithms                                                          April 13th, 2010

Submodular Functions

Lecturer: David Kempe                       Scribes: Maheswaran Sathiamoorthy and Anurag Biyani

1 Introduction

So far, we have looked at individual optimization problems wherein we wanted to maximize or minimize a cost function subject to some constraints. For example, we discussed minimizing the cost of choosing edges subject to the constraint of getting a Steiner Tree, or minimizing the cost of opening new facilities and connecting cities to facilities subject to the constraint that every city be connected to a facility.

While there were some underlying trends in the techniques we have used, we more or less looked at problems in isolation. For now, we are going to focus a little more systematically on which properties of objective functions and constraints make problems tractable. In particular, we will focus on submodular functions and matroid set systems.

Definition 1 Let U be a universe, and f : 2^U → R be a function mapping subsets of U to real numbers.

1. f is monotone if f(X) ≤ f(Y) whenever X ⊆ Y.

2. f is submodular if f(X ∪ Y) + f(X ∩ Y) ≤ f(X) + f(Y) for all sets X, Y. Equivalently, f is submodular if f(X ∪ {u}) − f(X) ≥ f(Y ∪ {u}) − f(Y) whenever X ⊆ Y, for any element u ∈ U.

3. f is supermodular if f(X ∪ Y) + f(X ∩ Y) ≥ f(X) + f(Y) for all sets X, Y. Equivalently, f is supermodular if f(X ∪ {u}) − f(X) ≤ f(Y ∪ {u}) − f(Y) whenever X ⊆ Y, for any element u ∈ U.

4. f is modular if f(X ∪ Y) + f(X ∩ Y) = f(X) + f(Y) for all sets X, Y.

The second version of the definition of submodularity can be interpreted as follows: the extra "benefit" obtained by adding element u to a set X is at least as much as the "benefit" obtained by adding the same element to a superset Y of X. Colloquially, this is called decreasing marginal returns.
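Since the inequalities in Definition 1 quantify over all pairs of subsets, they can be verified by brute force on a small universe. A minimal sketch (the universe, the sets T_i, and the weights below are made up purely for illustration):

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a collection, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_submodular(f, universe):
    """Check f(X ∪ Y) + f(X ∩ Y) <= f(X) + f(Y) for every pair X, Y."""
    subsets = powerset(universe)
    return all(f(X | Y) + f(X & Y) <= f(X) + f(Y)
               for X in subsets for Y in subsets)

# A coverage function: f(S) = |union of the sets T_i with i in S|.
T = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
coverage = lambda S: len(set().union(*(T[i] for i in S)))

print(is_submodular(coverage, T.keys()))  # → True
```

A modular function such as f(S) = Σ_{i∈S} w_i passes the same check with equality in every pair, so `is_submodular` accepts it as well.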
In the discrete domain, submodularity plays a role similar to concavity in continuous domains. Supermodularity, on the other hand, means that there can be large "benefit amplification" by combining the right elements; intuitively, this makes optimization much harder.

By a deep result of Fleischer, Iwata, and Fujishige, the set S minimizing a submodular function f (whether monotone or not) can be found in polynomial time.

Here, we will be interested in maximizing a monotone submodular function. In order not to get stuck on trivial obstacles, such as being unable to distinguish whether the objective function is positive or negative at the optimum, we will also assume that f is non-negative. Since the most difficult case is when f is as small as possible, we will call f normalized if f(∅) = 0. Some examples of submodular functions are the following:

1. f(S) = Σ_{i∈S} w_i for some non-negative weights w_i. These functions are in fact modular, and thus quite easy.

2. Coverage functions: Given a universe U and sets T_1, ..., T_m ⊆ U, a coverage function is of the form f(S) = |∪_{i∈S} T_i|. (A generalization would associate values with elements.) The objective is similar to Set Cover; instead of trying to cover all elements with as few sets as possible, we try to cover as many elements as possible with a given number of sets. An important special case is when f(S) is the total number of nodes v such that there is a u-v path from some u ∈ S, i.e., the number of nodes reachable from S.

3. The function f(S) = c(S, S̄), i.e., the capacity of a cut in a graph, is submodular, but not necessarily monotone.

2 Maximizing a Submodular Function

Here, we are looking at the first optimization problem for submodular functions: maximizing it subject to a size constraint. That is, given a function f that is monotone, submodular, and normalized, and a k ∈ N, find a set S with |S| ≤ k maximizing f(S). This problem is clearly NP-hard, since it contains Set Cover as a special case.
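As an aside, the cut function from example 3 above shows why monotonicity is a separate assumption: adding nodes to S can shrink the cut. A small sketch on an assumed toy graph with unit capacities:

```python
# Undirected toy graph on nodes {0, 1, 2, 3}, edges with unit capacity.
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
nodes = {0, 1, 2, 3}

def cut(S):
    """f(S) = c(S, S̄): total capacity of edges crossing between S and V \\ S."""
    S = set(S)
    return sum(1 for (u, v) in edges if (u in S) != (v in S))

print(cut({0}))    # → 3  (edges (0,1), (0,3), (0,2) leave {0})
print(cut(nodes))  # → 0  (the full node set cuts nothing)
```

Since {0} ⊆ {0, 1, 2, 3} but f({0}) = 3 > 0 = f(V), the cut function is not monotone, even though it satisfies the submodularity inequality.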
A natural approximation algorithm is the following:

Algorithm 1 (Greedy):
    Start with S = ∅.
    while |S| < k do
        Add to S the element u maximizing f(S ∪ {u}).

One interesting question is how to evaluate a generic function f. For many natural applications (such as all the ones listed above), this is possible in polynomial time. In general, since writing down all values of f as an input would take exponential space, we assume instead that we have an oracle that tells us f(S) for any given set S. The running time is then often measured in terms of the number of accesses to the oracle.

Theorem 2 (Nemhauser, Wolsey) If f is monotone, submodular, and normalized, then the above greedy algorithm is a (1 − 1/e)-approximation.

By an approximation hardness result of Feige, even for the special case of coverage functions, it is NP-hard to get a (1 − 1/e) + ε approximation for any ε > 0.

Proof. Let S_i be our set after iteration i and OPT an optimum set. The gain δ_i in iteration i is δ_i = f(S_i) − f(S_{i−1}). By monotonicity of f, we have f(OPT) ≤ f(S_i ∪ OPT) for all iterations i. By submodularity and the choice of element to add to the set, f(OPT) ≤ f(S_i ∪ OPT) ≤ f(S_i) + k·δ_{i+1}: adding all of OPT to S_i can be broken into at most k single-element additions, and each such addition gains at most δ_{i+1}, the gain of the greedily chosen element. Solving for δ_{i+1} gives δ_{i+1} ≥ (1/k)·(f(OPT) − f(S_i)). And by definition of δ_{i+1}, we have f(S_{i+1}) = f(S_i) + δ_{i+1}. An interpretation of this is that each iteration makes up at least a 1/k fraction of the current distance from our solution to OPT.

We will now prove the following by induction on i:

    f(S_i) ≥ (1 − (1 − 1/k)^i) · f(OPT).

The base case i = 0 is trivial. For the induction step, we can simply write:

    f(S_{i+1}) ≥ f(S_i) + (1/k)·(f(OPT) − f(S_i))
              = (1 − 1/k)·f(S_i) + (1/k)·f(OPT)
              ≥ (1 − 1/k)·(1 − (1 − 1/k)^i)·f(OPT) + (1/k)·f(OPT)     (by Induction Hypothesis)
              = (1 − 1/k − (1 − 1/k)^{i+1} + 1/k)·f(OPT)
              = (1 − (1 − 1/k)^{i+1})·f(OPT).

Applying this bound with i = k now gives us that f(S) ≥ (1 − (1 − 1/k)^k)·f(OPT) ≥ (1 − 1/e)·f(OPT), since (1 − 1/k)^k ≤ 1/e.
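Algorithm 1 can be sketched in a few lines of Python, treating the oracle as an ordinary callable f. The max-coverage instance below is a made-up example, not from the notes:

```python
def greedy_max(f, universe, k):
    """Greedy maximization of a monotone submodular f subject to |S| <= k.

    f is accessed only as a value oracle; each iteration adds the element
    with the largest marginal gain f(S ∪ {u}) − f(S).
    """
    S = set()
    for _ in range(k):
        candidates = [x for x in universe if x not in S]
        if not candidates:
            break
        u = max(candidates, key=lambda x: f(S | {x}) - f(S))
        S.add(u)
    return S

# Hypothetical coverage instance: pick k = 2 of these three sets.
T = {1: {"a", "b", "c"}, 2: {"c", "d"}, 3: {"d", "e", "f", "g"}}
f = lambda S: len(set().union(*(T[i] for i in S)))

print(greedy_max(f, T.keys(), 2))  # → {1, 3}, covering all 7 elements
```

The algorithm makes O(n) oracle calls per iteration, O(nk) in total, matching the oracle-complexity view of running time described above.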