CPS296.2 Geometric Optimization, March 06, 2007
Lecture 16: Greedy and randomized algorithms
(Guest) Lecturer: Bardia Sadri    Scribe: Albert Yu

In this lecture we will study greedy and randomized algorithms for solving geometric packing and covering problems.

16.1 The Set Cover Problem

Definition 1 Given a set system $(U, \mathcal{S})$ consisting of a universe $U$ of $n$ elements and a collection $\mathcal{S} = \{S_1, \ldots, S_k\}$ of subsets of $U$, and a cost function $c : \mathcal{S} \to \mathbb{Q}^+$, a minimum set cover is a minimum-cost subcollection of $\mathcal{S}$ that covers all elements of $U$.

For simplicity we only look at the uniform case where every set in $\mathcal{S}$ has cost 1. The results can be extended to the non-uniform case.

Definition 2 The frequency of an element of $U$ is the number of sets in $\mathcal{S}$ that contain the element. The frequency of the most frequent element of $U$ is denoted by $f$. The special case of $f = 2$ is also known as the vertex cover problem.

Theorem 1 The MINIMUM SET COVER problem is NP-hard; its decision version is NP-complete.

16.1.1 Greedy Set Cover

The first approximation algorithm that we consider in this lecture is the greedy set cover algorithm.

Algorithm 1 A greedy set cover algorithm
Inputs: $\mathcal{S} = \{S_1, \ldots, S_k\}$, $U$ = a universe
Output: $\mathcal{S}'$ = a set cover of $U$ (not necessarily minimum)
1: $\mathcal{S}' \leftarrow \{\}$
2: while $U \neq \emptyset$
3:     Select $S_j \in \mathcal{S}$ s.t. $S_j$ covers the most elements in $U$
4:     Remove the covered elements from $U$
5:     $\mathcal{S}' \leftarrow \mathcal{S}' \cup \{S_j\}$
6: end while
7: return $\mathcal{S}'$

The greedy set cover algorithm works in iterations. At each iteration, the algorithm picks the set in $\mathcal{S}$ that covers the largest number of elements remaining in $U$ and removes the covered elements from $U$. It stops when $U$ is empty. The subcollection picked so far is then returned as the solution.

Theorem 2 The greedy algorithm is an $H_n$-factor approximation algorithm for the minimum set cover problem, where $H_n = 1 + 1/2 + \ldots + 1/n$.

Proof: We distribute the cost of adding a set $S_i$ evenly among the elements of $U$ that it covers for the first time. Since every set in $\mathcal{S}$ has cost 1, the cost of covering an element $x$ is $1/|T|$, where $T$ is the set of all elements that get covered for the first time along with $x$ (including $x$ itself). Let $c(x)$ denote the cost of covering an element $x$. Note that $\sum_{x \in U} c(x)$ = size of the computed cover.

Let $\omega$ denote the size of an optimal cover for $U$ when the algorithm starts. At the beginning of each iteration, there is a cover of size $\omega$ or less that covers every element that is left in $U$: any subcollection of $\mathcal{S}$ that covers the original $U$ also covers what is left of it.

Assume $k$ elements are left in $U$ at the beginning of an iteration. An important observation is that if there are $\omega$ sets that cover all the remaining $k$ elements, then one of them covers at least $k/\omega$ of what is left. Since we greedily choose the set that removes the most elements, at least $k/\omega$ elements are removed in the current iteration. Thus, $c(x) \le \frac{1}{k/\omega} = \omega/k$ for all elements $x$ which are removed in the current iteration.

Number the elements $x_1, \ldots, x_n$ in the order in which they are covered. At the beginning of the iteration that covers $x_i$, at least $n - i + 1$ elements are left, so $c(x_i) \le \omega/(n - i + 1)$. The total cost of all iterations is therefore

$$\sum_{x \in U} c(x) \le \sum_{i=1}^{n} \frac{\omega}{n - i + 1} = \left(1 + \frac{1}{2} + \ldots + \frac{1}{n}\right) \cdot \omega = H_n \cdot \omega.$$

The greedy algorithm achieves essentially the best possible approximation ratio, as witnessed by the following result.

Theorem 3 The minimum set cover problem has no polynomial-time $(1 - o(1)) \ln(n)$-approximation algorithm unless NP has quasi-polynomial time algorithms, i.e. $NP \subseteq DTIME(n^{\mathrm{polylog}(n)})$.

(A proof of the above theorem can be found in [3].)

16.1.2 Set Systems of Bounded VC-Dimension

Definition 3 Given a set system $(U, \mathcal{S})$ consisting of a universe $U$ of size $n$ and a collection $\mathcal{S}$ of subsets of $U$, a minimum hitting set is a smallest subset $H$ of $U$ that intersects (hits) every $S \in \mathcal{S}$.

Claim 1 MINIMUM SET COVER and MINIMUM HITTING SET are equivalent problems.

Sketch of Proof Let $(U, \mathcal{S})$ be an instance of the set cover problem.
We will construct a dual instance $(U', \mathcal{S}')$ for the hitting set problem in such a way that the feasible solutions of the two instances are in one-to-one correspondence. For every $x \in U$ define $\mathcal{S}_x$ as the collection of all the sets in $\mathcal{S}$ that contain $x$. The set system dual to $(U, \mathcal{S})$ is $(\mathcal{S}, \{\mathcal{S}_x : x \in U\})$. It is not hard to verify that set covers of the former set system correspond to hitting sets of the latter and have the same cardinality. See Figure 16.1.

[Figure 16.1]

Hitting sets are easier to work with since they resemble ε-nets. The difference is that a hitting set must hit every set, whereas an ε-net is only required to hit the sets that are large enough; sufficiently small sets may be missed by an ε-net. To make the notion of an ε-net more flexible we define a weighted version of it.

Definition 4 Suppose $(U, \mathcal{S})$ is a set system and $w : U \to \mathbb{N}$ is a weight function. The weight of a set $S \in \mathcal{S}$ is simply the sum of the weights of its elements. A set $N \subseteq U$ is a weighted ε-net for $(U, \mathcal{S}, w)$ if $N \cap S \neq \emptyset$ whenever $S \in \mathcal{S}$ and $w(S) \ge \varepsilon \cdot w(U)$.

Claim 2 The problem of computing weighted ε-nets for set systems can be reduced to that of computing unweighted ε-nets.

Proof: Given a set system $(U, \mathcal{S})$ and a weight function $w : U \to \mathbb{N}$, we build a set system $(U', \mathcal{S}')$ as follows. $U'$ contains $w(x)$ distinct copies of every element $x \in U$. $\mathcal{S}'$ is made by taking every set $S \in \mathcal{S}$ and replacing each element of $S$ with all its copies. It is not hard to observe that a usual ε-net for $(U', \mathcal{S}')$ leads to a weighted ε-net for $(U, \mathcal{S}, w)$: an element of $U$ is picked for the weighted net if one of its copies is included in the unweighted one.
Importantly, it can be verified that the VC-dimension of $(U', \mathcal{S}')$ is no more than that of $(U, \mathcal{S})$, since no subset of $U'$ that contains two copies of the same element can be shattered by $\mathcal{S}'$.

Algorithm 2 HITTING-SET $(U, \mathcal{S})$
Let $c$ be the size of a smallest hitting set.
1: Initialize $w(x)$ to 1 for all $x \in U$.
2: while true
3:     find a weighted ε-net $N$ for $(U, \mathcal{S}, w)$ with $\varepsilon = \frac{1}{2c}$.
4:     if $N$ is a hitting set for $(U, \mathcal{S})$
5:         return $N$
6:     else
7:         find an arbitrary set $S \in \mathcal{S}$ not hit by $N$ and double the weight of its elements.
8: end while

The idea of the algorithm is that if a set $S$ is not hit by $N$, then its weight is small compared to the weight of $U$, namely $w(S) < \varepsilon \cdot w(U)$. We increase the weight of $S$ so that the algorithm pays more attention to this set in the following iterations.

Let $H^*$ be an optimal hitting set, so $|H^*| = c$. Since $S$ must intersect $H^*$, every iteration doubles the weight of at least one $x \in H^*$. After $k$ iterations, convexity of $t \mapsto 2^t$ gives $w(H^*) \ge c \cdot 2^{k/c}$, while each doubling adds less than $\varepsilon \cdot w(U)$ to the total weight, so $w(U) \le n(1 + \varepsilon)^k$. Since $w(H^*) \le w(U)$ and $(1 + \frac{1}{2c})^k \le e^{k/(2c)} \le 2^{3k/(4c)}$, we get $2^{k/(4c)} \le n/c$, i.e. the number of iterations is $k \le 4c \log_2(n/c)$.

The algorithm requires the value of c to be known in order to assign a
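To make the two algorithms of this lecture concrete, here is a minimal Python sketch of the greedy set cover (Algorithm 1) and of the weight-doubling hitting set scheme (Algorithm 2). It is an illustration under stated assumptions, not the lecture's implementation: the function names are hypothetical, all sets are assumed non-empty, and `heavy_set_net` is a simple stand-in that greedily hits every heavy set. It returns a valid weighted ε-net but carries no size guarantee, unlike a proper ε-net construction for set systems of bounded VC-dimension. The parameter `c` is assumed to be at least the size of a smallest hitting set.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_set_cover(universe: Iterable[Hashable], sets: List[Set]) -> List[int]:
    """Algorithm 1 (uniform cost): return the indices of the chosen sets."""
    uncovered = set(universe)
    cover: List[int] = []
    while uncovered:
        # Pick the set covering the most still-uncovered elements.
        j = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[j] & uncovered:
            raise ValueError("the given sets do not cover the universe")
        cover.append(j)
        uncovered -= sets[j]
    return cover


def heavy_set_net(sets: List[Set], w: Dict[Hashable, int], c: int) -> Set:
    """Stand-in net finder (assumption): hit every set S with
    w(S) >= eps * w(U), where eps = 1 / (2c). No size guarantee."""
    total = sum(w.values())
    # Integer form of w(S) >= total / (2c); avoids floating point.
    unhit = [s for s in sets if 2 * c * sum(w[x] for x in s) >= total]
    net: Set = set()
    while unhit:
        # Greedily add the element contained in the most unhit heavy sets.
        counts: Dict[Hashable, int] = {}
        for s in unhit:
            for x in s:
                counts[x] = counts.get(x, 0) + 1
        best = max(counts, key=lambda x: counts[x])
        net.add(best)
        unhit = [s for s in unhit if best not in s]
    return net


def hitting_set(universe: Iterable[Hashable], sets: List[Set], c: int) -> Set:
    """Algorithm 2: double the weights of a missed set until the net hits all sets.
    Assumes c >= size of a smallest hitting set (needed for the iteration bound)."""
    w = {x: 1 for x in universe}
    while True:
        net = heavy_set_net(sets, w, c)
        missed = next((s for s in sets if not (s & net)), None)
        if missed is None:
            return net
        for x in missed:
            w[x] *= 2
```

For instance, calling `hitting_set(range(1, 11), [{1, 2, 3, 4, 5}, {5, 6, 7, 8, 9}, {10}], c=2)` returns a set that hits all three sets. The singleton {10} is initially too light to be forced into the net and is picked up only after its weight has been doubled twice.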