CS599 (Spring 2009) -- Take-home Final

Due in David Kempe's office by 12:00 noon on Wednesday, 05/13. No late submissions will be granted. As on the midterm, you cannot discuss problems, ideas, or solutions to this final with anyone (in or outside of class) except the instructor. Also, as a reminder, you are not allowed to seek solutions to these problems online or elsewhere. If you are stuck on a problem and need hints, or you need help understanding a problem, you are welcome to e-mail me or talk to me in person.

Problem 1

In class, we used a coupling argument to prove that the random walk on the $d$-dimensional hypercube mixes in time $O(d \log d)$. Reprove that the random walk mixes rapidly using the technique of canonical paths. (The bound you obtain on the time it takes to be nearly mixed will probably be worse than the one we obtained using coupling. So long as it is polynomial in $d$, that is not a problem.)

Problem 2

(a) Consider the random walk on the line $\{1, 2, \ldots, n\}$ with a stalling probability of $1/2$ (i.e., with probability $1/2$, the random walk does not move). Prove that the random walk mixes within $O(n^2)$ steps, in the sense that the total variation distance to the stationary distribution is at most $1/4$ after $O(n^2)$ steps of this walk.

(b) Generalize the result to the $d$-dimensional grid $\{1, 2, \ldots, n\}^d$. Prove that the random walk with stalling probability $1/2$ on the $d$-dimensional grid mixes in $O(n^2 \cdot d \log d)$ steps, in the same sense as above. (You can use the first part of this problem even if you didn't solve it.)

Problem 3

The Bin Packing problem is defined as follows: you are given $n$ items, with the size of item $i$ being $s_i \in (0, 1)$, a fractional number between 0 and 1. You are given a supply of bins, each of size exactly 1. You want to pack all the items into bins and minimize the number of bins used; that is, the objective function is the number of bins used to pack all items. Of course, a bin may be filled only partially, but you cannot overfill any bin.
This problem is known to be NP-complete.

(a) Give and analyze a simple deterministic algorithm using at most $2 \cdot \mathrm{OPT} + 1$ bins, where $\mathrm{OPT}$ is the optimum number of bins to use.

(b) We now improve the approximation guarantee to $(1 + 2/e) \cdot \mathrm{OPT} + 1$ by using LP rounding. We say that an $n$-dimensional 0-1 vector $a$ is feasible if the items $i$ with $a_i = 1$ can all fit into one bin, i.e., $\sum_i a_i s_i \le 1$. Let $a_1, \ldots, a_m$ be all feasible 0-1 vectors. Notice that $m$ will be exponentially large in $n$. We can now write the following IP for Bin Packing:

    Minimize   $\sum_j x_j$
    subject to $\sum_j a_{ji} x_j \ge 1$ for all $i$,
               $x_j \in \{0, 1\}$ for all $j$.

Of course, we can turn this into an LP in the usual way. The meaning of the variable $x_j$ is that if $x_j = 1$, then the set of items with $a_{ji} = 1$ is put into a bin together.

While this LP has exponential size, and is thus not obviously solvable, you get to assume that you have a fractional optimum solution $x_j$. In fact, you get to assume that the solution has only polynomially many non-zero entries $x_j$, and you are given this list of non-zero entries explicitly. So you don't have to deal with exponentially many variables any more.

Show how to use this solution to obtain a packing using at most $(1 + 2/e) \cdot \mathrm{OPT} + 1$ bins. (Hint: you may want to use your solution to part (a).)

Problem 4

Here, we prove that random graphs tend to be good edge expanders. Recall that the definition of the edge expansion of a graph $G$ is

    $\alpha(G) := \min_{S \subseteq V, |S| \le n/2} \frac{|e(S, \bar{S})|}{|S|}$,

i.e., the smallest ratio of the number of edges leaving $S$ to the size of $S$. Thus, graphs with large expansion don't have bottlenecks separating large parts of the graph, whereas graphs with small expansion do.

Suppose that we generate a random graph as follows: each node $v$ gets $c \log n$ outgoing edges, and the other endpoints of those edges are chosen independently and uniformly at random. For simplicity, we allow parallel edges and self-loops. Thus, each node just chooses $c \log n$ edge endpoints and creates edges to all of them.
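As an aside, the random graph model just described, together with the cut quantity $|e(S, \bar{S})|$ from the expansion definition, can be sketched in a few lines of Python. This is only an illustration for experimenting with the model; the function names are mine, not part of the course:

```python
import math
import random

def random_outdegree_graph(n, c, seed=None):
    """Each of the n nodes creates ceil(c * log n) edges, each to an endpoint
    chosen independently and uniformly at random. Parallel edges and
    self-loops are allowed, matching the model in Problem 4."""
    rng = random.Random(seed)
    d = math.ceil(c * math.log(n))  # out-degree of every node
    edges = []
    for v in range(n):
        for _ in range(d):
            edges.append((v, rng.randrange(n)))  # uniform endpoint; may be v itself
    return edges

def cut_size(edges, S):
    """|e(S, S-bar)|: the number of edges with exactly one endpoint in S."""
    return sum((u in S) != (w in S) for (u, w) in edges)
```

Note that computing $\alpha(G)$ exactly would require minimizing `cut_size` over exponentially many sets $S$; the code only evaluates the cut for a given $S$.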
Prove that for some constant $c$ of your choice (but independent of $n$), the resulting graph has expansion at least $1/2$, with high probability (say, at least $1 - 1/n$).

(This actually holds not only for degree $c \log n$, but whenever every node has degree at least 3. For degree 3, it is somewhat tricky to prove; for degree 6, it is a bit easier, but still a little cumbersome. You only need to prove it for $c \log n$ here.)

Problem 5

Sometimes, randomized algorithms are really good at saving space. Imagine the following scenario: you have a sequence of $m$ items, each from a universe $U$ of size $n$. Items may repeat, possibly many times, and you want to know how even or uneven the distribution is. Specifically, if item $i$ appears a total of $x_i$ times in the sequence, you want to know the variance $\sum_i (x_i - m/n)^2 = \sum_i x_i^2 - m^2/n$. Since $m$ and $n$ are easy to keep track of, this is equivalent to calculating $\sum_i x_i^2$.

Now, this would be a very easy problem if you could simply count how many times you have seen each element $i$. Unfortunately, you don't have enough space for $n$ separate counters; you only want to keep track of one number instead. One way to accomplish this, surprisingly, is as follows: for each $i$, generate independently a number $\xi_i$, which is $\pm 1$ with probability $1/2$ each. Start out with a counter $Y = 0$. Now, whenever you see another occurrence of element $i$, add $\xi_i$ to $Y$; i.e., increment or decrement $Y$ according to the element $i$.

(a) Prove that $E[Y^2] = \sum_i x_i^2$.

(b) The previous part shows that, in expectation, this method lets you calculate $\sum_i x_i^2$. Unfortunately, the variance of $Y^2$ is quite high. To do better, we can run $k$ independent copies of this algorithm in parallel, with independent random choices $\xi_{i,j}$. Call the resulting counters $Y_1, \ldots, Y_k$. Then, we instead keep track of $Z := \frac{1}{k} \sum_{j=1}^{k} Y_j^2$. Bound the probability that $Z$ deviates from $\sum_i x_i^2$ by more than a $(1 \pm \delta)$ factor, in terms of $\delta$ and $k$.
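To make the counter scheme of Problem 5 concrete, here is a minimal Python sketch of the update rule and of the $k$-copy average $Z$. The class and function names are mine; also note that the sketch stores the signs $\xi_i$ in a dictionary purely for readability, which of course defeats the space saving (in a genuine low-space implementation the signs would be produced by a hash function):

```python
import random

class SecondMomentSketch:
    """A single counter Y: each element i gets a random sign xi_i in {-1, +1},
    and every occurrence of i adds xi_i to Y. Part (a) asserts E[Y^2] = sum_i x_i^2."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.signs = {}  # lazily drawn xi_i (a hash function in a real implementation)
        self.Y = 0

    def update(self, item):
        if item not in self.signs:
            self.signs[item] = self.rng.choice((-1, 1))
        self.Y += self.signs[item]

def estimate_second_moment(stream, k, seed=0):
    """Part (b): run k independent copies and return Z = (1/k) * sum_j Y_j^2."""
    sketches = [SecondMomentSketch(seed + j) for j in range(k)]
    for item in stream:
        for s in sketches:
            s.update(item)
    return sum(s.Y ** 2 for s in sketches) / k
```

For a stream containing a single distinct element repeated $x$ times, every $Y_j$ equals $\pm x$, so $Z = x^2$ exactly; the interesting behavior, which part (b) quantifies, appears once several elements' signs can cancel.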