MIT 6 856J - Assignment 1 - D1887381

School name Massachusetts Institute of Technology

Course 6.856J - Randomized Algorithms

6.856: Randomized Algorithms
David Karger
Handout #24, December 5th, 2002: Homework 11 Solutions

Problem 1

(a) Let $D$ be the disjoint union, and $N := |D|$. We denote by $(a, x) \in D$ a particular assignment/clause pair in the disjoint union. That is, assignment $a$ satisfies clause $x$ and is the "clause $x$" copy of assignment $a$ in the disjoint union. Write $c_a$ for the coverage of $a$, that is, the number of clauses $a$ satisfies, so that $a$ appears in $D$ exactly $c_a$ times, and let $X_t = 1/c_a$ be the value returned by trial $t$ when it samples the pair $(a, x)$. Note that our random sample chooses any pair $(a, x) \in D$ with probability $1/N$. It follows that

$$N \cdot E[X_t] = N \cdot \sum_{(a,x) \in D} \frac{1}{N} \cdot \frac{1}{c_a} = \sum_a c_a \cdot \frac{1}{c_a} = \sum_a 1,$$

where the last two sums range over the satisfying assignments $a$. This is just the number of satisfying assignments, as claimed.

(b) We need to argue that we can estimate $E[X_t]$ to within $(1 \pm \varepsilon)$ using the desired number of trials. Observe that $X_t$ is a random variable whose value is bounded in the range $[0, 1]$. Thus the generalized Chernoff bound of Homework 2, Problem 6 applies to sums of independent samples of $X_t$. Note also that $E[X_t] \ge 1/m$, since $c_a \le m$ in all cases. It follows that the sum of $m\mu_{\varepsilon,\delta}$ samples has expected value at least $\mu_{\varepsilon,\delta}$, which means (by the definition of $\mu_{\varepsilon,\delta}$) that the probability that this sum deviates from its mean by more than a factor of $(1 \pm \varepsilon)$ is at most $\delta$.

(c) To estimate $c_a$, we will choose random clauses and check whether $a$ satisfies them. Note that the probability that $a$ satisfies a randomly chosen clause is just $c_a/m$. So we can estimate $c_a$ by estimating the probability that $a$ satisfies a random clause. We will refer to these random choices of clauses as "sub-trials" and reserve the word "trial" for choosing a random assignment and then carrying out a series of sub-trials to estimate its coverage.

We will use a $\delta'$ that we set later. We keep choosing clauses until we see $\mu_{\varepsilon,\delta'}$ clauses that $a$ satisfies. We expect to have to choose $(m/c_a)\mu_{\varepsilon,\delta'}$ clauses to do this. Once we have seen this many satisfied clauses, we get an estimate $c$ for $c_a$ that is accurate to within $(1 \pm \varepsilon)$ with probability $1 - \delta'$. It follows that $1/c = (1 \pm 2\varepsilon)/c_a$, so we have an $O(\varepsilon)$-accurate approximation to $1/c_a$ with probability $1 - \delta'$.
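The sub-trial procedure just described can be sketched in Python. This is only an illustration under stated assumptions, not code from the handout: a clause is represented as a dict mapping a variable index to the truth value it requires, and `target_hits` is a hypothetical parameter standing in for $\mu_{\varepsilon,\delta'}$.

```python
import random

def satisfies(assignment, clause):
    """True if the assignment agrees with every literal of the DNF clause.

    Assumed representation: clause is a dict {variable index: required value}.
    """
    return all(assignment[v] == val for v, val in clause.items())

def estimate_coverage(assignment, clauses, target_hits, rng):
    """Estimate c_a, the number of clauses the assignment satisfies.

    Samples clauses uniformly at random (the "sub-trials") until
    `target_hits` of them are satisfied. `target_hits` plays the role of
    mu_{eps,delta'} and is an illustrative parameter.
    The assignment must satisfy at least one clause, which holds for any
    assignment drawn from the disjoint union; otherwise the loop never ends.
    """
    m = len(clauses)
    hits = 0
    samples = 0
    while hits < target_hits:
        samples += 1
        if satisfies(assignment, rng.choice(clauses)):
            hits += 1
    # hits / samples estimates c_a / m, so scale by m.
    return m * hits / samples
```

A call such as `estimate_coverage(a, clauses, 2000, random.Random(0))` returns an estimate `c` of the coverage; a trial would then report `1 / c` as its estimate of $1/c_a$. The expected number of loop iterations is `target_hits * m / c_a`, matching the $(m/c_a)\mu_{\varepsilon,\delta'}$ count above.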
If each of our estimates of $1/c_a$ is accurate to within a factor of $1 + O(\varepsilon)$, then our sum of estimates approximates the sum of the $1/c_a$ values to within $1 + O(\varepsilon)$, which in turn approximates the mean $E[X_t]$ to within $1 + O(\varepsilon)$. The errors in these approximations multiply, so the accuracy of our approximation is $(1 + O(\varepsilon))^{O(1)}$, which is just $1 + O(\varepsilon)$. Of course, we asked for an $\varepsilon$-approximation. To get an $\varepsilon$-accurate estimate instead of an $O(\varepsilon)$-accurate one, we just start with an $\varepsilon$ that is smaller by a constant factor.

Now we can set $\delta'$. We want our estimates of $1/c_a$ to be accurate in all the trials we perform. Since we carry out $O(m\mu_{\varepsilon,\delta})$ trials, to make the union bound work we set $\delta' = \delta/(m\mu_{\varepsilon,\delta})$. This means the probability that any trial hits a bad estimate from its sub-trials is at most $\delta$, up to the constant hidden in the $O(\cdot)$.

One subtlety in this problem was realizing that if one approximates $c \approx c_a$ to within an $\varepsilon$ bound, then $1/c$ is also a legitimate $2\varepsilon$-approximation of $1/c_a$. This is not equivalent to asserting (incorrectly) that the inverse of the expectation of a variable is the same as the expectation of its inverse, because we are not in fact dealing with an expectation. We are simply approximating the value of a quantity, and then inverting that value to obtain an approximation of its inverse.

(d) Let's compute the expected running time of a trial, measured in terms of the number of clauses against which we test an assignment. As just discussed, if the result returned by the trial is $1/c_a$, then the number of clauses we need to sample is $O((m/c_a)\mu_{\varepsilon,\delta})$. It follows that the expected number of clauses we need to sample in one trial is

$$O\left(E\left[(m/c_a)\mu_{\varepsilon,\delta}\right]\right) = O(m\mu_{\varepsilon,\delta}) \cdot E[1/c_a] = O(m\mu_{\varepsilon,\delta} E[X_t]).$$

On the other hand, recall that the goal of these trials is to estimate $E[X_t]$, which means that we need to carry out about $\mu_{\varepsilon,\delta}/E[X_t]$ trials to do so. It follows that the total work done is the product of these two quantities.
This product cancels the quantity $E[X_t]$, leaving us with a total time of $O(m\mu_{\varepsilon,\delta}^2)$, which is essentially linear in the formula size.

We can improve the $\mu^2$ term to $\mu$ with a longer analysis. Rather than waiting for $\mu_{\varepsilon,\delta'}$ sub-trials to yield satisfied clauses within a trial, we stop as soon as we see one satisfied clause and, if we made $s$ samples, we take $s/m$ as our estimate of $1/c_a$. The number of sub-trials $s$ then follows a negative binomial distribution (with a single success, that is, a geometric distribution) with mean $m/c_a$, so $s/m$ has mean $1/c_a$, the right value. So we can use a Chernoff bound on sums of negative binomial distributions (with different means) to argue that we get an accurate estimate.

Recall that this bound is the number of clause vs. assignment tests we carry out in the algorithm. The actual running time needs to take into account the number of variables we need to test in a clause, which depends on the formula size.

(e) (optional) If some clauses in a DNF formula are much larger than others, then the probability that they become true is so small compared to the probability that other clauses become true that we can discard those clauses from the formula without affecting the truth probability significantly. Once all clauses are the same size, our argument that we test roughly $m$ clauses means that we test a number of variables roughly equal to the total size of the formula.

Problem 2

1. Let $c_A$ be the coverage of assignment $A$, $h$ the number of satisfying assignments in the disjoint union, and $k$ the number of satisfying assignments. The probability that the algorithm outputs a value in a given attempt is

$$\sum_A \frac{1}{c_A} \cdot \frac{c_A}{h} = \frac{k}{h}.$$

Notice that the number of iterations before an assignment is output is geometrically distributed with parameter $p = k/h$. The expected time before a "success" is $1/p$. But we argued previously that $k/h \ge 1/m$. Thus the expected time to output a value is $O(m)$.

2.
The probability that an assignment $A$ is output in a given trial is

$$\Pr[A \text{ output} \mid A \text{ picked}] \cdot \Pr[A \text{ picked}] = \frac{1}{c_A} \cdot \frac{c_A}{h} = \frac{1}{h}.$$

It follows that the probability that assignment $A$ is output in a trial, given that there was an output in that trial, is (by the definition of conditional probability)

$$\frac{\Pr[\text{stop and output } A]}{\Pr[\text{stop}]} = \frac{1/h}{k/h} = \frac{1}{k}.$$

3. When we choose a random assignment $A$ (which takes $O(m)$ time) we can estimate $c_A$ in $O(m\mu_{\varepsilon,\delta}/c_A)$ time as shown in problem 4(c). With probability $1 - \delta$ that estimate is within $(1 \pm \varepsilon)$ of the correct value.
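The rejection scheme analyzed in items 1 and 2 can be sketched in Python. This is an illustration under assumptions that are not in the handout: clauses are dicts mapping a variable index to its required truth value, the coverage $c_A$ is computed exactly rather than estimated, and a pair is drawn from the disjoint union by the usual device of picking a clause with probability proportional to its number of satisfying assignments and then filling the free variables with fair coin flips. The names `sample_pair` and `sample_satisfying` are hypothetical.

```python
import random

def satisfies(assignment, clause):
    """True if the assignment agrees with every literal of the DNF clause."""
    return all(assignment[v] == val for v, val in clause.items())

def sample_pair(clauses, num_vars, rng):
    """Draw (assignment, clause index) uniformly from the disjoint union.

    Clause i has 2^(num_vars - len(clause i)) satisfying assignments, so
    choosing i with that weight and then setting the free variables
    uniformly gives each pair probability 1/h.
    """
    weights = [2 ** (num_vars - len(c)) for c in clauses]
    i = rng.choices(range(len(clauses)), weights=weights)[0]
    a = [rng.random() < 0.5 for _ in range(num_vars)]
    for v, val in clauses[i].items():
        a[v] = val  # force the literals of the chosen clause
    return tuple(a), i

def sample_satisfying(clauses, num_vars, rng):
    """Return (assignment, attempts); the assignment is uniform over the
    satisfying assignments of the DNF formula."""
    attempts = 0
    while True:
        attempts += 1
        a, _ = sample_pair(clauses, num_vars, rng)
        c = sum(satisfies(a, cl) for cl in clauses)  # exact coverage c_A >= 1
        if rng.random() < 1.0 / c:  # output A with probability 1/c_A
            return a, attempts
```

Calling `sample_satisfying(clauses, num_vars, random.Random(0))` repeatedly yields assignments whose empirical frequencies should be roughly equal, and the average of `attempts` should be close to $h/k$, in line with the geometric-distribution argument in item 1. Replacing the exact coverage with an estimate, as item 3 suggests, would make the output distribution only approximately uniform.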
