CS294 Markov Chain Monte Carlo: Foundations & Applications                    Fall 2009

Lecture 7: September 29

Lecturer: Alistair Sinclair                                    Scribes: Alistair Sinclair

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.

In today's lecture we continue the discussion of path coupling and see how it can be applied to bound the mixing time of a Markov chain defined on the space of all linear extensions of a partial order. Towards the end of the lecture we will introduce yet another type of coupling, called monotone coupling.

7.1 Mixing time using path coupling

Suppose we have a pre-metric d on the state space Ω of an ergodic Markov chain. If we can define a coupling (X, Y) ↦ (X′, Y′) for pairs of adjacent (in the pre-metric) states (X, Y) such that

    E[d(X′, Y′) | X, Y] ≤ (1 − α) d(X, Y)                                    (7.1)

for some 0 ≤ α ≤ 1, then this coupling can be extended to all pairs (X, Y), which then also satisfy (7.1). Moreover, if α > 0 and d is integer-valued, then τ_mix = O(α⁻¹ log D), where D = max_{x,y} d(x, y) is the maximum distance between any pair of states under the metric extension of d. The proof follows along the same lines as the final step in our analysis of path coupling for the colorings Markov chain in the previous lecture.

Now consider the case in which (7.1) holds only with α = 0. Then one cannot expect the above result to hold, since the distance no longer decays geometrically. Still, one can prove by standard martingale arguments (exercise!) that, when d is integer-valued,

    τ_mix = O(β⁻¹ D²),

where β = min_{X,Y ∈ Ω} E[(d(X′, Y′) − d(X, Y))²] is the variance of the change in d.
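For intuition, here is one way the martingale argument can be sketched (our own sketch, under the simplifying assumption that (7.1) holds with equality, so that d(X_t, Y_t) is a martingale; handling the inequality case is part of the exercise):

```latex
% Sketch of the $O(\beta^{-1} D^2)$ bound, assuming $Z_t := d(X_t, Y_t)$
% is a martingale. Let $T = \min\{t : Z_t = 0\}$ be the coalescence time.
While $Z_t > 0$, each step has squared increment at least $\beta$ in expectation:
\[
  \mathbb{E}[Z_{t+1}^2 \mid Z_t]
    = Z_t^2 + \mathbb{E}\!\left[(Z_{t+1} - Z_t)^2 \mid Z_t\right]
    \ge Z_t^2 + \beta ,
\]
so $M_t := Z_{t \wedge T}^2 - \beta (t \wedge T)$ is a submartingale. Optional
stopping gives $\beta\, \mathbb{E}[T] \le \mathbb{E}[Z_T^2] \le D^2$, i.e.
$\mathbb{E}[T] \le D^2/\beta$. Markov's inequality then yields
\[
  \Pr[X_t \ne Y_t] = \Pr[T > t] \le \frac{D^2}{\beta t} \le \frac{1}{4}
  \quad \text{for } t \ge 4 D^2/\beta ,
\]
whence $\tau_{\mathrm{mix}} = O(\beta^{-1} D^2)$.
```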
(Note that, since d is integer-valued, a crude bound is β ≥ min_{X,Y ∈ Ω} P[|d(X′, Y′) − d(X, Y)| ≥ 1].)

Caution: In the definition of β, it is not sufficient to take the minimum over adjacent pairs (X, Y) (exercise).

7.2 Linear Extensions of a Partial Order

In this section we illustrate path coupling in the context of sampling a linear extension of a partial order uniformly at random.

Input: A partial order ⪯ on V = {1, 2, . . . , n}.

[Footnote 1: A partial order (V, ⪯) is a binary relation ⪯ over a set V which is reflexive, antisymmetric, and transitive, i.e., for all a, b, c ∈ V we have (i) a ⪯ a; (ii) a ⪯ b and b ⪯ a ⟹ a = b; (iii) a ⪯ b and b ⪯ c ⟹ a ⪯ c.]

A linear extension of ⪯ is a total order ⊑ on V which respects ⪯, i.e., for all x, y ∈ V, x ⪯ y implies x ⊑ y. Hence a linear extension of ⪯ can be written as a permutation σ = (σ(1), σ(2), . . . , σ(n)) of {1, 2, . . . , n}, listing the elements of V in increasing ⊑-order, so that σ⁻¹(x) ≤ σ⁻¹(y) whenever x ⪯ y.

Note that, given a partial order ⪯ on V = {1, 2, . . . , n}, one can easily construct a linear extension of ⪯ (exercise). For example, in Figure 7.1, the right-hand picture represents a linear extension of the partial order shown on the left.

Figure 7.1: A partial order and one of its linear extensions

Goal: Sample a linear extension of ⪯ uniformly at random.

Let Ω = Ω(⪯) denote the set of all linear extensions of ⪯. Being able to sample from Ω has a variety of applications in combinatorics, near-optimal sorting and decision theory. Brightwell and Winkler [BW91] showed that counting linear extensions is #P-complete. But if we have an efficient algorithm for sampling (almost) uniformly from Ω, then it can be used recursively, as described in Lecture 1, to obtain an efficient approximation algorithm for |Ω|.

As usual, we propose to sample from Ω by constructing an ergodic Markov chain on state space Ω whose stationary distribution is uniform. Define a Markov chain on Ω as follows.

Markov Chain:

1. With probability 1/2 do nothing.

2.
Else pick a position p ∈ {1, 2, . . . , n − 1} uniformly at random.

3. Exchange the elements at positions p and p + 1 if "legal", i.e., if the resulting total order is still a linear extension of ⪯.

It is easy to verify that this Markov chain is symmetric and aperiodic. Irreducibility of the chain follows from the following exercise.

Exercise: Prove that it is always possible to reach one linear extension from another using at most (n choose 2) = n(n − 1)/2 "legal" exchanges of consecutive elements.

Hence the above chain is ergodic and converges to the uniform distribution π on Ω. To apply path coupling, we first need to define a pre-metric on the state space Ω.

Pre-metric: Two states X and Y in Ω are adjacent iff they differ at exactly two positions, say i and j, 1 ≤ i < j ≤ n. We denote this by X = Y ∘ (i, j). In this case, the distance d(X, Y) between X and Y is defined to be j − i (see Figure 7.2).

Figure 7.2: Two adjacent states

Exercise: Check that the above is indeed a pre-metric.

This pre-metric extends to a metric on Ω, denoted again by d. Note that this metric may be rather complicated to describe, but the power of path coupling lies in the fact that we never need to do so. Next we define the coupling for adjacent states.

7.2.1 Coupling for adjacent pairs

Let (X, Y) be a pair of adjacent states, and let i and j be the positions where they differ, 1 ≤ i < j ≤ n. The transition (X, Y) ↦ (X′, Y′) is defined as follows.

Case 1: If j ≠ i + 1,

• With probability 1/2 do nothing in both chains.

• Else choose the same p from {1, 2, . . . , n − 1} uniformly at random in both chains and exchange the elements at positions p and p + 1 if "legal".

Case 2: If j = i + 1,

• With probability 1/(2(n − 1)), do nothing in X; in Y, pick p = i and exchange the elements at positions i and j.
(Note that the elements at positions i and j are incomparable in ⪯, and hence this exchange is always legal.)

• With probability 1/(2(n − 1)), pick p = i in X and make a similar exchange, and do nothing in Y.

• With probability (n − 2)/(2(n − 1)), do nothing in both chains.

• Else choose the same p from {1, 2, . . . , n − 1} \ {i} uniformly at random in both chains and exchange the elements at positions p and p + 1 if "legal".

(Note that X′ and Y′ may no longer be adjacent.)
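The chain itself is easy to implement exactly. The sketch below (our own; the tiny poset in the example is a hypothetical illustration, not the one in Figure 7.1) represents the partial order as a transitively closed set of pairs (a, b) meaning a ⪯ b, enumerates Ω by brute force, and computes the exact one-step transition distribution from a state, so that symmetry — and hence uniformity of the stationary distribution — can be checked directly:

```python
import itertools
from fractions import Fraction

def linear_extensions(n, order):
    """Enumerate all linear extensions of the poset on {1..n} as tuples
    listing the elements in order; `order` is a set of pairs (a, b) with
    a ⪯ b, assumed transitively closed."""
    exts = []
    for perm in itertools.permutations(range(1, n + 1)):
        pos = {v: k for k, v in enumerate(perm)}
        if all(pos[a] <= pos[b] for (a, b) in order):
            exts.append(perm)
    return exts

def step_distribution(sigma, order):
    """Exact one-step transition distribution of the lazy chain from sigma:
    hold with prob. 1/2; else pick p uniformly and swap positions p, p+1
    if "legal", i.e. iff the two elements are incomparable."""
    n = len(sigma)
    dist = {sigma: Fraction(1, 2)}               # lazy holding probability
    for p in range(n - 1):                       # positions are 0-indexed
        a, b = sigma[p], sigma[p + 1]
        if (a, b) in order or (b, a) in order:   # comparable: swap illegal
            tau = sigma
        else:
            tau = sigma[:p] + (b, a) + sigma[p + 2:]
        dist[tau] = dist.get(tau, Fraction(0)) + Fraction(1, 2 * (n - 1))
    return dist

# Hypothetical example: transitive closure of 1 ⪯ 2, 1 ⪯ 3, 2 ⪯ 4.
order = {(1, 2), (1, 3), (2, 4), (1, 4)}
exts = linear_extensions(4, order)               # 3 extensions
```

Because every row of this kernel is a probability distribution and the kernel is symmetric, it is doubly stochastic, which is exactly why the stationary distribution is uniform on Ω.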
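The Case 2 coupling (j = i + 1) can likewise be written out with exact probabilities, which makes it easy to verify that it is a valid coupling — each marginal evolves exactly as the original chain — and that an adjacent pair at distance 1 coalesces with probability exactly 1/(n − 1) in one step. This is our own sketch; the poset and the pair (X, Y) are hypothetical illustrations, and positions are 0-indexed (so the pair below differs by the transposition at positions i, i + 1 with i = 1):

```python
from fractions import Fraction

def legal_swap(sigma, p, order):
    """A swap at positions p, p+1 is legal iff the elements are incomparable."""
    a, b = sigma[p], sigma[p + 1]
    return (a, b) not in order and (b, a) not in order

def move(sigma, p, order):
    """Exchange the elements at positions p, p+1 if legal, else stay put."""
    if legal_swap(sigma, p, order):
        return sigma[:p] + (sigma[p + 1], sigma[p]) + sigma[p + 2:]
    return sigma

def coupled_moves(X, Y, i, order):
    """Case 2 coupling for adjacent X, Y differing by the transposition at
    positions i, i+1. Returns a list of ((X', Y'), probability) atoms."""
    n = len(X)
    q = Fraction(1, 2 * (n - 1))
    atoms = [
        ((X, move(Y, i, order)), q),                 # hold X, swap in Y
        ((move(X, i, order), Y), q),                 # swap in X, hold Y
        ((X, Y), Fraction(n - 2, 2 * (n - 1))),      # hold both
    ]
    for p in range(n - 1):                           # same p in both chains
        if p != i:
            atoms.append(((move(X, p, order), move(Y, p, order)), q))
    return atoms

def marginal(atoms, idx):
    """Marginal law of one coordinate (idx 0 for X', 1 for Y')."""
    m = {}
    for pair, prob in atoms:
        m[pair[idx]] = m.get(pair[idx], Fraction(0)) + prob
    return m

# Hypothetical example: same tiny poset; X and Y differ at positions 1, 2.
order = {(1, 2), (1, 3), (2, 4), (1, 4)}
X, Y = (1, 2, 3, 4), (1, 3, 2, 4)
atoms = coupled_moves(X, Y, 1, order)
```

Running this example also exhibits the phenomenon in the note above: the atom with p = i + 1 produces a pair ((1, 2, 4, 3), (1, 3, 2, 4)) that is no longer adjacent.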