CS294 Markov Chain Monte Carlo: Foundations & Applications                                  Fall 2009

Lecture 1: August 27
Lecturer: Prof. Alistair Sinclair                                        Scribes: Alistair Sinclair

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications. They may be distributed outside this class only with the permission of the Instructor.

1.1 The Markov Chain Monte Carlo Paradigm

Assume that we have a very large but finite set Ω and a positive weight function w : Ω → ℝ⁺. Our goal is to sample x ∈ Ω with probability π(x) = w(x)/Z, where the normalizing factor Z = Σ_{x∈Ω} w(x), often called the "partition function", is usually unknown. (Indeed, in many applications our ultimate goal will be to estimate Z.)

Markov Chain Monte Carlo constructs a Markov chain (X_t) on Ω that converges to π, i.e., Pr[X_t = y | X_0 = x] → π(y) as t → ∞, independent of x. We then obtain a sampling algorithm by simulating the Markov chain, starting from an arbitrary state X_0, for sufficiently many steps and outputting the final state X_t. It is usually not hard to set up a Markov chain that converges to the desired stationary distribution; the key question is how many steps are "sufficiently many", or equivalently, how many steps are needed for the chain to get "close to" π. This is known as the "mixing time" of the chain. Obviously, the mixing time determines the efficiency of the sampling algorithm.

In the remainder of this introductory lecture, we provide motivation for MCMC by sketching a number of application areas in which random sampling from large, complex combinatorial sets arises. We focus on applications for which rigorously justified algorithms can be achieved; for a more practically oriented treatment, see the excellent book by Jun Liu [L02].
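To make the paradigm concrete, the following is a minimal sketch (not taken from the notes) of a Metropolis-type chain whose stationary distribution is π(x) ∝ w(x). The function name mcmc_sample, the state space, the weight function and the nearest-neighbour proposal are placeholder assumptions chosen for illustration; the step count T stands in for the mixing time, which is exactly the quantity these notes go on to study.

```python
import random

def mcmc_sample(states, w, neighbours, T, x0=None):
    """Simulate a Metropolis chain with stationary distribution
    pi(x) = w(x) / Z, assuming the neighbour proposal is symmetric
    (Pr[propose y from x] = Pr[propose x from y]), as in the toy
    example below.  Z is never computed: the acceptance rule only
    uses the ratio w(y) / w(x)."""
    x = x0 if x0 is not None else random.choice(states)
    for _ in range(T):                        # T should exceed the mixing time
        y = random.choice(neighbours(x))      # propose a random neighbour
        # Accept the move with probability min(1, w(y)/w(x)); otherwise stay.
        if random.random() < min(1.0, w(y) / w(x)):
            x = y
    return x                                  # approximately distributed as pi

# Toy example: sample an integer from {0,...,9} with weight w(x) = x + 1.
if __name__ == "__main__":
    states = list(range(10))
    w = lambda x: x + 1
    neighbours = lambda x: [max(x - 1, 0), min(x + 1, 9)]
    print(mcmc_sample(states, w, neighbours, T=10_000))
```

Because the proposal is symmetric, the Metropolis acceptance rule makes the chain reversible with respect to π, and the unknown partition function Z is never needed.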
1.2 Applications

1.2.1 Combinatorics

Applications in combinatorics include:

• Examining typical members of a combinatorial set, which can be used, e.g., for formulating and testing conjectures.
For example, by sampling random 3-regular graphs on n vertices, we might formulate the conjecture that they are (almost) all Hamiltonian; this conjecture is actually now a theorem.

• Generating test data for algorithms.
Often, testing an algorithm on completely random inputs (such as arbitrary random graphs) is uninformative. MCMC can be used to generate inputs from a more complex class (such as sparse connected graphs), which can form the basis of more convincing tests of the algorithm.

• Probabilistic constructions.
The existence of certain objects (such as networks with desired connectivity properties) can be proven by the probabilistic method, but in many cases the probabilistic construction required is quite complex and it is not obvious how to realize it algorithmically. For example, recent constructions of efficient Low Density Parity Check codes require a random bipartite graph with specified degrees for each vertex. It is not known how to generate such graphs directly, but they can be generated quite efficiently by MCMC. A similar example is provided by certain models of the WWW, which are also based on random graphs with specified vertex degrees (sometimes with additional properties).

• Approximate counting.
A much less obvious and more far-reaching combinatorial application is to count the number of elements of the set Ω, which might be (e.g.) the set of cliques in a graph G, or the set of satisfying assignments of a boolean formula φ. Almost all such counting problems are #P-complete (the analog, for counting problems, of NP-completeness); however, in many cases MCMC provides an efficient randomized approximation algorithm.

The general technique for reducing (approximate) counting to random sampling can be explained in the following folksy scenario for counting the number of people in a large crowd Ω:

1. Partition the crowd Ω into two parts, B and B̄ = Ω − B, according to some property (such as "having black hair").
2. Estimate the proportion |B|/|Ω| of people with black hair by taking a small uniform sample from the crowd and letting p̂ be the proportion of the sample that have black hair.
3. Recursively estimate the size of B by applying the same technique to this smaller crowd, but using some other property (such as "being male"). Let N̂_B be this recursive estimate.
4. Output the final estimate N̂ = N̂_B · (1/p̂).

Notice that the choice of properties at each level of the recursion is essentially arbitrary; in particular, we do not require them to be independent. The only thing we require is that the proportion of people in the remaining crowd who have the property is bounded away from 0 and 1: this ensures that (i) the number of levels of recursion (until we get down to a crowd of size 1) is small; and (ii) the sample size needed to get a good estimate p̂ at each level is small.

To apply this technique in a more mathematical context, let Ω be the set of all cliques of a given graph G. (This is a very large and complex set, whose size is in general exponential in the size of G.) We can partition Ω into those cliques that contain some vertex v and those that do not. The first of these sets is in one-to-one correspondence with the set of all cliques in the graph G_v (the subgraph of G induced by the neighbors of v): adding v to a clique of G_v yields a clique of G containing v, and conversely. The second set is exactly the set of all cliques in the graph G − v (the subgraph of G with v removed). So both subsets correspond to instances of the same counting problem applied to smaller graphs; hence they can be estimated recursively, as required by the method. (A code sketch of this recursion appears at the end of these notes.)

1.2.2 Volume and Integration

Given as input a convex body K in ℝⁿ, the goal is to estimate the volume of K. While in low dimensions exact methods may be used, they become computationally intractable in higher dimensions. (In fact, the problem is #P-complete when the dimension n is treated as part of the input.)

The volume can be estimated by the following reduction to random sampling. Construct concentric balls B_0 ⊂ B_1 ⊂ B_2 ⊂ · · · ⊂ B_r such that B_0 ⊂ K ⊂ B_r. By a non-trivial geometric result, it can be assumed w.l.o.g. that B_0 is the unit ball and that B_r has radius O(n log n), where n is the dimension. (This can be achieved by simple transformations of K.) The construction of the balls is illustrated in Figure 1.1.

[Figure 1.1: Estimating the volume of a convex set by choosing a sequence of increasing balls and computing the ratios Vol(K ∩ B_i) / Vol(K ∩ B_{i−1}).]

Then we have

Vol(K) = (Vol(K ∩ B_r) / Vol(K ∩ B_{r−1})) × (Vol(K ∩ B_{r−1}) / Vol(K ∩ B_{r−2})) × · · · × (Vol(K ∩ B_1) / Vol(K ∩ B_0)) × Vol(K ∩ B_0),

where Vol(K ∩ B_r) = Vol(K) since K ⊂ B_r, and Vol(K ∩ B_0) = Vol(B_0) is known since B_0 ⊂ K. Each ratio in the product can then be estimated by sampling (approximately) uniformly from K ∩ B_i and recording the fraction of samples that also land in B_{i−1}.
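The telescoping-product estimator above can be illustrated in code. The sketch below is an illustration under simplifying assumptions rather than the algorithm from the notes: the body K is a placeholder (an axis-aligned box, so membership and the true volume are easy to check), the radii schedule and sample sizes are arbitrary, and each ratio is estimated by naive rejection sampling from the ball B_i, which is exactly the step that fails in high dimension and that the real algorithm replaces with an MCMC sampler.

```python
import math
import random

def random_point_in_ball(n, radius):
    """Uniform random point in the n-dimensional ball of the given radius."""
    g = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    r = radius * random.random() ** (1.0 / n)   # radius density proportional to r^(n-1)
    return [r * x / norm for x in g]

def estimate_volume(in_K, n, R, levels=8, samples=5000):
    """Telescoping-product estimator of Section 1.2.2.  in_K is a membership
    oracle for the convex body K, assumed to contain the unit ball B_0 and to
    be contained in the ball B_r of radius R.  Each ratio
    Vol(K ∩ B_{i-1}) / Vol(K ∩ B_i) is estimated from points drawn uniformly
    from K ∩ B_i; here this is done by naive rejection from B_i, which only
    works in low dimension (the real algorithm uses an MCMC sampler)."""
    radii = [1.0 + i * (R - 1.0) / levels for i in range(levels + 1)]  # B_0 .. B_r
    vol = math.pi ** (n / 2) / math.gamma(n / 2 + 1)   # Vol(K ∩ B_0) = Vol(B_0)
    for i in range(1, levels + 1):
        inside_prev = 0
        accepted = 0
        while accepted < samples:
            x = random_point_in_ball(n, radii[i])
            if not in_K(x):
                continue                                 # rejection step
            accepted += 1
            if math.sqrt(sum(c * c for c in x)) <= radii[i - 1]:
                inside_prev += 1
        ratio = max(inside_prev / samples, 1.0 / samples)  # est. of Vol(K∩B_{i-1})/Vol(K∩B_i)
        vol /= ratio                                       # multiply by its reciprocal
    return vol

# Toy check in dimension 2: K is the square [-2, 2]^2, whose true area is 16.
if __name__ == "__main__":
    in_square = lambda x: all(abs(c) <= 2 for c in x)
    print(estimate_volume(in_square, n=2, R=2 * math.sqrt(2) + 0.01))
```

With these toy parameters the output should typically land close to the true area 16; the interesting questions, taken up later in the course, are how to sample from K ∩ B_i in high dimension and how the per-level errors accumulate across the product.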

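Finally, returning to the approximate-counting technique of Section 1.2.1, here is the code sketch referred to there: a minimal illustration of the counting-to-sampling reduction for cliques, not the algorithm from the notes. The uniform clique sampler is a simple lazy insert/delete chain run for a fixed number of steps T (a stand-in for the unknown mixing time), the sample sizes are arbitrary, and the function names are illustrative assumptions.

```python
import random

def uniform_clique_sample(adj, T):
    """Lazy insert/delete chain on the cliques of a graph (including the
    empty clique); its stationary distribution is uniform over all cliques.
    adj maps each vertex to the set of its neighbours; T is the number of
    steps, a stand-in for the (unknown) mixing time."""
    vertices = list(adj)
    clique = set()
    for _ in range(T):
        if random.random() < 0.5:              # lazy step, for aperiodicity
            continue
        u = random.choice(vertices)
        if u in clique:
            clique.remove(u)                   # delete u from the clique
        elif all(u in adj[w] for w in clique):
            clique.add(u)                      # u is adjacent to every member
    return clique

def estimate_num_cliques(adj, samples=200, T=2000):
    """Counting-to-sampling reduction of Section 1.2.1: pick a vertex v,
    estimate the fraction p of cliques that avoid v, recurse on G - v,
    and return (recursive estimate) / p."""
    if not adj:
        return 1.0                             # only the empty clique remains
    v = next(iter(adj))
    hits = sum(v not in uniform_clique_sample(adj, T) for _ in range(samples))
    p = max(hits / samples, 1.0 / samples)     # fraction of cliques avoiding v
    rest = {u: nbrs - {v} for u, nbrs in adj.items() if u != v}
    return estimate_num_cliques(rest, samples, T) / p

# Toy example: a triangle has 8 cliques, counting the empty clique.
if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    print(estimate_num_cliques(triangle, samples=100, T=1000))
```

The recursion always removes the chosen vertex and rescales by the estimated fraction of cliques avoiding it, bottoming out at the empty graph, whose only clique is the empty one; since at least half of all cliques avoid any fixed vertex, that fraction is bounded away from 0, as the crowd-counting recipe requires.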
