Duke STAT 376 - On the Swapping Algorithm - D2996591

Home> Schools> Duke University> (STAT) > STAT 376> On the Swapping Algorithm

Duke STAT 376 - On the Swapping Algorithm

Course Stat 376- Advanced Modeling and Scientific Computing

Pages 32

Download Save

Unformatted text preview:

On the Swapping AlgorithmNeal Madras, Zhongrong ZhengDepartment of Mathematics and Statistics, York University, 4700 Keele Street,Toronto, Ontario M3J 1P3, Canada; e-mail: [email protected] 13 July 2001; accepted 18 May 2002DOI 10.1002/rsa.10066ABSTRACT: The Metropolis-coupled Markov chain method (or “Swapping Algorithm”) is anempirically successful hybrid Monte Carlo algorithm. It alternates between standard transitions onparallel versions of the system at different parameter values, and swapping two versions. We proverapid mixing for two bimodal examples, including the mean-field Ising model.© 2002 WileyPeriodicals, Inc. Random Struct. Alg., 22: 66 –97, 2003Keywords: Markov chain Monte Carlo; spectral gap; Metropolis-coupled Markov chains; mean-field Ising model; decomposition1. INTRODUCTIONMarkov chain Monte Carlo (MCMC) methods have become a standard tool for simulatingobservations from complicated distributions in a variety of fields of application, particu-larly physics (Binder and Heermann [1], Frenkel and Smit [12]) and statistics (Gilks,Richardson, and Spiegelhalter [15]). However, standard MCMC methods encounterserious difficulties if there are isolated modes in the target distribution from which wewish to sample. This is because the Markov chains get stuck in some local modes andrarely move between those isolated modes, and hence the rate at which the Markov chainsconverge to the target distribution is too slow to be practical. Isolated modes arise fairlyoften in statistical problems, and they are especially common in physics models thatCorrespondence to: N. Madras© 2002 Wiley Periodicals, Inc.66exhibit phase transitions. One of the classical examples of this is the Ising model at lowtemperature, which we shall describe below.To speed up the convergence rates of MCMC methods when sampling from multimo-dal distributions, several new approaches have been proposed. Among these, the followingthree algorithms are based on similar intuition: the swapping algorithm of Geyer [13], alsocalled Metropolis-coupled Markov chains or parallel tempering (Orlandini [24]); thesimulated tempering algorithm (Marinari and Parisi [21], Geyer and Thompson [14],Madras [18]); and the tempered transition method (Neal [23]).In practical use, these new methods seem to converge much faster than do standardmethods, but there is little rigorous theory to back these empirical observations. In thispaper, we shall focus on the swapping algorithm and rigorously show that it can be usedto sample from two examples of bimodal distributions. The examples are the following:Example I (Distribution with exponential valley on an interval). Fix a real constant C !1, and let J be a (large) positive integer. Consider the state space of all integers in theinterval ["J, #J], with the bimodal distribution!$x% "C!x!Z$x " "J, "J # 1, . . . , J%,where Z is the normalizing constant. Observe that as J gets large (with C fixed),!(0) isexponentially smaller than!(J). The analysis turns out to be simpler if we can divide thestate space into two exact halves. For this reason, we shall consider the more unusualstate space ! & !Int, which is defined to be the set of all odd integers in the interval["2M " 1, 2M # 1], with the distribution!$x% "!Int$x!C% "C!x!Zfor x ! !.(See Remark 3 at the end of Section 4 for a brief discussion of how one can handle thestate space {"J, "J # 1, . . . , J}.)Example II (Mean field Ising model). Fix a real constant$! 0 and let M be a (large)positive integer. The state space is defined to be ! & !Ising& {"1, #1}M. The mean fieldIsing model (in zero field) is the probability distribution!$x% "!Ising$x!$% "e$$¥j&1Mx'j(%2/2MZ$$%for x&$x'1(, . . . , x'M(%!!,where Z($) is the normalizing constant. It can be shown that if$! 1, then the distributionof the “total spin” ¥j&1Mx[j] is bimodal for large M, and the probability at the modes isexponentially larger than it is at zero (e.g., Madras and Piccioni [19]). For simplicity, asin Example I, we assume throughout this paper that M is an odd integer. (See Remark 3at the end of Section 4 for comments on the case of M even.)ON THE SWAPPING ALGORITHM 67Both examples can be expressed in the form of a Gibbs distribution in statisticalmechanics, that is,!$x% "e"$H$x%Z, (1)where H is an explicit function (corresponding to the energy of configuration x) and$isa nonnegative parameter (inversely proportional to the temperature). In Example I,H( x) & "!x! and$& log C.One of the most basic general-purpose Markov chain Monte Carlo methods forsampling from a distribution!is the Metropolis algorithm (for a recent theoretical review,see Diaconis and Saloff-Coste [8]). The method is the following. Suppose!is aprobability distribution supported on the finite set ". Let K be an irreducible, symmetricMarkov chain on " (called the proposal chain or base chain). In many applications K isa random walk on a regular graph. The Metropolis chain is the new Markov chain T on" defined as follows:T$x, y% "#K$x, y% if!$y% %!$x% and y & x,K$x, y%!$y%!$x%if!$y% '!$x%,1 ($z)xT$x, z%if y " x.(2)That is, if the Metropolis chain is currently at state x, then it chooses a proposed state yfrom the distribution K( x, y), but it only accepts y as the next state with probabilitymin{1,!( y)/!( x)} (if y is not accepted, then the next state is x again). The chain T isirreducible and reversible with respect to!; in particular,!is the equilibrium distributionof T (see Diaconis and Saloff-Coste [8, p. 20] for a proof of this). Therefore, we cansample from!by simulating the chain T for a sufficiently long time.In our examples, we take two natural Metropolis chains. For Example I, the base chainis simple symmetric random walk on !Int:K$i, j% "%12if !i ( j! " 2 or i " j " *$2M # 1%,0 otherwise.Thus the Metropolis chain for Example I looks like random walk on the interval !Intwithdrift away from the center. For large M, this chain requires an exponentially long time tohave much chance of reaching the negative half from initial state 2M # 1. For ExampleII, the base chain chooses a site i ! {1, . . . , M} uniformly at random and then reversesthe sign of x[i]. That is,K$x, y% "%1Mif x, y ! !Isingand &x ( y&1" 2,0 otherwise(where we write & z&1& !z[1]! #. . .# !z[M]! for z ! RM). The Metropolis chain ofExample II reaches equilibrium exponentially slowly in M for any fixed$! 1 (themeaning of this assertion will be made precise in Section 2).68 MADRAS AND ZHENGTo discuss the intuition

View Full Document


School:
Email:
New Password:
Confirm Password:

Duke STAT 376 - On the Swapping Algorithm

Sign up for free to view:

Please select your school