##
This **preview** shows page *1-2-3-19-20-38-39-40*
out of 40 **pages**.

*View Full Document*

End of preview. Want to read all 40 pages?

Upload your study docs or become a GradeBuddy member to access this document.

View Full Document**Unformatted text preview:**

Chapter 3Markov Chain Monte CarloIn this chapter, we focus on a particular family of discrete time stochastic processesthat provide us with a tool to sample from probability distributions. Let X be acontinuous random variable with a probability density function g(x), our goal is togenerate samples from g. In case g is too complicated to use one of the direct meth-ods described in Chapter 2, one has to resort to approximate methods for samplingfrom g. One such method is the use of Markov chains to sample from complicatedprobability density functions, when the direct methods do not apply. The basic ideais to construct a Markov chain Xtn, starting from almost arbitrary initial conditions,but satisfying a set of conditions relating to g, so that as time n gets longer the valuesof Xtnstart resembling the samples of f .We start this chapter with a brief introduction to Markov chains, and then developa framework for their use in sampling from given probability density functions. Thematerial presented in this chapter is taken from [?, ?].3.1 Introduction to Markov ChainsDenote a discrete time stochastic process as: {Xt, t = t1,t2,. ..}. Such a process canbe characterized by nth-order joint probability density function, for any n ≥ 1,fXt1,Xt2,...,Xtn(x1,x2,. . .,xn) .Using the rule for total probability, we can factor the joint dendity function as:fXt1,Xt2,...,Xtn(x1,x2,. . .,xn) = fXt1(x1) fXt2|Xt1(x2|x1). . . fXtn|Xtn−1,...,Xt1(xn|xn−1,. . .,x2,x1) ,where the first term on the right side is the marginal density of the process at timet1, and the remaining terms are the conditional densities. The joint density functionis called translation invariant if for any c ≥0, we havefXt1,Xt2,...,Xtn(x1,x2,. . .,xn) = fXt1+c,Xt2+c,...,Xtn+c(x1,x2,. . .,xn) . (3.1)9192 3 Markov Chain Monte CarloDefinition 3.1. A stochastic process is called a Markov process iffXtn|Xtn−1,...,Xt1(xn|xn−1,. . .,x2,x1) = fXtn|Xtn−1(xn|xn−1) .A Markov process is a stochastic process whose past (Xtn−2,. . .Xt1) has no influenceon the future (Xtn) if its present (Xtn−1) is specified. This definition implies that jointdensity function can be written as a product of one-step conditional densities asfollows:fXt1,Xt2,...,Xtn(x1,x2,. . .,xn) = fXt1(x1) fXt2|Xt1(x2|x1). . . fXtn|Xtn−1(xn|xn−1) . (3.2)Note that to specify a Markov process one needs only a marginal density fXt1and asequence of one-step transition densities fXtn|Xtn−1. Therefore, the specification of aMarkov process avoids the needs for specifying all possible joint density functions.For the definition, one can establish the following properties associated withMarkov processes:1. The conditional expectation of the future, given the past and the present is equalto the conditional expectation of the future given only the present.E[Xtn|Xtn−1,. . .,Xt2,Xt1] = E[Xtn|Xtn−1] .Proof: By definition,E[Xtn|Xtn−1,Xtn−2,. . .,Xt1] =ZxnfXtn|Xtn−1,...,Xt1(xn|xn−1,xn−1,. . .,x1)dxn=ZxnfXtn|Xtn−1(xn|xn−1)dxn= E[Xtn|Xtn−1]2. A Markov process remains Markov if the time index is reversed. That is,fXt1|Xt2,...,Xtn(x1|x2,. . .,xn) = fXt1|Xt2(x1|x2) .Proof: Left side is:fXt1,Xt2,...,Xtn(x1,x2,. . .,xn)fXt2,...,Xtn(x2,. . .,xn)=fXt1(x1) fXt2|Xt1(x2|x1). . . fXtn|Xtn−1(xn|xn−1)fXt2(x2) fXt3|Xt2(x3|x2). . . fXtn|Xtn−1(xn|xn−1)=fXt1(x1) fXt2|Xt1(x2|x1)fXt2(x2)= fXt1|Xt2(x1|x2) .3. Conditioned on a given value of the present, the past and the future are statisti-cally independent of each other. That is, for t1< t2< t3we havefXt1,Xt3|Xt2(x1,x3|x3) = fXt1|Xt2(x1|x2) fXt3|Xt2(x3|x2) .Proof:3.1 Introduction to Markov Chains 93fXt1,Xt3|Xt2(x1,x3|x2) =fXt1,Xt3,Xt2(x1,x3,x2)fXt2(x2)= fXt3|Xt2(x3|x2) fXt1|Xt2(x1|x2)4. A Markov process satisfies a condition, called Chapman-Kolmogoroff condition,that will be useful later. For a Markov process,fXt3|Xt1(x3|x1) =ZfXt2|Xt1(x3|x2) fXt2|Xt1(x2|x1)dx2(3.3)Here are some example of the Markov processes. All stochastic processes withindependent increments are Markov processes. For t1< t2, the increment Xt2−Xt1isindependent of Xt1implies that the variable Xt2does not dependent on the past (t <t1) given Xt1. As another example, a random walk, as described next, is a Markovprocesses. For a fixed ∆ > 0 and s > 0, define the process at X(i∆) = X((i−1)∆)±swith probabilities 0.5 each. Since this defines a process with independent incre-ments, it is a Markov process. Among continuous time processes, a Wiener processis a Markov process.The concept of stationary stochastic processes is very important in developmentof MCMC methods.Definition 3.2. A stochastic process is called stationary if its nth-order joint densityfunction is translation invariance, for all n ≥ 1. In other words, for any collection oftimes {t1,t2,. . .,tn}, we havefXt1,Xt2,...,Xtn(x1,x2,. . .,xn) = fXt1+c,Xt2+c,...,Xtn+c(x1,x2,. . .,xn) , (3.4)for all n > 0 and c.In particular, for a stationary process the marginal density associated the processdoes not depend upon the time t,fXt(x) = fXt+c, ∀c ≥ 0 ,and for all t. Similarly, the conditional probability density is also invariant in time:fXtn|Xtn−1(xn|xn−1) = fXt2|Xt1(xn|xn−1) ,for all n. This invariance of probabilities across times makes stationary processesvery important for sampling. For example, in case we are interested in sampling aprobability density function g, we can potentially do so by constructing a stationarystochastic process such that the marginal density function at time time is g, i.e.fXt= g. Then, along a sample path, the value Xt, for any t, is a sample from g.Although this scheme seems simple, the problem comes in simulating a stationarystochastic process such that its marginal density function matches the given g. Thedifficulty of simulating such a stochastic process is same as simulating directly fromthe original density g. Since we do not know how to simulate directly from g, wecan not simulate this stochastic process. However, there is a possibility of simulating94 3 Markov Chain Monte Carloanother type of process, called a homogeneous Markov process, that is not stationaryto start with but becomes stationary as t → ∞. Furthermore, it is relatively easy tosimulate this process. This idea is the underlying principle behind MCMC methods.Next, we study the construction and properties of homogeneous Markov processes.Definition 3.3. A Markov process is called homogeneous if the conditional densityis invariant to a time shift. That

View Full Document