Markov Chain Monte Carlo
22S:138, Bayesian Statistics
Lecture 10
Oct 1, 2008
Kate Cowles
374 SH, 335-0727

The Poisson distribution (one more one-parameter distribution)

• The Poisson distribution may be appropriate when the data are counts of rare events.
• events occurring at random at a constant rate per unit time, distance, volume, or whatever
• assumption that the number of events that occur in any interval is independent of the number of events occurring in a disjoint interval
• examples:
– the number of cases of a rare form of cancer occurring in Johnson County in each calendar year
– the number of flaws occurring in each 100-foot length of yarn produced by a spinning machine
– the number of particles of pollen per cubic foot of air in this room
• Since the values of a random variable following a Poisson distribution are counts, what are the possible values?
• probability mass function for a Poisson random variable:

p(y | λ) = e^{−λ} λ^y / y!,   y = 0, 1, . . .

• the count of the number of events occurring in m time units also follows a Poisson distribution, but with parameter mλ
• The conjugate prior distribution for the Poisson rate parameter is the gamma family. (A numerical sketch of this conjugate update appears at the end of these notes.)

Markov Chain Monte Carlo Methods

• Goals
– to make inference about model parameters
– to make predictions
• Requires
– integration over a possibly high-dimensional integrand
– and we may not know the integrating constant

Monte Carlo integration and MCMC

• Monte Carlo integration
– draw independent samples from the required distribution
– use sample averages to approximate expectations
• Markov chain Monte Carlo (MCMC)
– draws samples by running a Markov chain that is constructed so that its limiting distribution is the joint distribution of interest

Markov chains

• A Markov chain is a sequence of random variables X_0, X_1, X_2, . . .
• At each time t ≥ 0, the next state X_{t+1} is sampled from a distribution P(X_{t+1} | X_t) that depends only on the state at time t
– P(X_{t+1} | X_t) is called the "transition kernel"
• Under certain regularity conditions, the iterates from a Markov chain will gradually converge to draws from a unique stationary or invariant distribution
– i.e., the chain will "forget" its initial state
– as t increases, the sampled points X_t will look increasingly like (correlated) samples from the stationary distribution
• Suppose:
– the chain is run for N (a large number of) iterations
– we throw away the output from the first m iterations
– the regularity conditions are met
• then, by the ergodic theorem, we can use averages of the remaining samples to estimate means (illustrated in the sketch below):

E[f(X)] ≃ (1 / (N − m)) Σ_{t=m+1}^{N} f(X_t)
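To make the ergodic-theorem average concrete, here is a minimal Python sketch (not from the original notes; the AR(1) chain, the choice f(X) = X², and all constants are illustrative assumptions). It runs a simple Markov chain whose stationary distribution is N(0, 1), discards the first m iterates, and averages f over the rest; the estimate should be close to E[X²] = 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# A Markov chain whose stationary distribution is N(0, 1):
# the Gaussian AR(1) process X_{t+1} = rho * X_t + sqrt(1 - rho^2) * eps.
rho = 0.9
N, m = 50_000, 1_000        # total iterations N, discarded iterations m
x = np.empty(N)
x[0] = 10.0                  # a deliberately poor initial state
for t in range(N - 1):
    x[t + 1] = rho * x[t] + np.sqrt(1 - rho**2) * rng.normal()

# Ergodic average of f(X) = X^2 over iterations m+1, ..., N;
# under the stationary N(0, 1), the true value is E[X^2] = 1.
estimate = np.mean(x[m:] ** 2)
print(f"ergodic average of X^2: {estimate:.3f}  (true value: 1.0)")
```

Note how the draws are correlated (rho = 0.9) and the initial state is far from the stationary distribution, yet the average over the retained iterates still converges to the right expectation.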
Gibbs sampling: one way to construct the transition kernel

• seminal references
– Geman and Geman (IEEE Trans. Pattn. Anal. Mach. Intel., 1984)
– Gelfand and Smith (JASA, 1990)
– Hastings (Biometrika, 1970)
– Metropolis, Rosenbluth, et al. (J. Chem. Phys., 1953)
• subject to regularity conditions, the joint distribution is uniquely determined by the "full conditional distributions"
– the full conditional distribution for a model quantity is the distribution of that quantity conditional on assumed known values of all the other quantities in the model
• break a complicated, high-dimensional problem into a large number of simpler, low-dimensional problems

Example: Inference about normal mean and variance, both unknown

• model

y_i | µ, σ² ∼ N(µ, σ²),   i = 1, . . . , N

• priors

µ ∼ N(µ_0, σ_0²)
σ² ∼ IG(a_1, b_1)

• We want posterior means, posterior medians, and posterior credible sets for µ and σ²

Full Conditional Distributions for the Normal Model

• to extract the mathematical form of the full conditional for a parameter:
– write out the expression to which the joint posterior is proportional
– pull out all the terms containing the parameter of interest

Gibbs Sampler algorithm for the Normal model

1. choose initial values µ^(0), σ²^(0)
2. at each iteration t, generate a new value for each parameter, conditional on the most recent values of all the other parameters

(A Python sketch of this sampler appears at the end of these notes.)

What are BUGS and WinBUGS?

• "Bayesian inference Using Gibbs Sampling"
• a general-purpose program for fitting Bayesian models
• developed by David Spiegelhalter, Andrew Thomas, Nicky Best, and Wally Gilks at the Medical Research Council Biostatistics Unit, Institute of Public Health, in Cambridge, UK
• BUGS
– for Unix and DOS platforms
– written in Modula-2; distributed in compiled form only
• WinBUGS
– for Windows
– written in Component Pascal, running in Oberon Microsystems' BlackBox environment
– able to fit a wider variety of models than BUGS can handle
– undergoing continuing development
• excellent documentation, including two volumes of examples
• Web page: http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml
• OpenBUGS
– open-source version of WinBUGS
– interfaces easily with R
– Web page: http://mathstat.helsinki.fi/openbugs/

What do BUGS and WinBUGS do?

• enable the user to specify a model in a simple Splus-like language
• construct the transition kernel for a Markov chain with the joint posterior as its stationary distribution, and simulate a sample path of the resulting chain
– determine whether or not the full conditional for each unknown quantity (parameter or missing data) in the model is a standard density
– generate random variates from standard densities using standard algorithms
– BUGS uses the adaptive rejection algorithm (Gilks and Wild, Applied Statistics, 1992) to generate from nonstandard full conditionals; consequently, it can handle only log-concave or discrete full conditionals
– WinBUGS uses the Metropolis algorithm to generate from nonstandard full conditionals and is not subject to this limitation

The Art and Science of MCMC Use

• Deciding how many chains to run
• Choosing initial values
– Do not confuse initial values with priors!
– Priors are part of the model. Initial values are part of the computing strategy used to fit the model.
– Priors must not be based on the current data.
– The best choices of initial values are values that lie in a high-posterior-density region of the parameter space. If the prior is not very strong, then maximum likelihood estimates (from the current data) are excellent choices of initial values, if they can be calculated.
– In the simple models we have encountered so far, the MCMC sampler will converge quickly even with a poor choice of initial values.
– In more complicated models, choosing initial values in low-posterior-density regions may make the sampler take a huge number of iterations to finally start drawing from a good approximation to the true posterior.
• Assessing whether the sampler has "converged"
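Here is the minimal Python sketch of the Gibbs sampler for the normal example, as promised above (this code is not part of the original notes; the simulated data, hyperparameter values, and the shape/rate parametrization of the IG prior are illustrative assumptions). With µ ∼ N(µ_0, σ_0²) and σ² ∼ IG(a_1, b_1), both full conditionals are standard densities: µ | σ², y is normal and σ² | µ, y is inverse gamma, so each Gibbs update is a direct draw.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data and hyperparameters (assumptions, not from the notes)
y = rng.normal(loc=5.0, scale=2.0, size=50)   # simulated observations
n, ybar = y.size, y.mean()
mu0, sigma0_sq = 0.0, 100.0                    # N(mu0, sigma0^2) prior on mu
a1, b1 = 2.0, 2.0                              # IG(a1, b1) prior on sigma^2 (shape/rate)

T, m = 10_000, 1_000                           # iterations, discarded burn-in
mu_draws = np.empty(T)
sig2_draws = np.empty(T)

# Step 1: initial values (part of the computing strategy, not the model)
mu, sig2 = ybar, y.var()

# Step 2: at each iteration, draw each parameter from its full conditional
for t in range(T):
    # mu | sigma^2, y is normal (conjugate update)
    prec = 1.0 / sigma0_sq + n / sig2
    mean = (mu0 / sigma0_sq + n * ybar / sig2) / prec
    mu = rng.normal(mean, np.sqrt(1.0 / prec))

    # sigma^2 | mu, y is IG(a1 + n/2, b1 + 0.5 * sum((y - mu)^2));
    # sample by drawing a Gamma(shape, rate) variate and inverting it
    shape = a1 + n / 2.0
    rate = b1 + 0.5 * np.sum((y - mu) ** 2)
    sig2 = 1.0 / rng.gamma(shape, 1.0 / rate)

    mu_draws[t], sig2_draws[t] = mu, sig2

# Posterior summaries from the retained draws
print("posterior mean of mu:       ", mu_draws[m:].mean())
print("posterior median of sigma^2:", np.median(sig2_draws[m:]))
print("95% credible set for mu:    ", np.percentile(mu_draws[m:], [2.5, 97.5]))
```

Because both full conditionals here are standard densities, this is exactly the case BUGS and WinBUGS detect and handle with standard sampling algorithms; the adaptive rejection and Metropolis machinery described above is needed only for nonstandard full conditionals.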

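Finally, the gamma conjugate update promised in the Poisson section. This is a minimal sketch with made-up numbers (none of it is from the original notes): with prior λ ∼ Gamma(a, b) in the shape/rate parametrization and observed counts y_1, . . ., y_n, the posterior is Gamma(a + Σ y_i, b + n).

```python
import numpy as np
from scipy import stats

# Illustrative prior and data (assumptions, not from the notes)
a, b = 2.0, 1.0                      # Gamma(shape=a, rate=b) prior on lambda
y = np.array([3, 1, 4, 2, 0, 3])     # observed Poisson counts
n = y.size

# Conjugate update: the posterior is Gamma(a + sum(y), b + n)
a_post, b_post = a + y.sum(), b + n

posterior = stats.gamma(a=a_post, scale=1.0 / b_post)  # scipy uses scale = 1/rate
print("posterior mean of lambda:", posterior.mean())
print("95% credible interval:   ", posterior.interval(0.95))
```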