Markov Chain Monte Carlo
22S:138, Bayesian Statistics
Lecture 10
Sep. 23, 2009
Kate Cowles, 374 SH, 335-0727

The Poisson distribution (one more one-parameter distribution)

• The Poisson distribution may be appropriate when the data are counts of rare events: events occurring at random at a constant rate per unit time, distance, volume, or whatever.
• assumption: the number of events that occur in any interval is independent of the number of events occurring in any disjoint interval
• examples:
– the number of cases of a rare form of cancer occurring in Johnson County in each calendar year
– the number of flaws occurring in each 100-foot length of yarn produced by a spinning machine
– the number of particles of pollen per cubic foot of air in this room
• Since the values of a random variable following a Poisson distribution are counts, what are the possible values? (The nonnegative integers 0, 1, 2, . . .)
• probability mass function for a Poisson random variable:

p(y | λ) = e^(−λ) λ^y / y!,   y = 0, 1, . . .

• the count of the number of events occurring in m time units also follows a Poisson distribution, but with parameter mλ
• The conjugate prior distribution for the Poisson rate parameter is the gamma family (a sketch of the conjugate update follows below).
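As a concrete illustration of Poisson–gamma conjugacy (this sketch is not part of the original notes): with a Gamma(a, b) prior on λ in the shape–rate parameterization, and counts y_1, . . . , y_n observed in n unit intervals, the posterior is Gamma(a + Σ y_i, b + n). The data and the hyperparameter values a and b below are made-up choices for illustration.

```python
import numpy as np

# Made-up counts of a rare event in n = 6 unit intervals.
y = np.array([2, 0, 1, 3, 0, 1])

# Gamma(a, b) prior on the Poisson rate lambda (shape-rate parameterization;
# the values are illustrative, not from the lecture).
a, b = 2.0, 1.0

# Conjugate update: the posterior is Gamma(a + sum(y), b + n).
a_post = a + y.sum()
b_post = b + len(y)

post_mean = a_post / b_post   # exact posterior mean of lambda
draws = np.random.default_rng(1).gamma(shape=a_post, scale=1.0 / b_post, size=10_000)
print(post_mean, draws.mean())  # the Monte Carlo average should be close to the exact mean
```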
Markov Chain Monte Carlo Methods

• Goals
– to make inference about model parameters
– to make predictions
• Requires
– integration over a possibly high-dimensional integrand
– and we may not know the integrating constant

Monte Carlo integration and MCMC

• Monte Carlo integration
– draw independent samples from the required distribution
– use sample averages to approximate expectations
• Markov chain Monte Carlo (MCMC)
– draws samples by running a Markov chain that is constructed so that its limiting distribution is the joint distribution of interest

Markov chains

• A Markov chain is a sequence of random variables X_0, X_1, X_2, . . .
• At each time t ≥ 0, the next state X_{t+1} is sampled from a distribution P(X_{t+1} | X_t) that depends only on the state at time t
– P(X_{t+1} | X_t) is called the "transition kernel"
• Under certain regularity conditions, the iterates from a Markov chain will gradually converge to draws from a unique stationary or invariant distribution
– i.e. the chain will "forget" its initial state
– as t increases, sampled points X_t will look increasingly like (correlated) samples from the stationary distribution
• Suppose:
– the chain is run for N (a large number of) iterations
– we throw away the output from the first m iterations
– the regularity conditions are met
• Then by the ergodic theorem we can use averages of the remaining samples to estimate means:

E[f(X)] ≃ (1 / (N − m)) Σ_{t=m+1}^{N} f(X_t)

(A small numerical sketch of this estimator follows below.)
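A minimal numerical sketch of the ergodic-average estimator above (not from the notes). The chain is a stationary AR(1) process, chosen only because its limiting distribution, N(0, 1), is known, so the average of f(X_t) = X_t^2 after burn-in should be close to E[X^2] = 1. The values of rho, N, and m are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy Markov chain with known stationary distribution N(0, 1):
# X_{t+1} = rho * X_t + sqrt(1 - rho^2) * eps,  eps ~ N(0, 1).
rho, N, m = 0.9, 50_000, 1_000   # correlation, total iterations, burn-in
x = np.empty(N)
x[0] = 10.0                      # deliberately poor initial value
for t in range(N - 1):
    x[t + 1] = rho * x[t] + np.sqrt(1 - rho**2) * rng.normal()

# Ergodic average of f(X) = X^2 over the post-burn-in draws;
# under the N(0, 1) stationary distribution, E[X^2] = 1.
f = x[m:] ** 2
print(f.mean())   # close to 1
```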


Gibbs sampling: one way to construct the transition kernel

• seminal references
– Geman and Geman (IEEE Trans. Pattn. Anal. Mach. Intel., 1984)
– Gelfand and Smith (JASA, 1990)
– Hastings (Biometrika, 1970)
– Metropolis, Rosenbluth, et al. (J. Chem. Phys., 1953)
• subject to regularity conditions, the joint distribution is uniquely determined by the "full conditional distributions"
– the full conditional distribution for a model quantity is the distribution of that quantity conditional on assumed known values of all the other quantities in the model
• break a complicated, high-dimensional problem into a large number of simpler, low-dimensional problems

Example: Inference about a normal mean and variance, both unknown

• model
y_i | µ, σ² ∼ N(µ, σ²),   i = 1, . . . , N
• priors
µ ∼ N(µ0, σ0²)
σ² ∼ IG(a1, b1)
• We want posterior means, posterior medians, and posterior credible sets for µ and σ².

Full Conditional Distributions for the Normal Model

• to extract the mathematical form of the full conditional for a parameter:
– write out the expression to which the joint posterior is proportional
– pull out all terms containing the parameter of interest

Gibbs Sampler algorithm for the Normal Model

1. Choose initial values µ^(0), σ²^(0).
2. At each iteration t, generate a new value for each parameter, conditional on the most recent values of all other parameters (a worked sketch for this model follows below).
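For this model both full conditionals are standard: µ given σ² and the data is normal, and σ² given µ and the data is inverse gamma, so step 2 can be carried out by direct sampling. The Python sketch below is not from the notes; the simulated data, hyperparameter values, and iteration counts are made-up choices, and IG(a, b) is taken to have density proportional to x^(−a−1) e^(−b/x).

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data for illustration (true mu = 5, sigma = 2; made-up choice).
y = rng.normal(5.0, 2.0, size=50)
n, ybar = len(y), y.mean()

# Hyperparameters (illustrative values, not from the lecture):
# mu ~ N(mu0, sigma0sq), sigma2 ~ IG(a1, b1).
mu0, sigma0sq = 0.0, 100.0
a1, b1 = 2.0, 1.0

T, burn = 5_000, 500
mu, sigma2 = 0.0, 1.0   # initial values: part of the computing strategy, not the model
draws = np.empty((T, 2))

for t in range(T):
    # Full conditional for mu: normal, a precision-weighted combination of prior and data.
    v = 1.0 / (1.0 / sigma0sq + n / sigma2)
    m = v * (mu0 / sigma0sq + n * ybar / sigma2)
    mu = rng.normal(m, np.sqrt(v))

    # Full conditional for sigma2: IG(a1 + n/2, b1 + 0.5 * sum((y - mu)^2)),
    # sampled by inverting a gamma draw.
    a_post = a1 + n / 2.0
    b_post = b1 + 0.5 * np.sum((y - mu) ** 2)
    sigma2 = 1.0 / rng.gamma(shape=a_post, scale=1.0 / b_post)

    draws[t] = mu, sigma2

mu_s, s2_s = draws[burn:, 0], draws[burn:, 1]
print("posterior mean/median of mu:", mu_s.mean(), np.median(mu_s))
print("95% credible interval for sigma2:", np.quantile(s2_s, [0.025, 0.975]))
```

The post-burn-in draws supply the posterior means, medians, and credible sets that the lecture lists as the goals of inference.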

What are BUGS and WinBUGS?

• "Bayesian inference Using Gibbs Sampling"
• general-purpose programs for fitting Bayesian models
• developed by David Spiegelhalter, Andrew Thomas, Nicky Best, and Wally Gilks at the Medical Research Council Biostatistics Unit, Institute of Public Health, in Cambridge, UK
• BUGS
– for Unix and DOS platforms
– written in Modula-2; distributed in compiled form only
• WinBUGS
– for Windows
– written in Component Pascal, running in Oberon Microsystems' BlackBox environment
– able to fit a wider variety of models than BUGS can handle
– undergoing continuing development
• excellent documentation, including two volumes of examples
• Web page: http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml
• OpenBUGS
– open-source version of WinBUGS
– interfaces easily with R
– Web page: http://mathstat.helsinki.fi/openbugs/

What do BUGS and WinBUGS do?

• enable the user to specify a model in a simple Splus-like language
• construct the transition kernel for a Markov chain with the joint posterior as its stationary distribution, and simulate a sample path of the resulting chain:
– determine whether or not the full conditional for each unknown quantity (parameter or missing data) in the model is a standard density
– generate random variates from standard densities using standard algorithms
– BUGS uses the adaptive rejection algorithm (Gilks and Wild, Applied Statistics, 1992) to generate from nonstandard full conditionals; consequently it can handle only log-concave or discrete full conditionals
– WinBUGS uses the Metropolis algorithm to generate from nonstandard full conditionals and is not subject to this limitation (a sketch of such a Metropolis update follows below)
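To make that last contrast concrete, here is a minimal random-walk Metropolis update for a single parameter whose full conditional is known only up to its integrating constant. This is an illustration, not WinBUGS's actual implementation; the target density, step size, and iteration count are made-up choices. The example target is deliberately non-log-concave (a two-component normal mixture), the kind of full conditional that adaptive rejection sampling cannot handle.

```python
import numpy as np

rng = np.random.default_rng(7)

def metropolis_step(theta, log_fc, step=1.0):
    """One random-walk Metropolis update for a single parameter.

    log_fc: log of the unnormalized full conditional of theta, with all
    other model quantities held at their current values.
    """
    proposal = theta + step * rng.normal()
    # Accept with probability min(1, fc(proposal) / fc(theta)).
    if np.log(rng.uniform()) < log_fc(proposal) - log_fc(theta):
        return proposal
    return theta

def log_fc(x):
    # Equal mixture of N(-2, 1) and N(2, 1), up to a constant: bimodal,
    # hence not log-concave.
    return np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)

theta, draws = 0.0, []
for _ in range(20_000):
    theta = metropolis_step(theta, log_fc)
    draws.append(theta)
print(np.mean(draws))   # near 0, by the symmetry of the mixture
```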

The Art and Science of MCMC Use

• Deciding how many chains to run
• Choosing initial values
– Do not confuse initial values with priors!
– Priors are part of the model. Initial values are part of the computing strategy used to fit the model.
– Priors must not be based on the current data.
– The best choices of initial values are values that are in a high-posterior-density region of the parameter space. If the prior is not very strong, then maximum likelihood estimates (from the current data) are excellent choices of initial values, if they can be calculated (see the short sketch after this list).
– In the simple models we have encountered so far, the MCMC sampler will converge quickly even with a poor choice of initial values.
– In more complicated models, choosing initial values in low-posterior-density regions may make the sampler take a huge number of iterations to finally start drawing from a good approximation to the true posterior.
• Assessing whether the sampler has converged
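A tiny sketch of the maximum-likelihood initial values suggested above (not from the notes; y is the same made-up data vector as in the Gibbs sketch). These values would replace the arbitrary mu, sigma2 = 0.0, 1.0 initialization used there.

```python
import numpy as np

# Same made-up data-generating choice as in the Gibbs sketch above.
y = np.random.default_rng(42).normal(5.0, 2.0, size=50)

# Maximum likelihood estimates: good initial values when the prior is weak.
mu_init = y.mean()        # MLE of mu
sigma2_init = y.var()     # MLE of sigma^2 (the version that divides by n)
print(mu_init, sigma2_init)
```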