22S:138
Posterior Predictive Checking
Lecture 22, Nov. 26, 2007
Kate Cowles
374 SH, [email protected]

Model checking and sensitivity analysis

• goal: assess fit of model to
  – data
  – our substantive knowledge
• must check effects of
  – prior
  – likelihood specification
  – hierarchical structure
  – any other application-specific issues
    ∗ e.g. which predictor variables

• theoretically possible to set up and fit a "super model" that includes all possibly true models
  – but computationally infeasible
  – and really conceptually impossible
• instead we fit a feasible number of models and examine the posterior distributions that result
  – cast models as broadly as possible
  – do they fail to fit reality?
  – are they sensitive to arbitrary specifications?

Principles and methods of model checking

• "do the model's deficiencies have a noticeable effect on substantive inferences?"
• how to judge when assumptions of convenience can be made safely

Using the posterior distribution to check a statistical model

• compare the posterior distribution of parameters to
  – substantive knowledge
  – other data
• compare the posterior predictive distribution of future observations to substantive knowledge
  – e.g. compare election predictions from a model to substantive knowledge
• compare the posterior predictive distribution of future observations to the data that have actually occurred

Using the posterior predictive distribution to check a statistical model

• recall:
  – posterior: conditional on observed data y
  – predictive: prediction of an observable but as yet unobserved ỹ
  – p(ỹ | y) = ∫ p(ỹ, θ | y) dθ
             = ∫ p(ỹ | θ, y) p(θ | y) dθ
             = ∫ p(ỹ | θ) p(θ | y) dθ
  – the last line holds if the new data are conditionally independent of the old data given the model parameters

Checking a model by comparing the data that we have to the posterior predictive distribution

• enables checking the fit of the model without any more substantive knowledge than is in the existing data and model
• do datasets simulated from the model we fit "look like" the real data in ways relevant to our inference?
• requires drawing "replicated data"

Procedure to draw a "replicated dataset" from the posterior predictive distribution

• notation
  – y: observed data
  – y^rep: a complete simulated dataset
    ∗ same number of observations as in y
    ∗ same values of the explanatory variables (if any)
    ∗ response variables simulated from the posterior predictive distribution
  – θ: vector of all unknown model parameters, including parameters of upper-stage priors if the model is hierarchical

• Step 1: draw θ* from p(θ | y), i.e. from the posterior distribution of θ
• Step 2: draw y^rep from p(y^rep | θ*)
• repeat steps 1 and 2 a large number of times (a small code sketch of this procedure follows)
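The two-step procedure above can be carried out with ordinary Monte Carlo draws whenever the posterior is easy to sample. Below is a minimal sketch in Python/NumPy, assuming the normal model with prior p(μ, σ²) ∝ 1/σ² that appears in the Newcomb example later in these notes; the function name draw_replicates and the choice of 1000 replicates are our own illustration, not part of the original notes.

import numpy as np

rng = np.random.default_rng(2007)

def draw_replicates(y, n_rep=1000):
    """Draw n_rep replicated datasets y^rep from the posterior predictive distribution
    under the model y_i ~ N(mu, sigma^2) with prior p(mu, sigma^2) proportional to 1/sigma^2,
    for which the posterior can be sampled directly (no MCMC needed)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar = y.mean()
    s2 = y.var(ddof=1)
    reps = np.empty((n_rep, n))
    for r in range(n_rep):
        # Step 1: draw theta* = (mu*, sigma2*) from the posterior p(theta | y)
        sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)   # sigma^2 | y ~ scaled inverse-chi^2(n-1, s^2)
        mu = rng.normal(ybar, np.sqrt(sigma2 / n))     # mu | sigma^2, y ~ N(ybar, sigma^2 / n)
        # Step 2: draw a complete replicated dataset from p(y^rep | theta*),
        # with the same number of observations as y
        reps[r] = rng.normal(mu, np.sqrt(sigma2), size=n)
    return reps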
Discrepancy measures or test quantities for posterior predictive checks

• intended to measure discrepancy between the model and the real data
• T(y, θ): a scalar summary of the data (and possibly the parameters) used as a standard when comparing the real data to data simulated from the posterior predictive distribution
• choose one or more test quantities that are meaningful with respect to your research purpose

Using the test quantities: posterior predictive p-values

• compute T(y, θ) for the real data y
• compute T(y^rep, θ) for each simulated replicate dataset
• compute the proportion of the replicated datasets for which T(y^rep, θ) ≥ T(y, θ) (a code sketch of this computation appears at the end of these notes)
• this is an approximation to the Bayes p-value
  ∫∫ I{T(y^rep, θ) ≥ T(y, θ)} p(θ | y) p(y^rep | θ) dθ dy^rep
• that is, the Bayes p-value is Pr( T(y^rep, θ) ≥ T(y, θ) ), with the probability taken over the joint posterior distribution of θ and y^rep

Evaluating outliers in Newcomb's speed of light data

• from the GCSR textbook (Gelman, Carlin, Stern, and Rubin, Bayesian Data Analysis)
• 66 measurements of the speed of light; two low outliers
• what we want to evaluate: is a normal density OK for the likelihood?
• defined T(y, θ) as min(y_i)
  – to check whether data with such extreme outliers could reasonably have come from a normal model

• fit the model to the 66 observations:
  y_i ~ N(μ, σ²), i = 1, ..., 66
  p(μ, σ²) ∝ 1/σ²
• generated 20 replicate datasets
• found that in all the replicate datasets, min(y^rep_i) was much larger than min(y_i) in the real data

Interpreting and using posterior predictive p-values

• not Pr(model is true | data)
• rather, the posterior probability that T(y^rep, θ) ≥ T(y, θ)
• ideal is a posterior predictive p-value somewhere around .5
  – would mean that the real data y are typical of data that come from the model
• the model is suspect if the tail-area probability of a meaningful test quantity is close to either 0 or 1
  – would mean that the aspect of the data being measured by the test quantity is inconsistent with the model
  – an extreme ppp-value indicates that the model needs to be changed or expanded
    ∗ in the Newcomb example, use a t or a contaminated normal likelihood
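For a test quantity that depends only on the data, such as T(y) = min(y_i) in the Newcomb example, the posterior predictive p-value is simply the proportion of replicated datasets whose test quantity is at least as large as the observed one. A hedged sketch, reusing np, rng, and the hypothetical draw_replicates from the earlier block; the names ppp_value and T are ours:

def ppp_value(y, reps, T=np.min):
    """Posterior predictive p-value: Pr(T(y^rep) >= T(y)), estimated by the
    proportion of replicated datasets whose test quantity meets or exceeds
    the value computed from the real data."""
    t_obs = T(np.asarray(y, dtype=float))
    t_rep = np.array([T(rep) for rep in reps])
    return float(np.mean(t_rep >= t_obs))

# Hypothetical usage with a vector `y` holding the 66 Newcomb measurements:
#   reps = draw_replicates(y, n_rep=1000)
#   ppp_value(y, reps)             # T = min; a value near 0 or 1 makes the normal model suspect
#   ppp_value(y, reps, T=np.max)   # any other scalar summary of the data can be plugged in as T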
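If the check leads to replacing the normal likelihood with a t likelihood, as suggested for the Newcomb example, the replication step changes while the overall procedure stays the same. A sketch under the assumption that posterior draws of (μ, σ) from a t_ν measurement model are already available (e.g. from an MCMC fit, not shown); the function name, the default ν = 4, and the interface are our own illustration:

def draw_replicates_t(mu_draws, sigma_draws, n, nu=4):
    """Given posterior draws of (mu, sigma) from a t_nu measurement model,
    draw one replicated dataset of size n per posterior draw."""
    mu_draws = np.asarray(mu_draws, dtype=float)
    sigma_draws = np.asarray(sigma_draws, dtype=float)
    reps = np.empty((mu_draws.size, n))
    for r, (mu, sigma) in enumerate(zip(mu_draws, sigma_draws)):
        # Step 2 under the t model: y^rep_i = mu* + sigma* * t_nu noise
        reps[r] = mu + sigma * rng.standard_t(nu, size=n)
    return reps

# The same ppp_value function above can then be applied to these replicates
# to recheck whether min(y^rep_i) is still inconsistent with min(y_i).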