Duke STA 216 - Lecture 16

STA 216, GLM, Lecture 16
October 29, 2007

Outline
- Efficient Posterior Computation in Factor Models
- Underlying Normal Models
- Generalized Latent Trait Models
  - Formulation
  - Genetic Epidemiology Illustration
- Structural Equation Models

How can we do efficient computation?
- Efficient posterior computation in factor analysis models is very challenging
- The typical Gibbs sampler can be subject to extreme slow mixing
- Centering does not provide a complete solution: one can only center one measurement for each latent variable, and it eliminates conjugacy unless the prior is non-exchangeable
- What to do?

Parameter Expansion (PX)
- Originally proposed as a method for speeding up convergence of the EM algorithm
- Redundant parameters are carefully introduced to allow faster convergence of EM & better-mixing Gibbs samplers
- The idea is to introduce the redundant parameters in such a way as to avoid changing the target distribution of the MCMC algorithm
- Hence, the posterior is not changed, but autocorrelation is reduced.

PX in Hierarchical Models
- It is very difficult to obtain a PX-accelerated Gibbs sampler in general cases
- Hard to avoid changing the target distribution
- Gelman (2005, Bayesian Analysis) proposes to use PX to induce a better prior, while also speeding up mixing, in the setting of variance component models
- Ghosh & Dunson (2007) extend this to factor analysis models

Homework Exercise
- Propose a PX Gibbs sampler for the sperm concentration latent factor regression model from Lecture 15
- Simulate data under the model and compare the PX Gibbs sampler to a typical Gibbs sampler without PX
- Due next Friday

What if our data are categorical?
- The above models assume that the different elements of y_i are continuous and normally distributed
- In most settings in which factor analysis models are used, at least some of the elements of y_i are instead ordered categorical
- It is appealing to have a general framework for modeling correlated measurements having different scales (continuous, binary, ordinal)

Underlying Normal Models
- To solve this problem, we can consider the following modification to the measurement model:

    y_ij = g_j(y*_ij; τ_j),  j = 1, ..., p
    y*_i = μ + Λ η_i + ε_i,  ε_i ~ N_p(0, Σ)

- Here, the y_i are the observed variables & the y*_i are normal latent variables underlying y_i
- g_j(·; τ_j) = link function, possibly involving threshold parameters τ_j

Link functions in underlying normal models
- For continuous items (i.e., y_ij is continuous), g_j is chosen as the identity link
- For binary items (y_ij ∈ {0, 1}), we choose a threshold link: y_ij = 1(y*_ij > 0)
- For ordered categorical items (y_ij ∈ {1, ..., c_j}), we generalize the binary case to let

    y_ij = Σ_{l=1}^{c_j} l · 1(τ_{j,l-1} < y*_ij ≤ τ_{j,l}),

  with τ_{j,0} = -∞, τ_{j,1} = 0, τ_{j,c_j} = ∞.

Some Comments
- Note that we are using an underlying multivariate normal model to characterize dependence in observations having a variety of scales
- In factor analysis, the dependence is induced through shared dependence on the latent factors
- Posterior computation is straightforward using a data augmentation Gibbs sampler, which imputes the y*_ij from their truncated normal full conditional distributions.
- Other sampling steps proceed as if the y*_ij were observed data

Generalized Latent Trait Models
- The underlying normal specification induces normal linear models on the continuous items & probit-type models on the categorical items
- This structure is restrictive: we may prefer to use a different GLM for each item, while allowing dependence
- Replace the underlying normal measurement model with a generalized latent trait model (GLTM):

    η_ij = μ_j + λ_j' ξ_i,

  where η_ij = linear predictor in the GLM for outcome type j, and ξ_i = (ξ_i1, ..., ξ_ir)' = vector of latent traits

Comments on GLTMs
- GLTMs also allow modeling of count outcomes & more flexible models for the individual items (e.g., logistic, complementary log-log, etc., instead of just probit)
- Important: the latent traits impact both the dependence among the different elements of y_i & the lack of fit in the individual item GLMs
- Such models are harder to fit routinely, though adaptive rejection sampling & other tricks are possible

Dangers of GLTMs
- The dual role of the latent variable component, accommodating both dependence & lack of fit in the individual item links, creates problems in interpretation
- Consider the case in which y_ij is a 0/1 indicator of a disease, with i indexing family & j indexing individual within a family
- The following model is commonly used to assess within-family dependence in the probability of disease:

    logit Pr(y_ij = 1 | x_i, β, ξ_i) = x_i' β + ξ_i,  ξ_i ~ N(0, ψ)

Application to Genetic Epidemiology Studies
- ξ_i = difference in the log-odds of disease for family i relative to the population average
- Such differences among families are commonly attributed to genetic effects
- The estimated value of ψ is used to infer the magnitude of the genetic component
- Diseases having small ψ will exhibit limited within-family dependence & hence should have a small genetic component

Genetic Epidemiology Example (continued)
- Anything wrong with this interpretation?
- It has been shown that one can identify the fixed effects, β, & the random effects variance, ψ, even if data are only available for a single individual per family.
- How can this be??
- The random effect was included to allow within-family dependence → with a single individual per family, there should be no need for a random effect?

Punch Line
- We have identifiability even with a single individual per family because the induced link function is no longer logistic
- In particular, we have a logistic-normal link function:

    Pr(y_i = 1 | x_i, β, ψ) = g_ψ(x_i' β)
      = ∫ {1 + exp(-x_i' β - ξ_i)}^{-1} (2πψ)^{-1/2} exp(-ξ_i²/(2ψ)) dξ_i

- The shape of the link function varies as ψ varies, so we can estimate β, ψ even with a single subject per family

Some Further Comments
- Is ψ interpretable as genetic heterogeneity in this case?
- What if we have a few families with multiple individuals, and many with a single individual?
- Answer: ψ measures both lack of fit in the logistic link & heterogeneity among families.
- To obtain reliable inferences on genetic heterogeneity, you should use a flexible link function

Some General Comments about GLTMs
- We used the simple random intercept genetic epi example as an illustration
- One needs to worry about these issues just as much in more complex settings involving multivariate outcomes having different scales
- Normal & underlying normal models are more robust to such issues, since one does not change the link in marginalizing out the latent variables
- To
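The PX idea above — that redundant parameters can be added without changing the target distribution — can be illustrated with a minimal numerical sketch. This is not the Ghosh & Dunson sampler itself, just one simple form of redundancy for a factor model: rescaling each factor by a working scalar while rescaling the loadings inversely, which leaves the likelihood term Λη untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

p, k = 5, 2                      # number of items, number of factors
Lam = rng.normal(size=(p, k))    # loadings matrix Λ
eta = rng.normal(size=(k, 1))    # latent factors η for one subject

# Working-parameter expansion: multiply factor l by a redundant
# positive scalar d_l and divide the corresponding loading column.
d = np.array([2.0, 0.5])
Lam_star = Lam * d               # Λ* = Λ D
eta_star = eta / d[:, None]      # η* = D⁻¹ η

# The product entering the likelihood is unchanged, so expanding the
# parameterization does not alter the target posterior.
print(np.allclose(Lam @ eta, Lam_star @ eta_star))  # True
```

The point of a PX sampler is that moving in the redundant direction (here, d) lets the chain take large joint steps in (Λ, η) that a one-coordinate-at-a-time Gibbs update cannot.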
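The threshold link g_j for ordered categorical items can be sketched directly from the formula above. The thresholds in the example are illustrative; categories are coded 1, ..., c_j as on the slide, so a binary item comes out as {1, 2} under this coding rather than {0, 1}.

```python
import numpy as np

def threshold_link(y_star, tau):
    """Map an underlying normal value y* to the category l satisfying
    tau[l-1] < y* <= tau[l], where tau = (tau_0, ..., tau_cj) with
    tau_0 = -inf, tau_1 = 0, tau_cj = +inf."""
    # searchsorted(side="left") returns the first index i with
    # tau[i] >= y*, which is exactly the category l above.
    return int(np.searchsorted(tau, y_star, side="left"))

# Binary item: tau = (-inf, 0, inf), i.e. y = 1(y* > 0), coded {1, 2}
tau_bin = np.array([-np.inf, 0.0, np.inf])
print(threshold_link(-0.3, tau_bin))  # 1
print(threshold_link(0.7, tau_bin))   # 2

# Ordinal item with c_j = 3 categories (second cutpoint 1.5 is made up)
tau_ord = np.array([-np.inf, 0.0, 1.5, np.inf])
print(threshold_link(0.8, tau_ord))   # 2
```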
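The data augmentation step — imputing y*_ij from its truncated normal full conditional — can be sketched for a binary item. This uses simple rejection sampling for clarity (an inverse-CDF draw is more robust when the conditional mean m_ij sits far from zero), and takes the unit-variance conditional mean m_ij as given rather than deriving it from Λ and η.

```python
import numpy as np

def impute_y_star(y_ij, m_ij, rng):
    """Draw y*_ij from N(m_ij, 1) truncated to (0, inf) when y_ij = 1,
    or to (-inf, 0] when y_ij = 0 (rejection sketch; acceptable when
    m_ij is not deep in the rejected tail)."""
    while True:
        draw = rng.normal(m_ij, 1.0)
        if (draw > 0) == bool(y_ij):
            return draw

rng = np.random.default_rng(1)
# A thousand imputations for an observed y_ij = 1 with m_ij = 0.5:
draws = [impute_y_star(1, 0.5, rng) for _ in range(1000)]
print(min(draws) > 0)  # every draw respects the truncation: True
```

With the y*_ij imputed this way, the remaining Gibbs steps for μ, Λ, η_i proceed exactly as in the all-continuous normal factor model, which is the "as if observed" comment on the slide.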