UI STAT 4520 - Gibbs Variable Selection

Gibbs Variable Selection
Xiaomei Pan, Kellie Poulin, Jigang Yang, Jianjun Zhu

Topics
1. Overview of Variable Selection Procedures
2. Gibbs Variable Selection (GVS)
3. How to Implement in WinBUGS
4. Recommendations
5. Appendices

Overview of Variable Selection Procedures
• Selecting the best model involves choosing:
  – The likelihood
  – The link function
  – The priors
  – The variables to include

Overview of Variable Selection Procedures
• The type of response variable typically narrows our choices for the likelihood and the link function.
• We also usually have only a relatively small number of priors to choose from:
  – If no prior information is available, non-informative priors are chosen for the model parameters.
  – If prior information is available, it further narrows the choice of priors.

Overview of Variable Selection Procedures
• However, there may be many choices for which variables to include in the model.
• In many real-world problems the number of candidate variables is in the tens to hundreds.
• For example, 10 candidate variables already give 2^10 = 1,024 possible linear models.
• Even more models are possible once interactions and non-linear terms are considered!

Overview of Variable Selection Procedures
• Things to keep in mind when doing variable selection:
  – P-values for the selected variables are no longer valid: variable selection is a form of "data snooping" (recall multiple comparison procedures).
  – Correlation between the predictors can lead to less-than-optimal results.
  – It is best to use cross-validation techniques as a safeguard.
  – Variable selection is best suited to prediction models rather than effect (explanatory) models.

Overview of Variable Selection Procedures
• Frequentists use methods such as:
  – Stepwise regression
  – Mallows' Cp
  – Maximum R-squared
• What methods do Bayesians use?

Overview of Bayesian Variable Selection Procedures

Method | Used for | Ease of use | References*
Gibbs Variable Selection (GVS) | Variable selection | Moderately easy | 2, 3, 7
SSVS (Stochastic Search Variable Selection) | Variable selection | Somewhat difficult | 2, 5
Kuo and Mallick (unconditional priors for variable selection) | Variable selection | Very easy | 2, 6
Carlin and Chib | General model selection | Moderately easy | 1, 2
Reversible jump | Mostly variable selection | Moderately difficult | 4; see also Matt Bognar's thesis in 241 SH

* Refer to the References slide.

Overview of Bayesian Variable Selection Procedures
• Gibbs Variable Selection (GVS)
  – Advantages: pseudo-priors do not affect the posterior distribution; easy to implement in WinBUGS.
  – Disadvantages: requires pseudo-priors on all model coefficients, whose sole function is to improve the efficiency of the sampler.
• SSVS (Stochastic Search Variable Selection)
  – Advantages: can fit a wide variety of models; allows the user to indicate which models they think are more likely.
  – Disadvantages: requires pseudo-priors on all model coefficients and candidate models.
• Kuo and Mallick (unconditional priors for variable selection)
  – Advantages: extremely straightforward; only the usual prior on the full parameter vector (for the full model) needs to be specified, and the conditional prior distributions replace the pseudo-priors.
  – Disadvantages: no flexibility to alter the method to improve efficiency; if, for any parameter, the prior is diffuse compared with the posterior, the method becomes inefficient.
• Carlin and Chib
  – Advantages: flexible Gibbs sampling strategy for any situation involving model uncertainty.
  – Disadvantages: computationally demanding; specifying efficient pseudo-priors becomes too time consuming when a large number of models is under consideration.
• Reversible jump**
  – Advantages: no need for pseudo-priors; may be faster than GVS.
  – Disadvantages: diffuse priors will often lead to the model with the fewest parameters being chosen; cannot be implemented in WinBUGS.

** Thanks to Dr. Matt Bognar for insights into reversible jump. His thesis in 241 SH contains examples of using this method.

Gibbs Variable Selection
• GVS sampling procedure
  Likelihood:
    Y[i] ~ dnorm(mu[i], tau)
    mu[i] = β0 + β1*X1[i]*γ1 + β2*X2[i]*γ2 + … + βp*Xp[i]*γp
  Priors:
    γj ~ dbern(0.5)                              # γj = 1 means variable j is selected
    βj ~ γj*(real prior) + (1 - γj)*(pseudo-prior)
    tau ~ dgamma(1.0E-3, 1.0E-3)
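To make the sampling scheme above concrete, here is a minimal WinBUGS-style sketch of a GVS model for p candidate predictors. It is only an illustration, not the Appendix 4 code: the node names (b, pseudo.mean, pseudo.prec, pow2, mdl, pmdl), the diffuse N(0, precision 1.0E-4) choice for the "real" prior, the prior on the intercept, and the model-labelling device at the end are all assumptions; the pseudo-prior means and precisions would typically be taken from a preliminary fit of the full model, as described in the next section.

model {
   # Likelihood: a coefficient enters the mean only when its indicator gamma[j] = 1
   for (i in 1:N) {
      Y[i] ~ dnorm(mu[i], tau)
      mu[i] <- beta0 + inprod(b[], X[i, ])
   }
   for (j in 1:p) {
      b[j] <- gamma[j] * beta[j]
      gamma[j] ~ dbern(0.5)                        # prior inclusion probability 1/2
      # Mixture prior: the "real" prior when gamma[j] = 1, the pseudo-prior otherwise.
      # pseudo.mean[j] and pseudo.prec[j] are read in as data (e.g. from a full-model fit);
      # the diffuse real prior N(0, precision 1.0E-4) is an assumption.
      prior.mean[j] <- (1 - gamma[j]) * pseudo.mean[j]
      prior.prec[j] <- gamma[j] * 1.0E-4 + (1 - gamma[j]) * pseudo.prec[j]
      beta[j] ~ dnorm(prior.mean[j], prior.prec[j])
   }
   beta0 ~ dnorm(0, 1.0E-6)
   tau ~ dgamma(1.0E-3, 1.0E-3)

   # One way to track which model the chain is visiting: encode the indicator vector
   # as an integer in 1..2^p and monitor pmdl[]; its posterior means estimate the
   # posterior model probabilities.
   for (j in 1:p) { pow2[j] <- pow(2, j - 1) }
   mdl <- 1 + inprod(gamma[], pow2[])
   for (m in 1:models) { pmdl[m] <- equals(mdl, m) }
}

Whenever gamma[j] = 0, beta[j] is drawn from the pseudo-prior, so the excluded coefficient stays near plausible values and the indicator can switch back to 1 without a large jump; this is why the pseudo-priors affect only the mixing of the sampler and not the posterior itself.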
How to Implement in WinBUGS
• We adapted code from Ntzoufras, I. (2003).
  – The code and the paper are available on the WinBUGS web site.
  – The example showed variable selection for a model with 3 candidate predictor variables.
  – That code required extensive modification before it could be used on a different data set.

How to Implement in WinBUGS
• Our WinBUGS code requires no changes to the model specification.
• The user only has to insert their data and modify the initial values (a sketch of what these lists might look like is given at the end of this section):
  – p = number of x variables
  – N = number of observations
  – Models = number of candidate models (2^p)
  – Initial values for the betas

How to Implement in WinBUGS
• Provided at the end of this presentation:
  – Full WinBUGS code for variable selection.
  – Code for fitting the full model in WinBUGS (used to develop the pseudo-priors); this code also only requires the user to change the data and initial values.
  – R code to assist in interpreting the output from WinBUGS.
  – SAS code used to generate the sample data.

How to Implement in WinBUGS: Example
• Data used for the example:
  – We validated our code against published results for a model with 3 variables.
  – We then created a simulated data set with 500 observations and 10 predictors (9 continuous, 1 binary).
  – We also created a version of the file with correlation between the predictors, to test the robustness of GVS to non-orthogonal data.
  – The code used to create the simulated data is in Appendix 1; the full correlation matrix (for the correlated data) is in Appendix 2.
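As a sketch of the "only the data and initial values change" point above, the WinBUGS data and initial-value lists for the simulated example (N = 500, p = 10, so Models = 2^10 = 1024) might look like the following. The variable names mirror the model sketch given earlier and are assumptions, the ellipses stand for the actual simulated values (not reproduced here), and the pseudo-prior means and precisions would come from the full-model fit (Appendix 3).

# Data (S-Plus/R list format read by WinBUGS; names are hypothetical)
list(N = 500, p = 10, models = 1024,
     Y = c(...),
     X = structure(.Data = c(...), .Dim = c(500, 10)),
     pseudo.mean = c(...), pseudo.prec = c(...))

# Initial values: start with every candidate variable included
list(beta0 = 0,
     beta  = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
     gamma = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
     tau = 1)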

