Berkeley COMPSCI 294 - A Hierarchical Bayes Approach to Variable Selection for Generalized Linear Models

This preview shows page 1-2-3-25-26-27 out of 27 pages.


A Hierarchical Bayes Approach to Variable Selection for Generalized Linear Models

Xinlei Wang and Edward I. George*

February 2004

Abstract

For the problem of variable selection in generalized linear models, we develop various adaptive Bayesian criteria. Using a hierarchical mixture setup for model uncertainty, combined with an integrated Laplace approximation, we derive Empirical Bayes and Fully Bayes criteria that can be computed easily and quickly. The performance of these criteria is assessed via simulation and compared to other criteria such as AIC and BIC on normal, logistic and Poisson regression model classes. A Fully Bayes criterion based on a restricted region hyperprior seems to be the most promising.

Keywords: AIC; BIC; EMPIRICAL BAYES; FULLY BAYES; LAPLACE APPROXIMATION.

*Xinlei Wang is Assistant Professor, Department of Statistical Science, Southern Methodist University, 3225 Daniel Avenue, 107 Heroy Building, Dallas, Texas 75275-0332, [email protected]. Edward I. George is the Universal Furniture Professor, Statistics Department, The Wharton School, 3730 Walnut Street 400 JMHH, Philadelphia, PA 19104-6340, [email protected]. This work was supported by NSF grant DMS-0130819.

1 Introduction

The variable selection problem for Generalized Linear Models (GLMs) may be stated as follows. Suppose we observe $Y = (y_1, \ldots, y_n)^T$, which follows an exponential family distribution

$$p(Y \mid \theta, \phi) = \prod_{i=1}^{n} \exp\left\{ \frac{y_i \theta_i - b(\theta_i)}{\phi} + c(y_i, \phi) \right\}, \tag{1}$$

where $\theta = (\theta_1, \ldots, \theta_n)^T$ and $\phi$ are unknown parameters that may depend on $p$ observed variables $X_1, \ldots, X_p$. Let $\gamma = 1, 2, \ldots, 2^p$ index all subsets of these variables and let $q_\gamma$ denote the size of the $\gamma$th subset.
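As a quick illustration of the exponential family form (1), the following sketch checks numerically that the Poisson distribution fits it, taking $\theta_i = \log \mu$, $b(\theta) = e^{\theta}$, $\phi = 1$ and $c(y, \phi) = -\log y!$. The function names, the mean `mu` and the counts `ys` are illustrative assumptions, not taken from the paper.

```python
import math

def expfam_log_density(y, theta, phi, b, c):
    """Log of one factor in (1): (y*theta - b(theta)) / phi + c(y, phi)."""
    return (y * theta - b(theta)) / phi + c(y, phi)

# Poisson pieces (standard identities): b(theta) = exp(theta), c(y, phi) = -log(y!)
def poisson_b(theta):
    return math.exp(theta)

def poisson_c(y, phi):
    return -math.lgamma(y + 1)  # lgamma(y + 1) equals log(y!)

mu = 3.2               # hypothetical common mean
ys = [0, 1, 4, 7]      # hypothetical observed counts
theta = math.log(mu)   # canonical parameter of the Poisson

# Log-likelihood through form (1) versus the textbook Poisson pmf
log_lik_form1 = sum(
    expfam_log_density(y, theta, 1.0, poisson_b, poisson_c) for y in ys
)
log_lik_direct = sum(
    math.log(math.exp(-mu) * mu**y / math.factorial(y)) for y in ys
)
print(abs(log_lik_form1 - log_lik_direct) < 1e-9)
```

Swapping in another pair `b`, `c` (for example the Bernoulli pieces $b(\theta) = \log(1 + e^{\theta})$, $c = 0$) would exercise the same form for logistic regression.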
Then the vaguely stated problem we consider is that of selecting the "best" model of the form

$$g(E(Y)) = X_\gamma \beta_\gamma, \tag{2}$$

where $g$ is a known link function, $X_\gamma$ is an $n \times (q_\gamma + 1)$ design matrix with 1's in the first column and the $\gamma$th subset of $X_j$'s in the remaining columns, and $\beta_\gamma$ is a $(q_\gamma + 1) \times 1$ vector of regression coefficients.

There has been substantial recent interest in Bayesian variable selection for GLMs, for example Raftery and Richardson, 1993; George, McCulloch, and Tsay, 1994; Raftery, 1996; Dellaportas and Forster, 1999; Clyde, 1999; Dellaportas, Forster and Ntzoufras, 2000 and 2002; Ntzoufras, Dellaportas and Forster, 2002; and Meyer and Laud, 2002. In this paper, we propose new selection criteria for GLMs based on extensions of the hierarchical Bayes formulations of George and Foster (2000) and Cui (2002). These extensions are obtained using an integrated Laplace approximation that yields analytical tractability, thereby bypassing the need for computation via simulation methods such as MCMC. By choosing particular hyperparameter values, we obtain model posteriors with modes corresponding to the commonly used AIC and BIC selection criteria for GLMs. We then proceed to develop and evaluate new selection criteria based on both Empirical Bayes (EB) and Fully Bayes (FB) approaches. Simulation evaluations are used to compare the performance of the various criteria for normal, logistic and Poisson linear models.

The article is organized as follows. Section 2 introduces a general hierarchical mixture Bayesian setup for the variable selection problem, and Section 3 describes a particular implementation for GLMs. Section 4 develops an analytically tractable integrated Laplace approximation for GLMs with canonical links. Section 5 proposes particular EB and FB selection criteria based on this approximation. Section 6 describes the straightforward generalization of all these results for noncanonical link GLMs.
Section 7 provides a simulation evaluation and comparison of various selection criteria including ours. Section 8 concludes with a discussion.

2 A Hierarchical Bayes Setup for Variable Selection

To model variable selection uncertainty for the general GLM setup in (1) and (2), we consider prior formulations of the form

$$\pi(\beta_\gamma, \gamma \mid \psi_1, \psi_2) = \pi(\beta_\gamma \mid \gamma, \psi_2)\, \pi(\gamma \mid \psi_1), \tag{3}$$

where $\psi_1$ and $\psi_2$ are unknown hyperparameters indexing the priors on $\gamma$ and $\beta_\gamma$, respectively. Such prior distributions lead to posterior distributions over $\gamma$ of the form

$$\pi(\gamma \mid Y, \psi_1, \psi_2) = \frac{p(Y \mid \gamma, \psi_2)\, \pi(\gamma \mid \psi_1)}{\sum_{\gamma} p(Y \mid \gamma, \psi_2)\, \pi(\gamma \mid \psi_1)}, \tag{4}$$

where

$$p(Y \mid \gamma, \psi_2) = \int p(Y \mid \beta_\gamma, \gamma)\, \pi(\beta_\gamma \mid \gamma, \psi_2)\, d\beta_\gamma \tag{5}$$

is the marginal distribution of the data $Y$ given $\gamma$ and $\psi_2$.

To deal with the unknown hyperparameters $\psi_1$ and $\psi_2$, we consider two basic approaches: (1) an Empirical Bayes (EB) approach that estimates $\psi_1$ and $\psi_2$ based on the data, and then uses $\pi(\gamma \mid Y, \hat{\psi}_1, \hat{\psi}_2)$ as the basis for selection, and (2) a Fully Bayes (FB) approach that puts priors on $\psi_1$ and $\psi_2$, integrates them out, and then uses $\pi(\gamma \mid Y)$ as the basis for selection. Note that

$$\begin{aligned}
\pi(\gamma \mid Y) &= \iint_D \pi(\gamma \mid Y, \psi_1, \psi_2)\, \pi(\psi_1, \psi_2 \mid Y)\, d\psi_1\, d\psi_2 \\
&= \iint_D \frac{p(Y \mid \gamma, \psi_2)\, \pi(\gamma \mid \psi_1)}{p(Y \mid \psi_1, \psi_2)} \cdot \frac{p(Y \mid \psi_1, \psi_2)\, \pi(\psi_1, \psi_2)}{p(Y)}\, d\psi_1\, d\psi_2 \\
&= \iint_D \frac{p(Y \mid \gamma, \psi_2)\, \pi(\gamma \mid \psi_1)}{p(Y)} \cdot \pi(\psi_1, \psi_2)\, d\psi_1\, d\psi_2,
\end{aligned} \tag{6}$$

where $p(Y \mid \gamma, \psi_2)$ is given by (5), and $D$ is the region of all possible $(\psi_1, \psi_2)$ values under the prior $\pi(\psi_1, \psi_2)$. It is often reasonable to assume $\psi_1$ and $\psi_2$ are a priori independent, in which case $\pi(\psi_1, \psi_2) = \pi(\psi_1)\, \pi(\psi_2)$.

Implementation of the EB and FB approaches requires prior forms for both $\pi(\beta_\gamma \mid \gamma, \psi_2)$ and $\pi(\gamma \mid \psi_1)$, and for the FB approach, $\pi(\psi_1, \psi_2)$ is also needed. Such choices must confront the difficulty that the integration to obtain $p(Y \mid \gamma, \psi_2)$ in (5) is analytically intractable for most GLMs. This computational difficulty has previously been addressed using Laplace approximations and Monte Carlo methods (Kass and Raftery, 1995; Raftery, 1996), and by transformations to the more tractable normal case (Clyde, 1999).
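To make the normalization in (4) concrete, here is a minimal sketch that turns log marginal likelihoods and log prior probabilities into posterior model probabilities, working on the log scale for numerical stability. The log marginal values are made up for illustration, and the independent Bernoulli prior with inclusion probability `w` is only an assumed stand-in for $\pi(\gamma \mid \psi_1)$, not a form stated in the text so far.

```python
import math

def model_posterior(log_marginals, log_priors):
    """Posterior over models as in (4), computed with a log-sum-exp."""
    log_joint = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(log_joint)
    # log of the denominator in (4), shifted by the maximum to avoid underflow
    log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint))
    return [math.exp(v - log_norm) for v in log_joint]

def bernoulli_log_prior(q, p, w):
    """Assumed prior: each of p variables is included independently w.p. w."""
    return q * math.log(w) + (p - q) * math.log(1.0 - w)

# Hypothetical setting: p = 2 candidate variables, so 2^p = 4 subsets
sizes = [0, 1, 1, 2]                               # q_gamma for each subset
log_marginals = [-120.0, -112.5, -119.0, -111.8]   # made-up log p(Y | gamma, psi2)
log_priors = [bernoulli_log_prior(q, p=2, w=0.5) for q in sizes]

posterior = model_posterior(log_marginals, log_priors)
best = max(range(len(posterior)), key=posterior.__getitem__)
print(posterior, best)
```

With `w = 0.5` every subset gets the same prior weight, so the ranking is driven by the marginal likelihoods alone; a smaller `w` would shift posterior mass toward smaller subsets.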
In the next section, we propose general priors for $\gamma$ and $\beta_\gamma$ which, when combined with an integrated Laplace approximation to $p(Y \mid \gamma, \psi_2)$, yield tractable and accurate large sample approximations for (4) and (6).

3 GLM Implementations

For simplicity, we begin by restricting attention to GLMs with canonical links, in which case $\theta = X\beta$ and the link function is $g(\cdot) = b'^{-1}(\cdot)$. Straightforward extensions for noncanonical link functions are described later in Section 6. Under a canonical link, the $\gamma$th model for $Y$ in (1) and (2) may be expressed as

$$p(Y \mid \beta_\gamma, \gamma) = \exp\left\{ \frac{Y^T X_\gamma \beta_\gamma - b^T(X_\gamma \beta_\gamma) \cdot \mathbf{1}}{\phi} + c^T(Y, \phi) \cdot \mathbf{1} \right\}, \tag{7}$$

where $b(\theta) = (b(\theta_1), b(\theta_2), \cdots, b(\theta_n))^T$, $c(Y, \phi) = (c(y_1, \phi), c(y_2, \phi), \cdots, c(y_n, \phi))^T$ and $\mathbf{1}$ is the $n \times 1$ vector of all 1's.

For the prior on $\gamma$, we follow George and Foster (2000) and

