DOC PREVIEW
UW-Madison STAT 849 - Clarification

This preview shows page 1-2 out of 6 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Statistics 849 Clarification 2010-11-03 (p. 1)What I should have described in class todaySorry for botching the description of the deviance calculation in a generalized linear model and howit relates to the function dev.resids in a glm family object in R.Once again Chen Zuo pointed out the part that I had missed.We need to use positive contributions to the deviance for each observation if we are later to takethe square roots of these quantities. To ensure that these are all positive we establish a baseline levelfor the deviance, which is the lowest possible value of the deviance, corresponding to a saturated modelin which each observation has one parameter associated with it.Another value quoted in the summary output is the null deviance, which is the deviance for thesimplest possible model (the “null model”) corresponding to a formula like> data(Contraception, package = "mlmRev")> summary(cm0 <- glm(use ~ 1, binomial, Contraception))Call:glm(formula = use ~ 1, family = binomial, data = Contraception)Deviance Residuals:Min 1Q Median 3Q Max-0.9983 -0.9983 -0.9983 1.3677 1.3677Coefficients:Estimate Std. Error z value Pr(>|z|)(Intercept) -0.43702 0.04657 -9.385 <2e-16(Dispersion parameter for binomial family taken to be 1)Null deviance: 2590.9 on 1933 degrees of freedomResidual deviance: 2590.9 on 1933 degrees of freedomAIC: 2592.9Number of Fisher Scoring iterations: 4Notice that the coefficient estimate, -0.437022, is the logit transformation of the proportion of positiveresponses> str(y <- as.integer(Contraception[["use"]]) - 1L)int [1:1934] 0 0 0 0 0 0 0 0 0 0 ...> mean(y)[1] 0.3924509> qlogis(mean(y))[1] -0.4370216> all.equal(qlogis(mean(y)), coef(cm0), check.attr = FALSE)[1] TRUEIn the Poisson model, the deviance isd(β|y) = −2 log(p(y|µ)) = −2nXi=1(log(yi!) − µi+ yilog(µi)) .Statistics 849 Clarification 2010-11-03 (p. 2)The deviance for the saturated model is this expression evaluated at µ = y. The difference betweenthe deviance for the given model (and parameters) versus that of the saturated model deviance isd(β|y) − dS(y) = −2nXi=1(log(yi!) − µi+ yilog(µi) − log(yi!) + yi− yilog(yi))= 2nXi=1(yi(log(yi/µi)) − (yi− µi))which is what is evaluated in> poisson()$dev.residsfunction (y, mu, wt)2*wt*(y*log(ifelse(y == 0, 1, y/mu)) - (y - mu))<environment: 0x1d37da0>For the binomial family, the dev.resids function calls a compiled function> binomial()$dev.residsfunction (y, mu, wt).Call("binomial_dev_resids", y, mu, wt, PACKAGE = "stats")<environment: 0x22edb20>the crux of which is also based on an expression like log(ifelse(y == 0, 1, y/mu)). In C it looks likedouble y_log_y(double y, double mu){return (y) ? (y*log(y/mu)) : 0;}for(i = 0, i < n; i++)dr[i] = 2*wt[i]*(y_log_y(y[i], mu[i]) + y_log_y(1-y[i], 1-mu[i]))Recall that the log-likelihood for the Bernoulli distribution islog(p(y|µ)) = (1 − y) log(1 − µ) + y log µ 0 < µ < 1, y = 0, 1.so the deviance, for a collection of binary responses isd(β|y) = −2nXi=1((1 − yi) log(1 − µi) + yilog(µi)) , 0 < µ < 1, y = 0, 1.The deviance for the saturated model isdS(y) = −2nXi=1((1 − yi) log(1 − yi) + yilog(yi))and the difference isd(β|y) − dS(y) = 2nXi=1(1 − yi) log1 − yi1 − µi+ yilog(yi/µi)taking into account that limyi→0yilog(yi/µi) = 0 (use l’Hˆopital’s rule or just experiment numerically).Statistics 849 Clarification 2010-11-03 (p. 3)> xx <- 10^(-(10:20))> cbind(xx, xx*log(xx))xx[1,] 1e-10 -2.302585e-09[2,] 1e-11 -2.532844e-10[3,] 1e-12 -2.763102e-11[4,] 1e-13 -2.993361e-12[5,] 1e-14 -3.223619e-13[6,] 1e-15 -3.453878e-14[7,] 1e-16 -3.684136e-15[8,] 1e-17 -3.914395e-16[9,] 1e-18 -4.144653e-17[10,] 1e-19 -4.374912e-18[11,] 1e-20 -4.605170e-19In the case of a binomial distribution the mean, µi, is taken to be the probability of a positiveresponse and the observed data, yi, are the proportions of observed positive responses. The priorweights, wi, are the number of observations, ni. Then the likelihood islog(p(y|µ)) = lognny+ n [(1 − y) log(1 − µ) + y log µ] , 0 < µ < 1, 0 ≤ y ≤ 1and a similar derivation shows that this provides the difference d(β|y, w) − dS(y, w) is as above.In the model we fit in class> Contraception <- within(Contraception, ch <- factor(livch !=+ 0, labels = c("N", "Y")))> summary(cm3 <- glm(use ~ age*ch + urban + I(age^2), binomial,+ Contraception))Call:glm(formula = use ~ age*ch + urban + I(age^2), family = binomial,data = Contraception)Deviance Residuals:Min 1Q Median 3Q Max-1.4863 -1.0203 -0.6777 1.2206 2.0859Coefficients:Estimate Std. Error z value Pr(>|z|)(Intercept) -1.2593975 0.1985853 -6.342 2.27e-10age -0.0483951 0.0212105 -2.282 0.02251chY 1.1551578 0.2008875 5.750 8.91e-09urbanY 0.7891625 0.1066433 7.400 1.36e-13I(age^2) -0.0054347 0.0008073 -6.732 1.68e-11age:chY 0.0680242 0.0246991 2.754 0.00589(Dispersion parameter for binomial family taken to be 1)Null deviance: 2590.9 on 1933 degrees of freedomResidual deviance: 2409.4 on 1928 degrees of freedomAIC: 2421.4Number of Fisher Scoring iterations: 4the fitted values on the probability scale (i.e. the scale of µ) areStatistics 849 Clarification 2010-11-03 (p. 4)> str(fitted(cm3))Named num [1:1934] 0.31 0.409 0.669 0.614 0.307 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...> str(dr <- binomial()$dev.resids(y, fitted(cm3), wt = rep(1, length(y))))num [1:1934] 0.742 1.051 2.209 1.903 0.734 ...> sum(dr)[1] 2409.377To get the fitted values on other scales, such as the linear predictor scale, we use> str(linpred <- predict(cm3, type = "link"))Named num [1:1934] -0.801 -0.369 0.702 0.463 -0.813 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...If we know that we are using the logit link we could evaluate this as> str(qlogis(fitted(cm3)))Named num [1:1934] -0.801 -0.369 0.702 0.463 -0.813 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...or, if we didn’t know which link we used we could evaluate it as> str(cm3$family$linkfun(fitted(cm3)))Named num [1:1934] -0.801 -0.369 0.702 0.463 -0.813 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...because the family argument is stored in the fitted model object.Most of the time the deviance of the saturated model is zeroHaving gone to all the trouble of defining the deviance with respect to the saturated model it turnsout that most of the time the deviance of the saturated model is zero, and there is no need for thecorrection. In the usual cases it differs from zero only for a binomial model that


View Full Document

UW-Madison STAT 849 - Clarification

Download Clarification
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Clarification and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Clarification 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?