UW-Madison STAT 849 - Clarification - D2437262

Home> Schools> University of Wisconsin, Madison> Statistics (STAT) > STAT 849> Clarification

DOC PREVIEW

UW-Madison STAT 849 - Clarification

School name University of Wisconsin, Madison

Course Stat 849- Theory and Application of Regression and Analysis of Variance I

Pages 6

This preview shows page 1-2 out of 6 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 6 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 6 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 6 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Statistics 849 Clarification 2010-11-03 (p. 1)What I should have described in class todaySorry for botching the description of the deviance calculation in a generalized linear model and howit relates to the function dev.resids in a glm family object in R.Once again Chen Zuo pointed out the part that I had missed.We need to use positive contributions to the deviance for each observation if we are later to takethe square roots of these quantities. To ensure that these are all positive we establish a baseline levelfor the deviance, which is the lowest possible value of the deviance, corresponding to a saturated modelin which each observation has one parameter associated with it.Another value quoted in the summary output is the null deviance, which is the deviance for thesimplest possible model (the “null model”) corresponding to a formula like> data(Contraception, package = "mlmRev")> summary(cm0 <- glm(use ~ 1, binomial, Contraception))Call:glm(formula = use ~ 1, family = binomial, data = Contraception)Deviance Residuals:Min 1Q Median 3Q Max-0.9983 -0.9983 -0.9983 1.3677 1.3677Coefficients:Estimate Std. Error z value Pr(>|z|)(Intercept) -0.43702 0.04657 -9.385 <2e-16(Dispersion parameter for binomial family taken to be 1)Null deviance: 2590.9 on 1933 degrees of freedomResidual deviance: 2590.9 on 1933 degrees of freedomAIC: 2592.9Number of Fisher Scoring iterations: 4Notice that the coefficient estimate, -0.437022, is the logit transformation of the proportion of positiveresponses> str(y <- as.integer(Contraception[["use"]]) - 1L)int [1:1934] 0 0 0 0 0 0 0 0 0 0 ...> mean(y)[1] 0.3924509> qlogis(mean(y))[1] -0.4370216> all.equal(qlogis(mean(y)), coef(cm0), check.attr = FALSE)[1] TRUEIn the Poisson model, the deviance isd(β|y) = −2 log(p(y|µ)) = −2nXi=1(log(yi!) − µi+ yilog(µi)) .Statistics 849 Clarification 2010-11-03 (p. 2)The deviance for the saturated model is this expression evaluated at µ = y. The difference betweenthe deviance for the given model (and parameters) versus that of the saturated model deviance isd(β|y) − dS(y) = −2nXi=1(log(yi!) − µi+ yilog(µi) − log(yi!) + yi− yilog(yi))= 2nXi=1(yi(log(yi/µi)) − (yi− µi))which is what is evaluated in> poisson()$dev.residsfunction (y, mu, wt)2*wt*(y*log(ifelse(y == 0, 1, y/mu)) - (y - mu))<environment: 0x1d37da0>For the binomial family, the dev.resids function calls a compiled function> binomial()$dev.residsfunction (y, mu, wt).Call("binomial_dev_resids", y, mu, wt, PACKAGE = "stats")<environment: 0x22edb20>the crux of which is also based on an expression like log(ifelse(y == 0, 1, y/mu)). In C it looks likedouble y_log_y(double y, double mu){return (y) ? (y*log(y/mu)) : 0;}for(i = 0, i < n; i++)dr[i] = 2*wt[i]*(y_log_y(y[i], mu[i]) + y_log_y(1-y[i], 1-mu[i]))Recall that the log-likelihood for the Bernoulli distribution islog(p(y|µ)) = (1 − y) log(1 − µ) + y log µ 0 < µ < 1, y = 0, 1.so the deviance, for a collection of binary responses isd(β|y) = −2nXi=1((1 − yi) log(1 − µi) + yilog(µi)) , 0 < µ < 1, y = 0, 1.The deviance for the saturated model isdS(y) = −2nXi=1((1 − yi) log(1 − yi) + yilog(yi))and the difference isd(β|y) − dS(y) = 2nXi=1(1 − yi) log1 − yi1 − µi+ yilog(yi/µi)taking into account that limyi→0yilog(yi/µi) = 0 (use l’Hˆopital’s rule or just experiment numerically).Statistics 849 Clarification 2010-11-03 (p. 3)> xx <- 10^(-(10:20))> cbind(xx, xx*log(xx))xx[1,] 1e-10 -2.302585e-09[2,] 1e-11 -2.532844e-10[3,] 1e-12 -2.763102e-11[4,] 1e-13 -2.993361e-12[5,] 1e-14 -3.223619e-13[6,] 1e-15 -3.453878e-14[7,] 1e-16 -3.684136e-15[8,] 1e-17 -3.914395e-16[9,] 1e-18 -4.144653e-17[10,] 1e-19 -4.374912e-18[11,] 1e-20 -4.605170e-19In the case of a binomial distribution the mean, µi, is taken to be the probability of a positiveresponse and the observed data, yi, are the proportions of observed positive responses. The priorweights, wi, are the number of observations, ni. Then the likelihood islog(p(y|µ)) = lognny+ n [(1 − y) log(1 − µ) + y log µ] , 0 < µ < 1, 0 ≤ y ≤ 1and a similar derivation shows that this provides the difference d(β|y, w) − dS(y, w) is as above.In the model we fit in class> Contraception <- within(Contraception, ch <- factor(livch !=+ 0, labels = c("N", "Y")))> summary(cm3 <- glm(use ~ age*ch + urban + I(age^2), binomial,+ Contraception))Call:glm(formula = use ~ age*ch + urban + I(age^2), family = binomial,data = Contraception)Deviance Residuals:Min 1Q Median 3Q Max-1.4863 -1.0203 -0.6777 1.2206 2.0859Coefficients:Estimate Std. Error z value Pr(>|z|)(Intercept) -1.2593975 0.1985853 -6.342 2.27e-10age -0.0483951 0.0212105 -2.282 0.02251chY 1.1551578 0.2008875 5.750 8.91e-09urbanY 0.7891625 0.1066433 7.400 1.36e-13I(age^2) -0.0054347 0.0008073 -6.732 1.68e-11age:chY 0.0680242 0.0246991 2.754 0.00589(Dispersion parameter for binomial family taken to be 1)Null deviance: 2590.9 on 1933 degrees of freedomResidual deviance: 2409.4 on 1928 degrees of freedomAIC: 2421.4Number of Fisher Scoring iterations: 4the fitted values on the probability scale (i.e. the scale of µ) areStatistics 849 Clarification 2010-11-03 (p. 4)> str(fitted(cm3))Named num [1:1934] 0.31 0.409 0.669 0.614 0.307 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...> str(dr <- binomial()$dev.resids(y, fitted(cm3), wt = rep(1, length(y))))num [1:1934] 0.742 1.051 2.209 1.903 0.734 ...> sum(dr)[1] 2409.377To get the fitted values on other scales, such as the linear predictor scale, we use> str(linpred <- predict(cm3, type = "link"))Named num [1:1934] -0.801 -0.369 0.702 0.463 -0.813 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...If we know that we are using the logit link we could evaluate this as> str(qlogis(fitted(cm3)))Named num [1:1934] -0.801 -0.369 0.702 0.463 -0.813 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...or, if we didn’t know which link we used we could evaluate it as> str(cm3$family$linkfun(fitted(cm3)))Named num [1:1934] -0.801 -0.369 0.702 0.463 -0.813 ...- attr(*, "names")= chr [1:1934] "1" "2" "3" "4" ...because the family argument is stored in the fitted model object.Most of the time the deviance of the saturated model is zeroHaving gone to all the trouble of defining the deviance with respect to the saturated model it turnsout that most of the time the deviance of the saturated model is zero, and there is no need for thecorrection. In the usual cases it differs from zero only for a binomial model that

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2 out of 6 pages.

UW-Madison STAT 849 - Clarification

Sign up for free to view:

Please select your school