UW-Madison STAT 333 - AIC

Statistics 333: Cp, AIC, and BIC, Spring 2003
Bret Larget, April 7, 2003

There is a discrepancy in R output from the functions step, AIC, and BIC over how to compute the AIC. The discrepancy is not very important, because it amounts to an additive constant that cancels when using AIC or BIC to compare two models. But it helps to understand the difference so that you can compare output from these functions.

AIC and BIC are based on the maximum likelihood estimates of the model parameters. In maximum likelihood, the idea is to estimate parameters so that, under the model, the probability of the observed data would be as large as possible. For discrete data the likelihood is this probability and lies between 0 and 1; for continuous data, as in regression with normal errors, it is a probability density, which can exceed 1. It is common to consider likelihoods on a log scale. Logarithms of numbers between 0 and 1 are negative, so log-likelihoods are typically negative numbers. It is also common to multiply log-likelihoods by −2, for reasons we will not explore.

In a regression setting, the estimates of the $\beta_i$ based on least squares and the maximum likelihood estimates are identical. The difference comes from estimating the common variance $\sigma^2$ of the normal distribution for the errors around the true means. We have been using the best unbiased estimator of $\sigma^2$, $\hat\sigma^2 = \mathrm{RSS}/(n - p)$, where there are $p$ parameters for the means ($p$ different $\beta_i$ parameters) and RSS is the residual sum of squares. This estimate does not tend to be too large or too small on average. The maximum likelihood estimate, on the other hand, is $\mathrm{RSS}/n$. This estimate has a slight negative bias, but also has a smaller variance.

Putting all of this together, we can write −2 times the log-likelihood in a regression setting as

$$n + n \log(2\pi) + n \log(\mathrm{RSS}/n).$$

Now, AIC is defined to be −2 times the log-likelihood plus 2 times the number of parameters. If there are $p$ different $\beta_i$ parameters, there are a total of $p + 1$ parameters if we also count $\sigma^2$. The correct formula for the AIC for a model with parameters $\beta_0, \ldots, \beta_{p-1}$ and $\sigma^2$ is

$$\mathrm{AIC} = n + n \log(2\pi) + n \log(\mathrm{RSS}/n) + 2(p + 1)$$

and the correct formula for BIC is

$$\mathrm{BIC} = n + n \log(2\pi) + n \log(\mathrm{RSS}/n) + (\log n)(p + 1).$$

This is what the functions AIC and BIC calculate in R. The AIC and BIC formulas in your textbook ignore the leading two terms $n + n \log(2\pi)$ and use $p$ instead of $p + 1$. When comparing AIC or BIC between two models, however, it makes no difference which formula you use, because the difference between the two models is the same under either choice.

> case1201 = read.table("sleuth/case1201.csv", header = T, sep = ",")
> attach(case1201)
> keep <- STATE != "Alaska"
> x <- data.frame(SAT = SAT[keep], ltakers = log(TAKERS[keep]),
+     income = INCOME[keep], years = YEARS[keep], public = PUBLIC[keep],
+     expend = EXPEND[keep], rank = RANK[keep])
> detach(case1201)
> attach(x)

Example Computation in R. AIC is part of the base package. You can find the BIC using the AIC function with the option k = log(n), or you can load the nonlinear mixed effects library and call the BIC function directly. Here is an example that demonstrates the above ideas.

> library(nlme)
Loading required package: nls
Loading required package: lattice
Loading required package: grid
> n <- nrow(x)
> fit0 <- lm(SAT ~ 1)
> summary(fit0)

Call:
lm(formula = SAT ~ 1)

Residuals:
    Min      1Q  Median      3Q     Max
-158.45  -59.45   19.55   50.55  139.55

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   948.45      10.21   92.86   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 71.5 on 48 degrees of freedom

> rss0 <- sum(residuals(fit0)^2)
> n + n * log(2 * pi) + n * log(rss0/n) + 2 * 2
[1] 560.4736
> AIC(fit0)
[1] 560.4736
> n + n * log(2 * pi) + n * log(rss0/n) + log(n) * 2
[1] 564.2573
> AIC(fit0, k = log(n))
[1] 564.2573
> BIC(fit0)
[1] 564.2573
> fit1 <- lm(SAT ~ ltakers)
> summary(fit1)

Call:
lm(formula = SAT ~ ltakers)

Residuals:
    Min      1Q  Median      3Q     Max
-93.328 -21.380   4.154  22.614  50.794

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1112.408     12.386   89.81   <2e-16 ***
ltakers      -59.175      4.167  -14.20   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 31.41 on 47 degrees of freedom
Multiple R-Squared: 0.811,     Adjusted R-squared: 0.807
F-statistic: 201.7 on 1 and 47 DF,  p-value: < 2.2e-16

> rss1 <- sum(residuals(fit1)^2)
> n + n * log(2 * pi) + n * log(rss1/n) + 2 * 3
[1] 480.832
> AIC(fit1)
[1] 480.832
> n + n * log(2 * pi) + n * log(rss1/n) + log(n) * 3
[1] 486.5075
> AIC(fit1, k = log(n))
[1] 486.5075
> BIC(fit1)
[1] 486.5075

The criteria AIC and BIC will not always lead to the same model.
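One way to see why the two criteria can disagree is to write out how each one changes when a single variable is added. The following is a short worked sketch based on the AIC and BIC formulas given earlier; the subscripts "old" and "new" are notation introduced here for the model before and after the addition, not notation from the handout.

```latex
\begin{align*}
\Delta\mathrm{AIC} &= \mathrm{AIC}_{\text{new}} - \mathrm{AIC}_{\text{old}}
  = 2 - n \log\!\left(\frac{\mathrm{RSS}_{\text{old}}}{\mathrm{RSS}_{\text{new}}}\right), \\
\Delta\mathrm{BIC} &= \mathrm{BIC}_{\text{new}} - \mathrm{BIC}_{\text{old}}
  = \log n - n \log\!\left(\frac{\mathrm{RSS}_{\text{old}}}{\mathrm{RSS}_{\text{new}}}\right).
\end{align*}
```

The terms $n + n \log(2\pi)$ cancel in both differences. AIC prefers the larger model when $n \log(\mathrm{RSS}_{\text{old}}/\mathrm{RSS}_{\text{new}}) > 2$, while BIC requires it to exceed $\log n$. With $n = 49$, as in this data set, $\log n \approx 3.89$, so a variable whose value falls between 2 and 3.89 lowers AIC but raises BIC.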
Compare the results from these forward selections.

Forward Selection using AIC

> step(lm(SAT ~ 1), SAT ~ ltakers + income + years + public + expend +
+     rank, direction = "forward")
Start:  AIC= 419.42
 SAT ~ 1

          Df Sum of Sq    RSS AIC
+ ltakers  1    199007  46369 340
+ rank     1    190297  55079 348
+ income   1    102026 143350 395
+ years    1     26338 219038 416
<none>                 245376 419
+ public   1      1232 244144 421
+ expend   1       386 244991 421

Step:  AIC= 339.78
 SAT ~ ltakers

         Df Sum of Sq   RSS AIC
+ expend  1     20523 25846 313
+ years   1      6364 40006 335
<none>                46369 340
+ rank    1       871 45498 341
+ income  1       785 45584 341
+ public  1       449 45920 341

Step:  AIC= 313.14
 SAT ~ ltakers + expend

         Df Sum of Sq     RSS   AIC
+ years   1    1248.2 24597.6 312.7
+ rank    1    1053.6 24792.2 313.1
<none>                25845.8 313.1
+ income  1      53.3 25792.5 315.0
+ public  1       1.3 25844.5 315.1

Step:  AIC= 312.71
 SAT ~ ltakers + expend + years

         Df Sum of Sq     RSS   AIC
+ rank    1    2675.5 21922.1 309.1
<none>                24597.6 312.7
+ public  1     287.8 24309.8 314.1
+ income  1      19.2 24578.4 314.7

Step:  AIC= 309.07
 SAT ~ ltakers + expend + years + rank

         Df Sum of Sq     RSS   AIC
<none>                21922.1 309.1
+ income  1     505.4 21416.7 309.9
+ public  1     185.0 21737.1 310.7

Call:
lm(formula = SAT ~ ltakers + expend + years + rank)

Coefficients:
(Intercept)      ltakers       expend        years         rank
    399.115      -38.100        3.996       13.147        4.400

Forward Selection using BIC

> n <- nrow(x)
> step(lm(SAT ~ 1), SAT ~ ltakers + income + years + public + expend +
+     rank, direction = "forward", k = log(n))
Start:  AIC= 421.31
 SAT ~ 1

          Df Sum of Sq    RSS AIC
+ ltakers  1    199007  46369 344
+ rank     1    190297  55079 352
+ income   1    102026 143350 399
+ years    1     26338 219038 420
<none>                 245376 421
+ public   1      1232 244144 425
+ expend   1       386 244991 425

Step:  AIC= 343.56
 SAT ~ ltakers

          Df Sum of
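
The AIC values printed by step can be reconciled with those from the AIC function using only numbers quoted in the output above. The sketch below assumes that step reports the shorter textbook form, n log(RSS/n) + k p, which is consistent with the printed starting value 419.42. The helper names short_ic and full_ic are made up for this illustration and are not R built-ins, and n = 49 is inferred from the 48 residual degrees of freedom of the intercept-only model.

```r
# Values copied from the output above; n = 49 (48 residual df + 1 mean parameter).
n <- 49

# RSS for SAT ~ 1 and for SAT ~ ltakers, as printed in the step tables.
rss_null    <- 245376
rss_ltakers <- 46369

# Shorter textbook-style criterion: n * log(RSS / n) + k * p,
# where p counts only the mean parameters. (Hypothetical helper.)
short_ic <- function(rss, p, n, k = 2) n * log(rss / n) + k * p

# Full criterion from the handout: adds n + n * log(2 * pi) and counts sigma^2.
# (Hypothetical helper.)
full_ic <- function(rss, p, n, k = 2) {
  n + n * log(2 * pi) + n * log(rss / n) + k * (p + 1)
}

aic_short_null <- short_ic(rss_null, p = 1, n = n)  # compare with "Start: AIC= 419.42"
aic_full_null  <- full_ic(rss_null, p = 1, n = n)   # compare with AIC(fit0) = 560.4736

# With k = 2, the two versions differ by the same constant for every model.
offset <- n + n * log(2 * pi) + 2
print(c(short = aic_short_null, full = aic_full_null, offset = offset))

# The difference between two models is the same under either formula.
d_short <- short_ic(rss_ltakers, p = 2, n = n) - short_ic(rss_null, p = 1, n = n)
d_full  <- full_ic(rss_ltakers, p = 2, n = n) - full_ic(rss_null, p = 1, n = n)
print(c(d_short = d_short, d_full = d_full))
```

Because the RSS values are copied from rounded output, the results agree with the printed AIC values only to about two decimal places. Passing k = log(n) to either helper gives the corresponding BIC version.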

