UNL STAT 870 - Chapter 8 Regression models for quantitative and qualitative predictors

Chapter 8: Regression models for quantitative and qualitative predictors

© 2012 Christopher R. Bilder

8.1 Polynomial regression models

First-order model: $E(Y_i) = \beta_0 + \beta_1 X_{i1} = \beta_0 + \beta_1 X_{i1}^1$

[Plot: Y versus X for the first-order model]

Second-order model: $E(Y_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2$

[Plots: Y versus X for the second-order model, one panel with $\beta_2 < 0$ and one with $\beta_2 > 0$]

$X_{i1}^2$ is called the second-order or "quadratic" term of the model. It allows for curvature in the relationship between X and Y. The sign of $\beta_2$ determines whether the curve opens upward ($\beta_2 > 0$) or downward ($\beta_2 < 0$).

Since $X_{i1}^2$ is a transformation of $X_{i1}$, these two model terms can be highly correlated, leading to multicollinearity and problems with inverting $\mathbf{X}'\mathbf{X}$. To partially avoid this, the predictor variable can be transformed to be deviations from its mean, $Z_{i1} = X_{i1} - \bar{X}_1$. The second-order model becomes

$E(Y_i) = \beta_0 + \beta_1 Z_{i1} + \beta_2 Z_{i1}^2$.

Notes:
- KNN use a lowercase "script X" for Z here.
- Usually, I will only use this transformation when I see signs of multicollinearity and its effects on inferences. For example, if I see a VERY large standard error for a model coefficient when the transformation is not made, I will examine whether this still occurs when the transformation is made.

Example: NBA guard data (nba_ch8.R)

Examine the age variable. Some basketball fans may believe:
1) Younger guards are learning the game and do not perform as well as more experienced guards.
2) Older guards' performance decreases past a certain age.
3) Guards reach a peak performance level in their late 20's.

If the above statements were true, one might expect to see a plot something like:

[Plot: PPM versus Age, a curve with $\beta_2 < 0$; x-axis marked at early 20's, late 20's, and mid 30's]

Consider the model $E(Y_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i1}^2$ where $Y_i$ = PPM and $X_{i1}$ = Age. Also, consider the model $E(Y_i) = \beta_0 + \beta_1 Z_{i1} + \beta_2 Z_{i1}^2$ where $Y_i$ = PPM and $Z_{i1} = X_{i1} - \bar{X}_1$.

NOTE: The $\beta$'s are NOT the same for the two different models! I used the "$\beta$" notation to be consistent with the notation used before. If you are uncomfortable with this notation, a different symbol could be substituted for $\beta$ in the second model.
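
To see how the parameters of the two models relate, one can expand the mean-centered model in powers of $X_{i1}$. The starred symbols below are an assumed notation introduced only for this sketch, standing for the parameters of the centered model; they are not notation from the notes.

```latex
\begin{align*}
E(Y_i) &= \beta_0^* + \beta_1^* Z_{i1} + \beta_2^* Z_{i1}^2,
          \qquad Z_{i1} = X_{i1} - \bar{X}_1 \\
       &= \beta_0^* + \beta_1^* (X_{i1} - \bar{X}_1)
          + \beta_2^* (X_{i1} - \bar{X}_1)^2 \\
       &= \left(\beta_0^* - \beta_1^* \bar{X}_1 + \beta_2^* \bar{X}_1^2\right)
          + \left(\beta_1^* - 2\beta_2^* \bar{X}_1\right) X_{i1}
          + \beta_2^* X_{i1}^2
\end{align*}
```

Matching terms with the uncentered model gives $\beta_0 = \beta_0^* - \beta_1^* \bar{X}_1 + \beta_2^* \bar{X}_1^2$, $\beta_1 = \beta_1^* - 2\beta_2^* \bar{X}_1$, and $\beta_2 = \beta_2^*$. Only the quadratic coefficient is shared by the two parameterizations.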
Fit the models and determine if $\text{age}^2$ should be in the model.

R code and output:

    > nba<-read.table(file = "C:\\chris\\UNL\\STAT870\\Chapter6\\nba_data.txt", header=TRUE, sep = "")
    > mod.fit1<-lm(formula = PPM ~ age + I(age^2), data = nba)
    > summary(mod.fit1)

    Call:
    lm(formula = PPM ~ age + I(age^2), data = nba)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.255059 -0.083069 -0.001772  0.058228  0.396231

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.6760076  0.7220455  -0.936    0.351
    age          0.0802913  0.0514092   1.562    0.121
    I(age^2)    -0.0014443  0.0009053  -1.595    0.114

    Residual standard error: 0.1155 on 102 degrees of freedom
    Multiple R-Squared: 0.02634,     Adjusted R-squared: 0.007247
    F-statistic:  1.38 on 2 and 102 DF,  p-value: 0.2563

    > mod.fit2<-lm(formula = PPM ~ I(age - mean(age)) + I((age - mean(age))^2), data = nba)
    > summary(mod.fit2)

    Call:
    lm(formula = PPM ~ I(age - mean(age)) + I((age - mean(age))^2), data = nba)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.255059 -0.083069 -0.001772  0.058228  0.396231

    Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
    (Intercept)             0.4397844  0.0151822  28.967   <2e-16 ***
    I(age - mean(age))      0.0007589  0.0036605   0.207    0.836
    I((age - mean(age))^2) -0.0014443  0.0009053  -1.595    0.114
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1155 on 102 degrees of freedom
    Multiple R-Squared: 0.02634,     Adjusted R-squared: 0.007247
    F-statistic:  1.38 on 2 and 102 DF,  p-value: 0.2563

    > mean(nba$age)
    [1] 27.53333
    > mod.fit2$coefficients
               (Intercept)     I(age - mean(age)) I((age - mean(age))^2)
              0.4397843689           0.0007589219          -0.0014442934
    > cor(x = nba$age, y = nba$age^2)
    [1] 0.9978604
    > cor(x = nba$age - mean(nba$age), y = (nba$age - mean(nba$age))^2)
    [1] 0.3960341

    > plot(x = nba$age, y = nba$PPM, xlab = "Age", ylab = "PPM", main = "PPM vs. Age", panel.first = grid(col = "gray", lty = "dotted"))
    > curve(expr = predict(object = mod.fit1, newdata = data.frame(age = x)), col = "red", lty = "solid", lwd = 1, add = TRUE, from = min(nba$age), to = max(nba$age))

[Plot: "PPM vs. Age" scatter plot with the fitted quadratic curve overlaid; x-axis Age, y-axis PPM]

Notes:
1) Notice how the I() (identity) function is used in the formula statements of lm(). The I() function protects the meaning of what is inside of it. Note that just writing age^2 without the function will not work properly. We will see later on that syntax like (var1 + var2)^2 means to include var1, var2, and var1×var2 in the model (all "main effects" and "interactions"). Thus, age^2 just means age to R because there are no other terms with it.
2) The scale() function can also be used to find the mean-adjusted values. See the example in the program.
3) There is a strong positive correlation between age and $\text{age}^2$ (0.998). The correlation between the mean-adjusted age terms is much weaker (0.396).
4) The sample regression model using the mean-adjusted age variable is:

$$
\begin{aligned}
\hat{Y} &= 0.4397843689 + 0.0007589219\,Z - 0.0014442934\,Z^2 \\
        &= 0.4397843689 + 0.0007589219\,(X - 27.53) - 0.0014442934\,(X - 27.53)^2 \\
        &= -0.6760076 + 0.08029134\,X - 0.0014442934\,X^2
\end{aligned}
$$

Therefore, the model using the transformed X (Z) is the same as the model using X directly.
5) The overall F test has a p-value of 0.2563 for both models, indicating there is not a significant relationship between PPM and age, $\text{age}^2$. If the overall F test rejected $H_0$, then a test for $\text{age}^2$ would be appropriate to determine if there is a quadratic relationship between age and PPM.
6) For illustrative purposes, the p-value for testing $H_0: \beta_2 = 0$ vs. $H_a: \beta_2 \neq 0$ is 0.114 for both models. This would indicate there is marginal evidence that $\text{age}^2$ is needed.
7) Given that we received the same p-values for the overall F-test (shown in 5) and both models are the same (shown in 4), one does not need to worry about potential problems with using non-transformed predictor variables for this data set.
8) The scatter plot with the sample regression model does not show a strong relationship between PPM and age.
9) The interpretation of the $b_j$ values cannot be done the same way as before (i.e., for every one unit increase in $X_j$, $\hat{Y}$ increases by $b_j$). The age term itself is not interpreted
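
Several points in the notes above (the identical fits, the drop in correlation after centering, the scale() alternative, and what happens to age^2 without I()) can be checked with a short simulation. The sketch below does not use the NBA data: the data frame `sim`, its generating equation, and the seed are hypothetical stand-ins chosen only to have a similar shape.

```r
# Simulated stand-in for the NBA data (hypothetical values, NOT the real data set)
set.seed(8712)
n <- 105
age <- sample(x = 22:37, size = n, replace = TRUE)
PPM <- 0.44 - 0.0015 * (age - 28)^2 + rnorm(n = n, mean = 0, sd = 0.1)
sim <- data.frame(age = age, PPM = PPM)

# Quadratic fit on the raw scale and on the mean-centered scale
fit.raw <- lm(formula = PPM ~ age + I(age^2), data = sim)
fit.cen <- lm(formula = PPM ~ I(age - mean(age)) + I((age - mean(age))^2),
              data = sim)

# The two parameterizations give the same fitted values and the same
# quadratic coefficient; only the intercept and linear coefficient differ
same.fit <- all.equal(unname(fitted(fit.raw)), unname(fitted(fit.cen)))
same.b2 <- all.equal(unname(coef(fit.raw)[3]), unname(coef(fit.cen)[3]))

# Correlation between the linear and quadratic terms, before and after centering
cor.raw <- cor(x = sim$age, y = sim$age^2)
z <- sim$age - mean(sim$age)
cor.cen <- cor(x = z, y = z^2)

# scale() with scale = FALSE only subtracts the mean, matching z above
z.scale <- as.numeric(scale(x = sim$age, center = TRUE, scale = FALSE))

# Without I(), age^2 is read with formula syntax and collapses to age alone
fit.noI <- lm(formula = PPM ~ age^2, data = sim)
names(coef(fit.noI))
```

Here `same.fit` and `same.b2` should both be TRUE, `cor.cen` should be much smaller in magnitude than `cor.raw`, and the last line should list only an intercept and `age`, with no squared term.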

