UNL STAT 870 - Chapter 10: Building the regression model II: Diagnostics


Chapter 10: Building the regression model II: Diagnostics

© 2012 Christopher R. Bilder

10.1 Model adequacy for a predictor variable – added variable plots

Review:

1. Extra sum of squares: Measurement of the reduction of the sums of squares when a predictor variable is added to the model given a set of predictor variables are already in the model. For example, $SSR(X_2 | X_1) = SSE(X_1) - SSE(X_1, X_2)$.

2. $e_i$ vs. $X_{ik}$ plot (residuals vs. $k$th predictor variable): Determine the appropriateness of the specified relationship between $X_k$ and $Y$. For example, if there is a random scattering of points, the relationship between $X_k$ and $Y$ is specified correctly. If there is a relationship between the points of $e_i$ vs. $X_{ik}$ (for example, a quadratic relationship), this suggests $X_{ik}$ is not specified correctly in the model.

The problem with #2 is that it does not necessarily give information about the relationship between $X_k$ and $Y$ given all of the predictor variables in the model.

Solution: Use added variable (partial regression) plots.

Suppose there are only two predictor variables, $X_1$ and $X_2$. The steps to create an added variable plot for $X_1$ are:

1. Find the sample regression model using $Y$ as the response variable and $X_2$ as the predictor variable. Obtain the residuals. Symbolically, $\hat{Y}_i(X_2) = b_0 + b_1 X_{i2}$ and $e_i(Y | X_2) = Y_i - \hat{Y}_i(X_2)$. This is like removing the effect of $X_2$ on $Y$.

2. Find the sample regression model using $X_1$ as the response variable and $X_2$ as the predictor variable. Obtain the residuals. Symbolically, $\hat{X}_{i1}(X_2) = b_0^* + b_1^* X_{i2}$ and $e_i(X_1 | X_2) = X_{i1} - \hat{X}_{i1}(X_2)$. This is like removing the effect of $X_2$ on $X_1$.

3. Plot $e_i(Y | X_2)$ vs. $e_i(X_1 | X_2)$.

If there are more than two predictor variables in the model, then plot $e_i(Y | X_2, X_3, \ldots, X_{p-1})$ vs. $e_i(X_1 | X_2, X_3, \ldots, X_{p-1})$.
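The three steps above can be sketched directly in R. This is only an illustration on simulated data: the variable names, sample size, seed, and true coefficients below are assumptions made for the sketch and are not part of the notes. The sketch also checks two standard facts that tie the plot back to the review items: the slope of the least squares line through the added variable plot matches the estimated coefficient on $X_1$ in the model with both predictors, and the regression sum of squares for that line matches $SSR(X_1 | X_2) = SSE(X_2) - SSE(X_1, X_2)$.

```r
# Simulated data (hypothetical; not the data used in the notes)
set.seed(870)
n  <- 100
X2 <- rnorm(n)
X1 <- 0.5 * X2 + rnorm(n)              # X1 is correlated with X2
Y  <- 1 + 2 * X1 - 1 * X2 + rnorm(n)

# Step 1: regress Y on X2 and keep the residuals e(Y | X2)
fit.Y.X2 <- lm(Y ~ X2)
e.Y.X2   <- residuals(fit.Y.X2)

# Step 2: regress X1 on X2 and keep the residuals e(X1 | X2)
e.X1.X2 <- residuals(lm(X1 ~ X2))

# Step 3: plot e(Y | X2) vs. e(X1 | X2) and add the least squares line
plot(x = e.X1.X2, y = e.Y.X2, xlab = "e(X1 | X2)", ylab = "e(Y | X2)",
     main = "Added variable plot for X1")
av.fit <- lm(e.Y.X2 ~ e.X1.X2)
abline(av.fit)

# Slope of the added variable plot line vs. b1 from the full model
full.fit   <- lm(Y ~ X1 + X2)
slope.av   <- unname(coef(av.fit)[2])
slope.full <- unname(coef(full.fit)["X1"])
all.equal(slope.av, slope.full)

# Regression sum of squares of the plot line vs. SSR(X1 | X2)
ssr.av   <- sum(fitted(av.fit)^2)      # both sets of residuals have mean 0
extra.ss <- deviance(fit.Y.X2) - deviance(full.fit)
all.equal(ssr.av, extra.ss)
```

Both all.equal() calls compare values up to numerical tolerance. With more than two predictors, the same sketch applies after adding the remaining predictors to the right-hand side of the formulas in steps 1 and 2.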
In addition, make the appropriate changes to construct added variable plots for $X_2, X_3, \ldots, X_{p-1}$.

Interpretation:

The added variable plot helps to find the correct functional form of a predictor variable in a multiple regression model. Suppose the only predictor variables are $X_1$ and $X_2$. Below are examples of added variable plots:

[Figure: three example added variable plots of $e(Y | X_2)$ (y-axis) vs. $e(X_1 | X_2)$ (x-axis). Panel titles: "Given X2 in the model, X1 does not give any additional information about Y"; "Given X2 in the model, X1 has a linear relationship with Y"; "Given X2 in the model, X1 has a curvature relationship with Y".]

Notes:

1. Notice how interpreting these plots is somewhat different from interpreting plots of $e_i$ vs. $X_{ik}$.

2. Remember what a t-test does: it tests the linear relationship between $Y$ and $X_k$ given all of the other variables in the model. Therefore, the added variable plots and t-tests partially give the same information. With the added variable plots, there is also information about the type of relationship between $Y$ and $X_k$.

3. The added variable plots are dependent on which predictor variables are present in a model.

4. See Figure 10.2 of KNN for an additional interpretation of added variable plots. This further helps to relate added variable plots to extra sum of squares.

5. Added variable plots also help to identify outliers and “leverage” or influential points (more on this later).

6. Fox (2002) also discusses “component + residual” plots, which are similar to added variable plots. The y-axis is an approximation to the y-axis for the added variable plot. The x-axis is the unadjusted variable of interest. See p. 210 of Fox’s book for more information.

Example: NBA guard data (nba_ch10.R)

From using the model selection procedures in Chapter 9, the “best” model so far includes the variables MPG, Height, FGP, and Age. Next, we want to determine if changes to the predictor variables need to be made.
The avPlots() function in the car package automatically produces these plots.

    > nba <- read.table(file = "C:\\chris\\UNL\\STAT870\\Chapter6\\nba_data.txt", header = TRUE, sep = "")
    > head(nba)
       last.name first.initial games    PPM     MPG height  FTP  FGP age
    1 Abdul-Rauf            M.    80 0.5668 33.8750    185 93.5 45.0  24
    2      Adams            M.    69 0.4086 36.2174    178 85.6 43.9  30
    3      Ainge            D.    81 0.4419 26.7037    196 84.8 46.2  34
    4   Anderson            K.    55 0.4624 36.5455    185 77.6 43.5  23
    5    Anthony            G.    70 0.2719 24.2714    188 67.3 41.5  26
    6 Armstrsong          B.J.    81 0.3998 30.7654    188 86.1 49.9  26
    > mod.fit <- lm(formula = PPM ~ MPG + height + FGP + age, data = nba)
    > library(car)
    > avPlots(model = mod.fit)

[Figure: "Added-Variable Plots", a 2x2 grid with PPM | others on each y-axis. Panels, in row order: MPG | others, height | others, FGP | others, age | others.]

The solid line is a simple linear regression model for the y and x-axis values.

Plots discussion:

1) (1,1): At least a linear relationship, maybe (?) a quadratic relationship.
2) (1,2): A linear relationship
3) (2,1): A linear relationship
4) (2,2): Possibly a linear relationship (maybe not much of one at all)

10.2 Identifying outlying Y observations – studentized deleted residuals

When there is only 1 predictor variable, identifying outliers is not too difficult. Below is a partial reproduction of Figure 10.5 in KNN:

[Figure: "Scatter plot showing outliers", Y vs. X with three labeled points 1, 2, and 3.]

1. Outlying point with respect to its Y value. May not be very influential to the regression model fit since there are similar X values.

2. Outlying point with respect to its X and Y value. May not be very influential to the regression model fit since the Y value is consistent with the others.

3. Outlying point with respect to its X value. May be influential since the X value is outlying and not consistent with respect to the other X values.
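A small simulation can make the three cases above concrete. The sketch below is an illustration only: the simulated data, the seed, and the coordinates of the three extra points are all hypothetical choices, and case 3 is assumed to have a Y value that falls well off the trend of the other points. Each extra point is added to the data one at a time, and the change in the estimated slope is recorded as a rough stand-in for influence on the fit.

```r
# Simulated baseline data (hypothetical; true line is Y = 2 + 3X)
set.seed(105)
X <- seq(from = 1, to = 10, length.out = 20)
Y <- 2 + 3 * X + rnorm(n = 20, mean = 0, sd = 1)
base.slope <- unname(coef(lm(Y ~ X))[2])

# Hypothetical extra points mimicking the three cases:
#  case1: typical X, Y far above the trend
#  case2: outlying X and Y, but Y is on the trend
#  case3: outlying X, Y well below the trend (an assumption of this sketch)
extra <- data.frame(X = c(5, 20, 20),
                    Y = c(2 + 3 * 5 + 15, 2 + 3 * 20, 20),
                    row.names = c("case1", "case2", "case3"))

# Refit with one extra point at a time and record the change in slope
slope.change <- sapply(rownames(extra), function(case) {
  aug <- data.frame(X = c(X, extra[case, "X"]),
                    Y = c(Y, extra[case, "Y"]))
  unname(coef(lm(Y ~ X, data = aug))[2]) - base.slope
})
round(slope.change, 4)
```

Under these assumptions, the slope change for case3 is much larger in absolute value than for case1 or case2. In multiple regression the same comparison cannot be read off a single scatter plot, which is why numerical measures are needed.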
In multiple regression, we generally can not look at plots as shown above (too many dimensions). Therefore, we need to examine numerical measures that give information about a particular observation being outlying or not.

Residuals and semistudentized residuals (Chapters 1 and 3)

$e_i = Y_i - \hat{Y}_i$ and $e_i^* = \frac{e_i}{\sqrt{MSE}}$

Remember that MSE is not quite the estimated variance of $e_i$. Thus, $e_i^*$ is not quite a random variable with variance of 1.

Hat matrix (Chapters 5 and 6)

Remember

