IV. Section 14.4: Goodness-of-Fit Measures

A. The sample regression equation by itself does not tell us how well we can predict some outcome.
   Note: If all points fall on a regression line (or regression surface), then there is a perfect fit. However, this essentially never happens with real data, so we explore the fit.

B. Three goodness-of-fit measures
   1. Standard error of the estimate, $s_e$
   2. Coefficient of determination, $R^2$
   3. Adjusted $R^2$

C. Standard Error of the Estimate, $s_e$
   The standard error of the estimate measures the dispersion of the points around the estimated line. It measures the spread of the Y's around the $\hat{Y}$ values, that is, the variation in $e$ (or $e^2$).
   1. Let's first consider the VARIANCE of $e$.
      Remember: variance = MS = SS/df (a sum of squares divided by its degrees of freedom).
      Variance of error: $s_e^2 = \mathrm{MSE} = \dfrac{\mathrm{SSE}}{n-k-1}$, where $n$ is the number of observations, $k$ is the number of predictors, and $\mathrm{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum e_i^2$ measures the error in prediction using $\hat{Y}$.
      Remember: SSE is what we're trying to minimize by using OLS, knowing X.
   2. The standard error of the estimate: $s_e = \sqrt{\mathrm{MSE}} = \sqrt{\dfrac{\mathrm{SSE}}{n-k-1}}$
      If the points are all on the line, then the errors are zero, MSE = 0, and $s_e = 0$.
      Also note that $s_e$ for model 1 (one predictor) is smaller than $s_e$ for model 2 (two predictors), which means that model 1 has a better fit than model 2. This suggests that model 1, with 1 predictor, is better than model 2, with 2 predictors.

D. The Coefficient of Determination, $R^2$
   Rationale: We need to consider THREE sums of squares.
   1. SST (Sum of Squares Total)
      For any observation (i.e., Metro City), without knowledge of X (Median Income), what is the best estimate of Y (Debt)? Answer: the average of Y, $\bar{y}$.
      $\mathrm{SST} = \sum (y_i - \bar{y})^2$ = 388 182 462 (the numerator of the variance of Y)
      SST is a numeric index that measures the error in prediction by using the mean of Y, WITHOUT X (use $\bar{y}$).
   2. SSE (Sum of Squares Error)
      $\mathrm{SSE} = \sum (y_i - \hat{y}_i)^2$ = numerator of MSE = 96 045 553
      SSE is a numeric index that measures the error in prediction by using $\hat{Y}$, WITH X (use $\hat{y}$).
   3. SSR (Sum of Squares Regression)
      SSR is a numeric index that measures the improvement in prediction by using X.
      SSR = SST - SSE = 388 182 462 - 96 045 553 = 292 136 909
      Use this identity to get SSE: SST - SSR = SSE.
   4. Relative improvement: $R^2 = \dfrac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \dfrac{\mathrm{SSE}}{\mathrm{SST}}$
      a. $R^2$ ranges from 0 to 1, inclusive. Note: $R^2$ is positive for both positive and negative relationships.
      b. $R^2 = 1$ implies perfect prediction; that is, all points lie exactly on the estimated regression line.
      c. $R^2 = 0$ implies no predictive power.
      d. NOTE:

E. Adjusted $R^2$: a measure of fit adjusted for comparing models with different numbers of predictors
   1. Note that as the number of predictors $k$ increases, the value of $R^2$ never decreases: it will rise from .7526 until it reaches 1.0, even if the added predictors are poor.
   2. So $R^2$ is a biased measure of fit. Therefore, we must adjust the measure to eliminate the bias.
   3. $R^2_{adj}$ is a better estimate of the population coefficient of determination, $\rho^2$.
   4. Formula: $R^2_{adj} = 1 - (1 - R^2)\,\dfrac{n-1}{n-k-1}$

Summary of statistics covered in Chapter 14 (where to get SST, SSR, SSE): SSE = SST - SSR
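The three measures in this section can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names are hypothetical, the sums of squares are the digit groups printed in the notes (their decimal placement is an assumption, which does not affect R2 because R2 is a ratio), and the sample size n = 26 is a hypothetical value chosen only to demonstrate the adjustment, since the notes do not state n.

```python
import math


def r_squared(sst, sse):
    """Coefficient of determination: R^2 = SSR / SST = 1 - SSE / SST."""
    return 1.0 - sse / sst


def std_error_of_estimate(sse, n, k):
    """Standard error of the estimate: s_e = sqrt(MSE) = sqrt(SSE / (n - k - 1))."""
    return math.sqrt(sse / (n - k - 1))


def adjusted_r_squared(sst, sse, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared(sst, sse)) * (n - 1) / (n - k - 1)


# Sums of squares as printed in the notes (decimal placement assumed).
SST = 388_182_462
SSE = 96_045_553
SSR = SST - SSE  # improvement in prediction from using X

r2 = r_squared(SST, SSE)
print(SSR)           # 292136909
print(round(r2, 4))  # 0.7526

# Hypothetical sample size and predictor count, for illustration only.
n_assumed, k = 26, 1
r2_adj = adjusted_r_squared(SST, SSE, n_assumed, k)
print(r2_adj < r2)   # True: the adjustment penalizes R^2 whenever 0 < R^2 < 1

# Perfect fit: all points on the line, so SSE = 0 and s_e = 0.
print(std_error_of_estimate(0.0, n_assumed, k))  # 0.0
```

Because the adjusted measure divides each sum of squares by its degrees of freedom, adding a poor predictor (raising k with almost no drop in SSE) can lower adjusted R2 even though plain R2 cannot fall.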