PSU STAT 501 - Model Checking

This preview shows page 1-2-3-4-29-30-31-32-33-60-61-62-63 out of 63 pages.


Outline of slides
• Model Checking
• Where does this topic fit in?
• The simple linear regression model
• Why do we have to check our model?
• When should we worry most?
• What can go wrong with the model?
• The basic idea of residual analysis
• Distinction between true errors εi and residuals ei
• A residuals vs. fits plot
• Example: Alcoholism and muscle strength?
• A well-behaved residuals vs. fits plot
• Characteristics of a well-behaved residual vs. fits plot
• A residuals vs. predictor plot
• A residuals vs. predictor plot offering nothing new.
• Example: What are good predictors of blood pressure?
• Regression of BP on Age
• Regression of BP on Weight
• Regression of BP on Duration
• Residuals (age only) vs. weight plot
• Residuals (weight only) vs. age plot
• Residuals (age, weight) vs. duration plot
• How a non-linear function shows up on a residual vs. fits plot
• Example: A linear relationship between tread wear and mileage?
• Is tire tread wear linearly related to mileage?
• A residual vs. fits plot suggesting relationship is not linear
• How non-constant error variance shows up on a residual vs. fits plot
• Example: How is plutonium activity related to alpha particle counts?
• A residual vs. fits plot suggesting non-constant error variance
• How an outlier shows up on a residuals vs. fits plot
• Example: Relationship between tobacco use and alcohol use?
• A residual vs. fits plot suggesting an outlier exists
• One data point can greatly affect the value of R²
• How large does a residual need to be before being flagged?
• Standardized residuals vs. fits plot
• Minitab identifies observations with large standardized residuals
• Anscombe data set #3
• A residual vs. fits plot suggesting an outlier exists
• Residuals vs. order plot
• Normal random noise
• A time trend
• Positive serial correlation
• Negative serial correlation
• Normal (probability) plot of residuals
• Normal (probability) plot of residuals (cont’d)
• Normal (probability) plot
• Normal residuals
• Normal probability plot
• Normal residuals but with one outlier
• Skewed (positive) residuals
• Heavy-tailed residuals
• Residual plots in Minitab’s regression command
• Normal plots outside of Minitab’s regression command

Model Checking
Using residuals to check the validity of the linear regression model assumptions

Where does this topic fit in?
• Model formulation
• Model estimation
• Model evaluation
• Model use

The simple linear regression model
• The mean of the responses, E(Yi), is a linear function of the xi.
• The errors, εi, and hence the responses Yi, are independent.
• The errors, εi, and hence the responses Yi, are normally distributed.
• The errors, εi, and hence the responses Yi, have equal variances (σ²) for all x values.

The simple linear regression model
Assume (!!) response is linear function of trend and error:

    $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

with the independent error terms εi following a normal distribution with mean 0 and equal variance σ².

Why do we have to check our model?
• All estimates, intervals, and hypothesis tests have been developed assuming that the model is correct.
• If the model is incorrect, then the formulas and methods we use are at risk of being incorrect.

When should we worry most?
• All tests and intervals are very sensitive to
  – departures from independence.
  – moderate departures from equal variance.
• Tests and intervals for β0 and β1 are fairly robust against departures from normality.
• Prediction intervals are quite sensitive to departures from normality.

When should we worry most?
• The severity of the consequences is always related to the severity of the violation.
• Data analysis is a science (objective tools!) based on art (subjective decisions!).
• Worry just the right amount. Don’t overworry.

What can go wrong with the model?
• Regression function is not linear.
• Error terms are not independent.
• Error terms are not normal.
• Error terms do not have equal variance.
• The model fits all but one or a few outlier observations.
• An important predictor variable has been left out of the model.

The basic idea of residual analysis
The observed residuals:

    $e_i = y_i - \hat{y}_i$

should reflect the properties assumed for the unknown true error terms:

    $\varepsilon_i = Y_i - E(Y_i)$

So, investigate the observed residuals to see if they behave “properly.”

Distinction between true errors εi and residuals ei
[Figure: college entrance test score plotted against high school GPA, showing the true line $E(Y) = \beta_0 + \beta_1 x$ and the estimated line $\hat{y} = b_0 + b_1 x$]

A residuals vs. fits plot
• A scatter plot with residuals on the y axis and fitted values on the x axis.
• Helps to identify non-linearity, outliers, and non-constant variance.

Example: Alcoholism and muscle strength?
[Figure: Regression Plot of strength vs. alcohol]
strength = 26.3695 - 0.295868 alcohol
S = 3.87372   R-Sq = 41.2 %   R-Sq(adj) = 39.9 %

A well-behaved residuals vs. fits plot
[Figure: Residuals Versus the Fitted Values (response is strength); Residual vs. Fitted Value]
Residuals for model with strength as response and alcohol as predictor.

Characteristics of a well-behaved residual vs. fits plot
• The residuals “bounce randomly” around the 0 line. (Linear is reasonable).
• No one residual “stands out” from the basic random pattern of residuals. (No outliers).
• The residuals roughly form a “horizontal band” around 0 line. (Constant variance).

A residuals vs. predictor plot
• A scatter plot with residuals on the y axis and the values of a predictor on the x axis.
• If the predictor on the x axis is the same predictor used in model, offers nothing new.
• If the predictor on the x axis is a new and different predictor, can help to determine whether the predictor should be added to model.

A residuals vs. predictor plot offering nothing new.
[Figure: Residuals Versus alcohol (response is strength); Residual vs. alcohol]
Residuals for model with strength as response and alcohol as predictor.

Example: What are good predictors of blood pressure?
• n = 20 hypertensive individuals
• age = age of individual
• weight = weight of individual
• duration = years with high blood pressure

Regression of BP on Age
[Figure: Regression Plot of BP vs. Age]
BP = 44.4545 + 1.43098 Age
S = 4.19480   R-Sq = 43.4 %   R-Sq(adj) = 40.3 %

Regression of BP on Weight
[Figure: Regression Plot of BP vs. Weight]
BP = 2.20531 + 1.20093 Weight
S = 1.74050   R-Sq = 90.3 %   R-Sq(adj) = 89.7 %

Regression of BP on Duration
[Figure: Regression Plot of BP vs. Duration]
BP = 109.235 + 0.741063 Duration
S = 5.33322   R-Sq = 8.6 %   R-Sq(adj) = 3.5 %

Residuals (age only) vs. weight plot
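
The idea behind plotting residuals against a predictor can be sketched numerically. The Python below is only an illustration under stated assumptions: the data are synthetic (they are not the blood pressure data from the slides), and the helper names fit_line, residuals and correlation are hypothetical. It fits a one-predictor least-squares line, computes the residuals e_i = y_i - ŷ_i, and compares how they relate to the predictor already in the model and to one that was left out.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Observed residuals e_i = y_i - (b0 + b1 * x_i)."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def correlation(u, v):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    suu = sum((ui - u_bar) ** 2 for ui in u)
    svv = sum((vi - v_bar) ** 2 for vi in v)
    return suv / math.sqrt(suu * svv)


# Synthetic data (an assumption for illustration only):
# the response depends on BOTH x1 and x2, plus normal noise.
rng = random.Random(0)
n = 200
x1 = [rng.uniform(0, 10) for _ in range(n)]
x2 = [rng.uniform(0, 10) for _ in range(n)]
y = [5 + 2 * a + 3 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Fit the model with x1 only, so x2 plays the role of an omitted predictor.
b0, b1 = fit_line(x1, y)
e = residuals(x1, y, b0, b1)

# Least-squares residuals are uncorrelated with the predictor used in the fit,
# so this value is zero up to floating-point error.
r_same = correlation(e, x1)

# A strong relationship with a predictor not in the model hints that it
# may be worth adding.
r_new = correlation(e, x2)

print(f"corr(residuals, x1 in model) = {r_same:.3e}")
print(f"corr(residuals, x2 omitted)  = {r_new:.3f}")
```

In plot terms, r_same corresponds to residuals showing no linear trend against a predictor the model already uses, while a clear trend against x2 is the pattern that would suggest adding the new predictor. A correlation only captures linear trends, so it is a rough stand-in for looking at the plot itself.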

