UW-Madison STAT 572 - Checking Models


Checking Models

Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
April 3, 2008

Introduction

Summary of Chapter 8

Chapter 8 examines simulating data sets to test models.

In sections 8.1-8.2, the focus is on testing the statistical procedures when the true model is known. We do this by comparing estimated coefficients with known values.

In sections 8.3-8.4, the focus is on testing the goodness of fit of the model to the real data. We examine this by comparing simulated (fake) data to real data.

Fake Data Simulation

Checking Confidence Intervals

Do confidence intervals made by the estimate plus or minus two SEs contain the true value 95% of the time, as advertised, when the model is correct? We will examine this for models similar to the samara example and the runoff example. The procedure is as follows:

1. Select a true model and parameter values.
2. Generate a fake data set.
3. Estimate the model parameters using only the fake data.
4. Find confidence intervals for the estimates.
5. Repeat many times and compare to reference values.

Selecting a true model

Say we have two groups of individuals with sample sizes 4 × 30 = 120 and 3 × 30 = 90.
A response y has the same slope, 70, for both groups.
The A group has an intercept of -30.
The S group has an intercept of 70.
Both groups have x that is normal with a mean of 1.7 and a standard deviation of 0.3.
The error standard deviation is 10.

R Code for Generating Data

    make.fake <- function(n1 = 120, n2 = 90) {
      # predictor: normal with mean 1.7 and SD 0.3 in both groups
      x <- rnorm(n1 + n2, 1.7, 0.3)
      group <- factor(c(rep("A", n1), rep("S", n2)))
      # group-specific intercepts
      mu <- rep(NA, n1 + n2)
      mu[group == "A"] <- -30
      mu[group == "S"] <- 70
      # common slope of 70, error SD of 10
      y <- mu + 70 * x + rnorm(n1 + n2, 0, 10)
      return(data.frame(y = y, x = x, group = group))
    }

R Code to Check Intervals

    library(arm)  # provides se.coef()

    getIntervals <- function(fit) {
      estimates <- coef(fit)
      ses <- se.coef(fit)
      # row 1: lower limits, row 2: upper limits
      intervals <- rbind(estimates - 2 * ses, estimates + 2 * ses)
      return(intervals)
    }

Checking the Intervals

    check <- function(fit, true.estimates) {
      intervals <- getIntervals(fit)
      # TRUE where the true value lies inside the interval
      answer <- (true.estimates > intervals[1, ]) &
                (true.estimates < intervals[2, ])
      return(answer)
    }

    doChecks <- function(n, true.estimates) {
      answer <- matrix(NA, n, 3)
      for (i in 1:n) {
        fake <- make.fake()
        fit <- lm(y ~ x + group, data = fake)
        answer[i, ] <- check(fit, true.estimates)
      }
      return(answer)
    }

Numerical Estimates

Notice that the observed proportions of confidence intervals that contain the true values are very close to the theoretical value of 95%. The true coefficients passed to doChecks are the A intercept (-30), the slope (70), and the difference between the S and A intercepts (70 - (-30) = 100).

    > out <- doChecks(1000, c(-30, 70, 100))
    > apply(out, 2, sum) / 1000
    [1] 0.954 0.954 0.955

Residual Plots

We look for patterns in residual plots to check model goodness of fit. We can train our eye to distinguish normal variation from real patterns by generating data that fits a model and looking at the resulting residual plots.

We will illustrate this in R with the FEV data set: 654 kids with ages ranging from 3 to 19.
FEV is forced expiratory volume, a measure of lung capacity, and is measured in liters.
Age is measured in years.
Smoking has two levels: nonsmoker and smoker.

Consider a model to predict FEV with Age and Smoking as inputs, without any transformations.

The next plot shows a residual plot of the real data and residual plots for five simulated data sets from the fitted model. The real data is in the upper left. Notice that the real data shows a wedge shape consistent with nonconstant variance. The simulated data has normal error and fits the model.

[Figure: Non-transformed Data. Six panels of residuals versus fitted values: residuals(fev.lm1) against fitted(fev.lm1) for the real data, and residuals(fake.1) through residuals(fake.5) against fitted(fake.1) through fitted(fake.5) for the simulated data.]

Residual Plots for Transformed Data

The next plot shows a residual plot of the real data and residual plots for five simulated data sets from the fitted model. The real data is in the upper left. The real data has had log transformations applied to both the outcome, fev, and the predictor, age. The shape of the residual plot is quite similar to those of the simulated data sets.

[Figure: Log-transformed Data. Six panels of residuals versus fitted values: residuals(fev.lm2) against fitted(fev.lm2) for the real data, and residuals(fake2.1) through residuals(fake2.5) against fitted(fake2.1) through fitted(fake2.5) for the simulated data.]
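
The residual-plot comparison described above can be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the code behind the original figures: the FEV data set is not bundled with R, so `dat` is a synthetic stand-in with made-up coefficients, and `simulate()` from the stats package is used here as one convenient way to draw fake responses from a fitted `lm`.

```r
# Sketch: compare the real residual plot with residual plots of data
# simulated from the fitted model.
# ASSUMPTION: `dat` is synthetic stand-in data, NOT the real FEV data set.
set.seed(572)

n <- 654
age <- round(runif(n, 3, 19))
smoke <- factor(ifelse(age >= 10 & runif(n) < 0.2, "smoker", "nonsmoker"))
# Multiplicative error, so the spread grows with the mean on the raw scale.
fev <- exp(0.05 + 0.09 * age + rnorm(n, 0, 0.15))
dat <- data.frame(fev, age, smoke)

# Model without transformations, as in the text.
fit <- lm(fev ~ age + smoke, data = dat)

# simulate() draws new responses from the fitted model with normal errors;
# the result has one column per fake data set.
fake_y <- simulate(fit, nsim = 5)

# Refit the same model to each fake response.
fake_fits <- lapply(fake_y, function(y) {
  fake <- dat
  fake$fev <- y
  lm(fev ~ age + smoke, data = fake)
})

# Real data in the upper left, followed by the five fake data sets.
op <- par(mfrow = c(2, 3))
plot(fitted(fit), residuals(fit), main = "real data")
for (i in seq_along(fake_fits)) {
  plot(fitted(fake_fits[[i]]), residuals(fake_fits[[i]]),
       main = paste("fake", i))
}
par(op)
```

To repeat the comparison on the log scale, the same steps can be applied to a fit such as `lm(log(fev) ~ log(age) + smoke, data = dat)`, simulating from that fit instead.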

