Lecture 3: Inference in Simple Linear Regression
BMTRY 701: Biostatistical Methods II

Interpretation of the SLR model

Assumed model: E(Y) = \beta_0 + \beta_1 X.
The estimated regression model takes the form of a line: \hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X.

[Figure: SENIC data. Scatterplot of Length of Stay (days) versus Number of Beds, with the fitted line \hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X.]

Predicted values

For a given individual with covariate X_i:
  \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i
This is the fitted value for the ith individual. The fitted values fall on the regression line.

[Figure: SENIC data. The same scatterplot, with the fitted values marked as points on the regression line.]

SENIC data: R code

  plot(data$BEDS, data$LOS, xlab = "Number of Beds",
       ylab = "Length of Stay (days)", pch = 16)
  reg <- lm(data$LOS ~ data$BEDS)
  abline(reg, lwd = 2)
  yhat <- reg$fitted.values
  points(data$BEDS, yhat, pch = 16, col = 3)

  reg

  Call:
  lm(formula = data$LOS ~ data$BEDS)

  Coefficients:
  (Intercept)    data$BEDS
     8.625364     0.004057

Estimating fitted values

For a hospital with 200 beds, we can calculate the fitted value as 8.625 + 0.00406 * 200 = 9.44. For a hospital with 750 beds, the estimated fitted value is 8.625 + 0.00406 * 750 = 11.67.

Residuals

The residual is the difference between observed and fitted values, and it is individual-specific. Recall that E(\epsilon_i) = 0. The estimated residual is
  e_i = Y_i - \hat{Y}_i

[Figure: SENIC data. Scatterplot with the fitted line; the residuals are the vertical distances from each point to the line.]

R code

  # show what is stored in reg
  attributes(reg)
  # show what is stored in summary(reg)
  attributes(summary(reg))
  # obtain the regression coefficients
  reg$coefficients
  # obtain regression coefficients and other info pertaining to them
  summary(reg)$coefficients
  # obtain fitted values
  reg$fitted.values
  # obtain residuals
  reg$residuals
  # estimate the mean of the residuals
  mean(reg$residuals)

Making pretty pictures

You should plot your regression line: it will help you diagnose your model for potential problems.

  plot(data$BEDS, data$LOS, xlab = "Number of Beds",
       ylab = "Length of Stay (days)", pch = 16)
  reg <- lm(data$LOS ~ data$BEDS)
  abline(reg, lwd = 2)

A few properties of the regression line to note

- The sum of the residuals is 0.
- The sum of squared residuals is minimized (recall least squares).
- The sum of the fitted values equals the sum of the observed values.
- The regression line always goes through the point (\bar{X}, \bar{Y}).

Estimating the variance

Recall another parameter, \sigma^2: it represents the variance of the residuals. Recall what we know about estimating the variance of a variable from a single population:
  s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}
What would this look like for a corresponding regression model?

Residual variance estimation

The sum of squares for the residuals goes by several names: RSS (residual sum of squares) and SSE (sum of squares of errors, or error sum of squares):
  SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2

What do we divide by? In single-population estimation, why do we divide by n - 1? Here,
  s^2 = MSE = \frac{SSE}{n - 2} = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 2} = \frac{\sum_{i=1}^{n} e_i^2}{n - 2}
Why n - 2? We lose two degrees of freedom because two parameters (\beta_0 and \beta_1) were estimated to obtain the fitted values. MSE is the mean square error; the RSE (residual standard error) is its square root:
  RSE = \sqrt{MSE} = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - 2}}

Normal error regression

We make a new assumption about the distribution of the residuals:
  \epsilon_i \sim N(0, \sigma^2)
The model also assumes independence, which we had before. Often we say the residuals are iid: independent and identically distributed.

How is this different? We have now added probability to our model, and this allows another estimation approach.

Maximum likelihood

We estimate the parameters \beta_0, \beta_1, \sigma^2 using maximum likelihood instead of least squares. Recall that in least squares we minimized Q, the sum of squared deviations; in ML we maximize the likelihood function.

The likelihood function for SLR: taking a step back

Recall the pdf of the normal distribution:
  f(x_i \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
This is the probability density function for a random variable X. For a standard normal with mean 0 and variance 1:
  f(x_i \mid 0, 1) = \frac{1}{\sqrt{2\pi}} e^{-x_i^2 / 2}
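As a quick numeric check (a minimal sketch, not part of the slides), the formula above can be evaluated directly and compared against R's built-in density function dnorm():

  # evaluate the standard normal pdf by the formula and via dnorm()
  x <- seq(-4, 4, by = 0.01)
  manual <- (1 / sqrt(2 * pi)) * exp(-x^2 / 2)
  builtin <- dnorm(x)            # built-in N(0, 1) density
  max(abs(manual - builtin))     # ~0: the two agree
  dnorm(0)                       # peak height: 1/sqrt(2*pi), about 0.399

The peak height at x = 0, roughly 0.4, matches the curve shown below.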
[Figure: Standard Normal Curve. Plot of y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} for x from -4 to 4; the curve peaks near 0.4 at x = 0.]

The likelihood function for a normal variable

From the pdf we can write down the likelihood function. The likelihood is the product over the n observations of the pdfs:
  L(\mu, \sigma^2 \mid x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)

The likelihood function for SLR

What is normal for us? The residuals. What is E(\epsilon_i)? Zero. So
  L(\sigma^2 \mid x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{\epsilon_i^2}{2\sigma^2} \right)

Maximizing it

We need to maximize with respect to the parameters, but our likelihood is not written in terms of our parameters (at least, not all of them). Substituting \epsilon_i = Y_i - \beta_0 - \beta_1 X_i gives
  L(\beta_0, \beta_1, \sigma^2 \mid x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(Y_i - \beta_0 - \beta_1 X_i)^2}{2\sigma^2} \right)

Now what do we do with it? It is well known that maximizing a function can be achieved by maximizing its log. Why?

[Figure: plot of log(x) for x from 0 to 10, an increasing function.]

Log-likelihood

  l(\beta_0, \beta_1, \sigma^2 \mid x) = \log \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(Y_i - \beta_0 - \beta_1 X_i)^2}{2\sigma^2} \right)
    = \sum_{i=1}^{n} \log\left[ \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(Y_i - \beta_0 - \beta_1 X_i)^2}{2\sigma^2} \right) \right]

Still maximizing

How do we maximize a function with respect to several parameters? The same way that we minimize: we want to find values such that the first derivatives are zero (recall slope = 0):
- take derivatives with respect to each parameter (i.e., partial derivatives)
- set each partial derivative to 0
- solve simultaneously for each parameter estimate

This approach gives you estimates of \beta_0, \beta_1, \sigma^2: namely \hat{\beta}_0, \hat{\beta}_1, \hat{\sigma}^2. No more math on this; for details, see page 32, equations 1.29 through 1.30c. We call these estimates maximum likelihood estimates, a.k.a. MLEs.

The results

- The MLE for \beta_0 is the same as the estimate via least squares.
- The MLE for \beta_1 is the same as the estimate via least squares.
- The MLE for \sigma^2 is SSE/n, which differs slightly from the least squares based MSE = SSE/(n - 2).

So what is the point? Linear regression is a special case: for linear regression, the least squares and ML approaches give the same estimates of the regression coefficients. For later regression models (e.g., logistic, Poisson), the two approaches differ in their estimates. Going back to the LS estimates: what assumption did we make about the distribution of the residuals? None. LS has fewer assumptions than ML.

Going forward, we assume the normal error regression model.

The main interest: \beta_1

The slope is the focus of inferences. Why? If \beta_1 = 0, then there is no linear association between x and y. But there is more than that: it also implies no relation of ANY type. This is due to the model assumptions: constant variance, and equal means if \beta_1 = 0.

[Figure: Extreme example. A scatterplot where the fitted slope is 0 yet y clearly depends on x, illustrating what the assumptions rule out.]

Inferences about \beta_1

To make inferences about \beta_1, we need …
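To see the least squares / maximum likelihood equivalence described above numerically, here is a minimal R sketch (simulated data; the values and variable names are illustrative, not from the SENIC analysis) that maximizes the normal log-likelihood with optim() and compares the result to lm():

  # simulate data from a simple linear regression model
  set.seed(701)
  n <- 100
  x <- runif(n, 0, 800)
  y <- 8.6 + 0.004 * x + rnorm(n, sd = 1.7)

  # negative log-likelihood under the normal error model;
  # optimize over log(sigma) so that sigma stays positive
  negloglik <- function(par) {
    mu <- par[1] + par[2] * x
    -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
  }
  mle <- optim(c(mean(y), 0, log(sd(y))), negloglik)$par

  mle[1:2]          # ML estimates of beta0 and beta1
  coef(lm(y ~ x))   # least squares estimates: essentially identical
  exp(mle[3])^2     # MLE of sigma^2, i.e. SSE/n (not SSE/(n - 2))

The coefficient estimates from the two approaches agree to within the optimizer's tolerance; only the variance estimate differs, by its divisor.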

