Lecture 4: Inference in SLR (continued); Diagnostic approaches in SLR
BMTRY 701, Biostatistical Methods II

Outline:
- A little more in inference of β's
- Confidence Intervals
- SENIC data
- More meaningful
- What would this look like?
- SENIC data: 95% CI for slope
- Important implication
- A few comments r.e. inferences
- Spread of the X's
- R code
- Interval Estimation of Y's
- Mean versus individual
- Interval estimation
- Example
- Interval estimation
- Example
- Prediction
- Added variability in prediction intervals
- Prediction interval
- Revisit our example
- Nearer to the mean?
- Diagnostics
- Diagnostic Considerations via Residuals
- Several flavors of residual plots
- Classic diagnostic tool: residual plot
- What can you see from here?
- Residuals vs. X
- Normality of Residuals
- Constant variance assumption
- Another Approach for Constant Variance Test
- Lack of linear fit
- Example of lack of linear fit
- Curvature in the model?
- Independence?
- Residuals by Region
- Adjust for Region
- Adjust for Region (continued)
- So what do you think?

A little more in inference of β's

Confidence interval for β1: this follows easily from the discussion of the t-test. Recall the sampling distribution for the slope:

    β̂1 ~ N( β1, σ²{β̂1} )

From this, the 95% CI follows:

    β̂1 ± t(0.975; n−2) · σ̂{β̂1}

Confidence Intervals

More generally,

    β̂1 ± t(1−α/2; n−2) · σ̂{β̂1}

and the same approach is used for the intercept (if you would care to):

    β̂0 ± t(1−α/2; n−2) · σ̂{β̂0}

SENIC data

> reg <- lm(data$LOS ~ data$BEDS)
> summary(reg)$coefficients
               Estimate  Std. Error   t value     Pr(>|t|)
(Intercept) 8.625364302 0.272058856 31.704038 1.851535e-57
data$BEDS   0.004056636 0.000858405  4.725782 6.765452e-06
> qt(0.975, 111)
[1] 1.981567

95% CI for β1: 0.00406 ± 1.98 × 0.000858 = {0.00236, 0.00576}

More meaningful

What about the difference in LOS for a 100-bed difference between hospitals? Go back to the sampling distribution; for a 100-unit difference:

    β̂1 ~ N( β1, σ²{β̂1} )
    100·β̂1 ~ N( 100·β1, (100)² · σ²{β̂1} )

So that implies that the CI takes the form

    100·β̂1 ± t(1−α/2; n−2) · 100·σ̂{β̂1}

Hence, simply multiply the 95% CI limits by 100:

95% CI for 100·β1: 100 × 0.00406 ± 1.98 × 100 × 0.000858 = {0.236, 0.576}

What would this look like?

Recall that the regression line always goes through the means of X and Y. We can add the 95% CI limits for the slope to our scatterplot by finding, for each limiting slope, the intercept that makes the line go through the means.

> mean(data$LOS)
[1] 9.648319
> mean(data$BEDS)
[1] 252.1681
# use these as x and y values. then, use each
# of the slopes to find corresponding intercepts
> abline(8.198, 0.00576, lty=2)
> abline(9.055, 0.00236, lty=2)

SENIC data: 95% CI for slope

[Figure: scatterplot of Length of Stay (days) versus Number of Beds, with the fitted line and the two dashed lines at the 95% CI limits of the slope]

Important implication

The slope and intercept are NOT independent. Notice what happens to the intercept if we increase the slope. What happens if we decrease the slope?

> attributes(summary(reg))
$names
 [1] "call"          "terms"         "residuals"     "coefficients"
 [5] "aliased"       "sigma"         "df"            "r.squared"
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
$class
[1] "summary.lm"
> summary(reg)$cov.unscaled
              (Intercept)     data$BEDS
(Intercept)  2.411664e-02 -6.054327e-05
data$BEDS   -6.054327e-05  2.400909e-07
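A quick numeric check of two results above, sketched in Python (the slides themselves work in R); every number below is copied from the summary(reg), cov.unscaled, and qt(0.975, 111) output:

```python
import math

# (a) 95% CI for the slope, and for a 100-bed difference
b1, se_b1 = 0.004056636, 0.000858405   # from summary(reg)$coefficients
t_star = 1.981567                      # qt(0.975, 111)
lo, hi = b1 - t_star * se_b1, b1 + t_star * se_b1
print(round(lo, 5), round(hi, 5))              # 0.00236 0.00576, as on the slide
print(round(100 * lo, 3), round(100 * hi, 3))  # 0.236 0.576

# (b) slope and intercept are NOT independent: their correlation comes
# straight from cov.unscaled (the sigma^2 scaling cancels out)
u00, u01, u11 = 2.411664e-02, -6.054327e-05, 2.400909e-07
corr = u01 / math.sqrt(u00 * u11)
print(round(corr, 2))                  # about -0.8: strongly negative
```

The strongly negative correlation is exactly the slide's point: tilt the line steeper and, to keep it passing through (X̄, Ȳ), the intercept must drop.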
A few comments r.e. inferences

- We assume Y|X ~ Normal. If this is "seriously" violated, our inferences may not be valid.
- But, no surprise, a large sample size will save us: the slope and intercept sampling distributions are asymptotically normal.

Spread of the X's

Recall the estimated variance of the slope (the squared standard error):

    σ̂²{β̂1} = σ̂² / Σ (Xi − X̄)²

- What happens to the standard error when the spread of the X's is narrow?
- What happens to the standard error when the spread of the X's is wide?
- (Note: the intercept is similarly susceptible.)

R code

# simulate data
x1 <- runif(100, 0, 10)
x2 <- runif(100, 3, 7)
e <- rnorm(100, 0, 3)
y1 <- 2 + 1.5*x1 + e
y2 <- 2 + 1.5*x2 + e
plot(x1, y1)
points(x2, y2, col=2)

# fit regression models
reg1 <- lm(y1 ~ x1)
reg2 <- lm(y2 ~ x2)
abline(reg1)
abline(reg2, col=2)

# compare standard errors
summary(reg1)$coefficients
summary(reg2)$coefficients

[Figure: scatterplot of y1 versus x1, with the (x2, y2) points and both fitted lines overlaid]

Interval Estimation of Y's

Recall the model:

    E(Y) = β0 + β1·X

We might be interested in estimation of a predicted Y. This means, for example, "What is the true mean LOS when the number of beds is 400?" It does NOT mean "What is the value of LOS when the number of beds is 400?"

Mean versus individual

Keep it straight: it can be confusing. Using previous results,

    Ŷj ~ N( E(Yj), σ²{Ŷj} ),  where  σ²{Ŷj} = σ² · [ 1/n + (Xj − X̄)² / Σi (Xi − X̄)² ]

Interval estimation

- Normality: follows from the residuals, slope, and intercept being normal.
- Mean: easily shown by substituting in the slope and intercept.
- Variance: a little more detail
  • variability depends on the distance of X from the mean of X
  • recall the plots of the 95% CIs
  • variation in the slope has greater impact at the extremes of X than in the middle
  • we substitute our estimate of the MSE, and then we have a t-distribution

Example:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.6253643  0.2720589  31.704  < 2e-16 ***
data$BEDS   0.0040566  0.0008584   4.726 6.77e-06 ***
---
Residual standard error: 1.752 on 111 degrees of freedom

> mean(data$BEDS)
[1] 252.1681
> sum( (data$BEDS - mean(data$BEDS))^2 )
[1] 4165090

    σ̂²{Ŷj} = 1.752² · [ 1/113 + (400 − 252.2)² / 4165090 ] = 0.0433

Interval estimation

Use our standard confidence interval approach:

    Ŷj ± t(1−α/2; n−2) · σ̂{Ŷj}

Note that what differs is the way the standard error is calculated. Otherwise, all of these tests and intervals follow the same pattern.

Example:

    Ŷj = 8.63 + 0.00406 × 400 = 10.254
    σ̂²{Ŷj} = 0.0433, so σ̂{Ŷj} = 0.208
    95% CI: Ŷj ± 1.98 · σ̂{Ŷj} = {9.842, 10.67}

[Figure: scatterplot of Length of Stay (days) versus Number of Beds, with the fitted regression line]

Prediction

We'd like to know what to expect for NEW observations. Example: if we added another hospital with 400 beds to our dataset, what is the likely observed mean LOS for that hospital? This is a "prediction interval". Intuition:
- we are making inference about an individual hospital, not the mean of all hospitals
- it should be wider than the confidence interval for the mean of Y|X

Prediction

Can be
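The "Example" interval above for the mean LOS at 400 beds can be reproduced numerically. A minimal Python sketch (values copied from the R output; the CI step deliberately uses the slide's rounded inputs):

```python
import math

# Reproduce sigma-hat^2{Y-hat_j} for X = 400 beds, using the quantities
# reported in the R output above (n = 113, MSE from s = 1.752).
n = 113
mse = 1.752 ** 2        # (residual standard error)^2
xbar = 252.1681         # mean(data$BEDS)
sxx = 4165090           # sum((data$BEDS - mean(data$BEDS))^2)

x_new = 400
var_mean = mse * (1 / n + (x_new - xbar) ** 2 / sxx)
print(round(var_mean, 4))           # 0.0433, matching the slide

# CI with the slide's rounded inputs: Y-hat = 8.63 + 0.00406*400 = 10.254
y_hat = 8.63 + 0.00406 * x_new
se_mean = 0.208                     # sqrt(0.0433), rounded as on the slide
lo, hi = y_hat - 1.98 * se_mean, y_hat + 1.98 * se_mean
print(round(lo, 3), round(hi, 2))   # {9.842, 10.67}
```

In R this is what predict(reg, newdata, interval = "confidence") computes directly; the sketch just exposes the arithmetic.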
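The deck is truncated here, but the "added variability in prediction intervals" listed in the outline follows the standard SLR formula: the prediction variance adds one whole MSE on top of the variance of the estimated mean. A sketch under that assumption, continuing the 400-bed example; the resulting interval is a computed illustration, not a number quoted on the slides:

```python
import math

# Prediction interval for a NEW hospital with 400 beds, assuming the
# standard SLR prediction variance: sigma^2 * (1 + 1/n + (x - xbar)^2 / Sxx).
# All inputs are taken from the R output earlier in the lecture.
n = 113
mse = 1.752 ** 2
xbar = 252.1681
sxx = 4165090
b0, b1 = 8.625364302, 0.004056636
t_star = 1.981567                    # qt(0.975, 111)

x_new = 400
y_hat = b0 + b1 * x_new
var_mean = mse * (1 / n + (x_new - xbar) ** 2 / sxx)   # mean of Y|X
var_pred = var_mean + mse                              # extra MSE for one new Y
se_mean, se_pred = math.sqrt(var_mean), math.sqrt(var_pred)

ci = (y_hat - t_star * se_mean, y_hat + t_star * se_mean)
pi = (y_hat - t_star * se_pred, y_hat + t_star * se_pred)
print(round(se_mean, 3), round(se_pred, 3))   # 0.208 vs. about 1.764
print(pi)                                     # much wider than the CI
```

This makes the slide's intuition concrete: because one new observation carries its own error term, the prediction interval is far wider than the confidence interval for the mean.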