Unformatted text preview:

Lecture 7: Multiple Linear Regression Interpretation with different types of predictorsInterpreting regression coefficientsBinary covariatesSlide 4How is MEDSCHL related to LOS?How is MEDSCHOOL related to LOS?R codeScatterplot? Residual Plot?Fitted Values“Linear” regression?MLR example: Add infection risk to our modelWhat about more than two categories?LOS ~ REGIONInterpretationhypothesis tests?RInterpretingMake REGION=4 the referenceR code: recoding so last category is referenceQuite a few differences:But the “model” is the sameDiagnosticsSlide 23Next type: polynomialsScatterplotFitting the modelHow does it fit?Another approach to the same dataSmoothingSlide 30Inference?Implementing a splineHow to interpret?RInterpreting the outputWhy subtract the 250 in defining nurse.star?Why a spline vs. the quadratic?Next timeLecture 7:Multiple Linear RegressionInterpretation with different types of predictorsBMTRY 701Biostatistical Methods IIInterpreting regression coefficientsSo far, we’ve considered continuous covariatesCovariates can take other forms:•binary•nominal categorical•quadratics (or other transforms)•interactionsInterpretations may vary depending on the nature of your covariateBinary covariatesConsidered ‘qualitative’The ordering of numeric assignments does not matterExamples: MEDSCHL: 1 = yes; 2 = noMore popular examples:•gender•mutation•pre vs. post-menopausal•Two age categoriesHow is MEDSCHL related to LOS?How to interpret β1?Coding of variables:•2 vs. 1•i prefer 1 vs. 0•difference? the intercept.Let’s make a new variable: •MS = 1 if MEDSCHL = 1 (yes)•MS = 0 if MEDSCHL = 2 (no)iiieMEDSCHLLOS 10How is MEDSCHOOL related to LOS?What does β1 mean?Same, yet different interpretation as continuousWhat if we had used old coding?101001010101]1|[0]0|[][iiiiiieiiMSLOSEMSLOSEMSLOSEeMSLOS101010101022]2|[1]1|[iiiieiiMEDSCHLLOSEMEDSCHLLOSEeMEDSCHLLOSR code> data$ms <- ifelse(data$MEDSCHL==2,0,data$MEDSCHL)> table(data$ms, data$MEDSCHL) 1 2 0 0 96 1 17 0> > reg <- lm(LOS ~ ms, data=data)> summary(reg)Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 9.4105 0.1871 50.290 < 2e-16 ***ms 1.5807 0.4824 3.276 0.00140 ** ---Residual standard error: 1.833 on 111 degrees of freedomMultiple R-squared: 0.08818, Adjusted R-squared: 0.07997 F-statistic: 10.73 on 1 and 111 DF, p-value: 0.001404Scatterplot? Residual Plot?0.0 0.2 0.4 0.6 0.8 1.0-2 0 2 4 6 8 10data$msresres <- reg$residualsplot(data$ms, res)abline(h=0)0.0 0.2 0.4 0.6 0.8 1.08 10 12 14 16 18 20data$msdata$LOSFitted ValuesOnly two fitted values:Diagnostic plots are not as informativeExtrapolation and Interpolation are meaningless!We can estimate LOS for MS=0.5•LOS = 9.41 + 1.58*0.5 = 10.20•Try to interpret the result…99.1058.141.9]1|ˆ[41.9]0|ˆ[100MSYMSY“Linear” regression?But, what about ‘linear’ assumption?•we still need to adhere to the model assumptions•recall that they relate to the residuals primarily•residuals are independent and identically distributed:The model is still linear in the parameters!),0(~2NiMLR example: Add infection risk to our model> reg <- lm(LOS ~ ms + INFRISK, data=data)> summary(reg)Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 6.4547 0.5146 12.542 <2e-16 ***ms 0.9717 0.4316 2.251 0.0263 * INFRISK 0.6998 0.1156 6.054 2e-08 ***---Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1.595 on 110 degrees of freedomMultiple R-squared: 0.3161, Adjusted R-squared: 0.3036 F-statistic: 25.42 on 2 and 110 DF, p-value: 8.42e-10 How does interpretation change?What about more than two categories?We looked briefly at region a few lectures back.How to interpret?You need to define a reference categoryFor med school:•reference was ms=0•almost ‘subconscious’ with only two categoriesWith >2 categories, need to be careful of interpretationLOS ~ REGIONNote how ‘indicator’ or ‘dummy’ variable is defined:•I(condition) = 1 if condition is true•I(condition) = 0 if condition is falseiiiiieRIRIRILOS  )4()3()2(3210Interpretationβ0 = β1 =β2 =β3 =3020100]4|[]3|[]2|[]1|[iiiiiiiiRLOSERLOSERLOSERLOSEiiiiieRIRIRILOS  )4()3()2(3210hypothesis tests?Ho: β1 = 0Ha: β1 ≠ 0What does that test (in words)?Ho: β2 = 0Ha: β2 ≠ 0What does that test (in words)?What if we want to test region, in general?One of our next topics!R > reg <- lm(LOS ~ factor(REGION), data=data)> summary(reg)Call:lm(formula = LOS ~ factor(REGION), data = data)Residuals: Min 1Q Median 3Q Max -3.05893 -1.03135 -0.02344 0.68107 8.47107 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 11.0889 0.3165 35.040 < 2e-16 ***factor(REGION)2 -1.4055 0.4333 -3.243 0.00157 ** factor(REGION)3 -1.8976 0.4194 -4.524 1.55e-05 ***factor(REGION)4 -2.9752 0.5248 -5.669 1.19e-07 ***---Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 1.675 on 109 degrees of freedomMultiple R-squared: 0.2531, Adjusted R-squared: 0.2325 F-statistic: 12.31 on 3 and 109 DF, p-value: 5.376e-07InterpretingIs mean LOS different in region 2 vs. 1?What about region 3 vs. 1 and 4 vs. 1?What about region 4 vs. 3?How to test that?Two options?•recode the data so that 3 or 4 is the reference•use knowledge about the variance of linear combinations to estimate the p-value for the difference in the coefficientsFor now…we’ll focus on the first.Make REGION=4 the referenceOur model then changes:iiiiieRIRIRILOS  )1()2()3(32100102030]4|[]3|[]2|[]1|[iiiiiiiiRLOSERLOSERLOSERLOSER code: recoding so last category is reference> data$rev.region <- factor(data$REGION, levels=rev(sort(unique(data$REGION))) )> reg <- lm(LOS ~ rev.region, data=data)>


View Full Document

MUSC BMTRY 701 - lect7

Documents in this Course
lect3

lect3

38 pages

lect9

lect9

28 pages

lect18

lect18

17 pages

lect1

lect1

51 pages

lect12

lect12

24 pages

lect9

lect9

29 pages

lect11

lect11

25 pages

lect13

lect13

40 pages

lect22

lect22

12 pages

lect10

lect10

40 pages

lect15

lect15

23 pages

lect14

lect14

47 pages

lect13

lect13

32 pages

lect12

lect12

24 pages

lecture18

lecture18

48 pages

lect17

lect17

29 pages

lect4

lect4

50 pages

lect4

lect4

48 pages

lect16

lect16

27 pages

lect8

lect8

20 pages

Load more
Download lect7
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view lect7 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view lect7 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?