Lecture 8: Multiple Linear Regression
Interpretation with different types of predictors
BMTRY 701 Biostatistical Methods II

Interaction
• AKA effect modification
• Allows the association between two variables to differ across levels of a third variable.
• Example: In the model with length of stay as an outcome, is there an interaction between medschool and nurse?
• Note that ‘adjustment’ is a rather weak form of accounting for a variable.
• Allowing an interaction gives the model much greater flexibility.

Interactions
Interactions can be formed between
• two continuous variables
• a binary and a continuous variable
• two binary variables
• a binary variable and a categorical variable with >2 categories
• etc.
Three-way interaction: interaction between 3 variables
Four-way, etc.

Example: log(LOS) ~ INFRISK*MS
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 \mathrm{INFRISK}_i + \beta_2 \mathrm{MS}_i + \beta_3 \mathrm{INFRISK}_i \times \mathrm{MS}_i + e_i \]
\[ E[\log(\mathrm{LOS}_i) \mid \mathrm{MS}_i = 0] = \beta_0 + \beta_1 \mathrm{INFRISK}_i \]
\[ E[\log(\mathrm{LOS}_i) \mid \mathrm{MS}_i = 1] = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) \mathrm{INFRISK}_i \]

How does this differ from the model without the interaction? Without the adjustment?
Model 1:
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 \mathrm{INFRISK}_i + e_i \]
Model 2:
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 \mathrm{INFRISK}_i + \beta_2 \mathrm{MS}_i + e_i \]
Model 3:
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 \mathrm{INFRISK}_i + \beta_2 \mathrm{MS}_i + \beta_3 \mathrm{INFRISK}_i \times \mathrm{MS}_i + e_i \]

Model 1
> # Scatterplot; note that the y values are on the log scale (logLOS)
> plot(data$INFRISK, data$logLOS, xlab="Infection Risk, %",
+      ylab="Length of Stay, days", pch=16, cex=1.5)
>
> # Model 1:
> reg1 <- lm(logLOS ~ INFRISK, data=data)
> abline(reg1, lwd=2)
> summary(reg1)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.93250    0.04794  40.310  < 2e-16 ***
INFRISK      0.07293    0.01053   6.929 2.92e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1494 on 111 degrees of freedom
Multiple R-Squared: 0.302, Adjusted R-squared: 0.2957
F-statistic: 48.02 on 1 and 111 DF, p-value: 2.918e-10

Model 1
[Figure: logLOS (axis labeled "Length of Stay, days", 2.0 to 3.0) versus Infection Risk, % (2 to 8), with the fitted Model 1 line]

Model 2
> # Model 2:
> reg2 <- lm(logLOS ~ INFRISK + ms, data=data)
> infriski <- seq(1,8,0.1)
> beta <- reg2$coefficients
> # Fitted lines for ms=0 and ms=1: same slope, intercepts differ by beta[3]
> yhat0 <- beta[1] + beta[2]*infriski
> yhat1 <- beta[1] + beta[2]*infriski + beta[3]
> lines(infriski, yhat0, lwd=2, col=2)
> lines(infriski, yhat1, lwd=2, col=2)
> summary(reg2)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.94449    0.04709  41.295  < 2e-16 ***
INFRISK      0.06677    0.01058   6.313 5.91e-09 ***
ms           0.09882    0.03949   2.503   0.0138 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1459 on 110 degrees of freedom
Multiple R-Squared: 0.3396, Adjusted R-squared: 0.3276
F-statistic: 28.28 on 2 and 110 DF, p-value: 1.232e-10

Model 2
[Figure: logLOS (axis labeled "Length of Stay, days", 2.0 to 3.0) versus Infection Risk, % (2 to 8), with the fitted Model 2 lines added]

Model 3
> # Model 3:
> reg3 <- lm(logLOS ~ INFRISK + ms + ms:INFRISK, data=data)
> infriski <- seq(1,8,0.1)
> beta <- reg3$coefficients
> # Fitted lines for ms=0 and ms=1: both intercept and slope may differ
> yhat0 <- beta[1] + beta[2]*infriski
> yhat1 <- beta[1] + beta[3] + (beta[2]+beta[4])*infriski
> lines(infriski, yhat0, lwd=2, col=4)
> lines(infriski, yhat1, lwd=2, col=4)
> summary(reg3)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.947942   0.049698  39.195  < 2e-16 ***
INFRISK     0.065950   0.011220   5.878  4.6e-08 ***
ms          0.059514   0.178622   0.333    0.740
INFRISK:ms  0.007856   0.034807   0.226    0.822
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1466 on 109 degrees of freedom
Multiple R-Squared: 0.3399, Adjusted R-squared: 0.3217
F-statistic: 18.71 on 3 and 109 DF, p-value: 7.35e-10

Model 3
[Figure: logLOS (axis labeled "Length of Stay, days", 2.0 to 3.0) versus Infection Risk, % (2 to 8), with the fitted Model 3 lines added]

Conclusions
• There does not appear to be an interaction between MEDSCHOOL and INFRISK (p = 0.82 for the interaction term in Model 3).
• Both MEDSCHOOL and INFRISK are associated with log(LOS), in the presence of each other.
• The association between INFRISK and log(LOS) is positive: for a 1 percentage point increase in infection risk, logLOS is expected to increase by about 0.07 (0.067 in Model 2), adjusting for Med School affiliation.
• Hospitals with Med School affiliation tend to have longer average length of stay, adjusting for infection risk.

Interactions with continuous variables
How to interpret with continuous variables?
Example: Difference between two hospitals with a 1% difference in INFRISK
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 \mathrm{INFRISK}_i + \beta_2 \mathrm{NURSE}_i + \beta_3 \mathrm{INFRISK}_i \times \mathrm{NURSE}_i + e_i \]
Two hospitals with the same NURSE, one with INFRISK one unit higher:
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 \mathrm{INFRISK}_i + \beta_2 \mathrm{NURSE}_i + \beta_3 \mathrm{INFRISK}_i \times \mathrm{NURSE}_i + e_i \]
\[ \log(\mathrm{LOS}_i) = \beta_0 + \beta_1 (\mathrm{INFRISK}_i + 1) + \beta_2 \mathrm{NURSE}_i + \beta_3 (\mathrm{INFRISK}_i + 1) \times \mathrm{NURSE}_i + e_i \]
\[ \text{Difference (in expected } \log \mathrm{LOS}) = \beta_1 + \beta_3 \mathrm{NURSE}_i \]

Interaction with continuous variables
> reg4 <- lm(logLOS ~ INFRISK*NURSE, data=data)
> summary(reg4)

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.067e+00  6.642e-02  31.120  < 2e-16 ***
INFRISK       3.164e-02  1.586e-02   1.995  0.04853 *
NURSE        -1.025e-03  4.657e-04  -2.201  0.02986 *
INFRISK:NURSE 2.696e-04  9.727e-05   2.771  0.00657 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1427 on 109 degrees of freedom
Multiple R-Squared: 0.3739, Adjusted R-squared: 0.3567
F-statistic: 21.7 on 3 and 109 DF, p-value: 4.284e-11

[Figure: data$logLOS (2.0 to 3.0) versus data$INFRISK (2 to 8), with fitted lines for NURSE=100 and NURSE=300]

Interaction interpretation
[Figure: change in logLOS for 1% change in INFRISK (0.04 to 0.16) versus NURSE (0 to 600)]
From reg4, the estimated change is 0.0316 + 0.00027 × NURSE: about 0.059 at NURSE=100 and about 0.113 at NURSE=300.

Interactions between categorical variables
• Simple with two binary variables
• More complicated to keep track of when one or more of the variables has more than two categories
• Example: REGION and MEDSCHOOL
• Question: Is there an interaction between REGION and MEDSCHOOL with regard to logLOS?
• That is: does the association between MEDSCHOOL and logLOS differ by REGION?

Interpreting
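To make the bookkeeping for the REGION by MEDSCHOOL question concrete, the sketch below shows which coefficients R would create for a binary-by-categorical interaction. It uses no hospital data, only a grid of predictor combinations. The column names ms and REGION, and the assumption that REGION has four levels coded 1 to 4, are illustrative and not confirmed by the slides.

```r
# Sketch only: no hospital data are used. The grid lists every combination
# of ms (0/1) and REGION, ASSUMING REGION has four levels coded 1 to 4.
grid <- expand.grid(ms = 0:1, REGION = 1:4)

# Design matrix R would build for a model such as logLOS ~ ms * factor(REGION)
# (default treatment contrasts, so REGION 1 is the reference level)
X <- model.matrix(~ ms * factor(REGION), data = grid)
print(colnames(X))

# Columns whose names contain ":" are the interaction terms:
# one for each non-reference region
interaction_cols <- grep(":", colnames(X), value = TRUE)
print(length(interaction_cols))
```

Under this coding, the ms coefficient is the Med School difference in logLOS within the reference region, and each ms:factor(REGION) coefficient is how much that difference changes in the corresponding region. With the real data, one possible way to test all the interaction terms jointly would be anova() comparing the additive fit, lm(logLOS ~ ms + factor(REGION), data=data), against the interaction fit.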