Lecture 11: Multicollinearity
BMTRY 701 Biostatistical Methods II

Multicollinearity Introduction
Some common questions we ask in MLR:
• What is the relative importance of the effects of the different covariates?
• What is the magnitude of the effect of a given covariate on the response?
• Can any covariate be dropped from the model because it has little or no effect on the outcome?
• Should any covariates not yet included in the model be considered for possible inclusion?

Easy answers?
If the candidate covariates are uncorrelated with one another: yes, these are simple questions.
If the candidate covariates are correlated with one another: no, these are not easy.
Most commonly:
• observational studies have correlated covariates
• we need to adjust for these when assessing relationships
• “adjusting” for confounders
Experimental designs?
• less problematic
• patients are randomized in common designs
• no confounding exists because factors are ‘balanced’ across arms

Multicollinearity
Also called “intercorrelation.”
Refers to the situation in which the covariates are related to each other as well as to the outcome of interest.
It is much like confounding; the statistical term emphasizes the effects that correlation among covariates has on regression modeling.

No Multicollinearity Example: Mouse experiment

Mouse  Dose A  Dose B  Diet  Tumor size
  1      100     25     0        45
  2      200     25     0        56
  3      300     25     0        25
  4      100     50     0        15
  5      200     50     0        17
  6      300     50     0        10
  7      100     25     1        30
  8      200     25     1        28
  9      300     25     1        20
 10      100     50     1        10
 11      200     50     1         5
 12      300     50     1         3

Linear modeling
We are interested in seeing which factors influence tumor size in mice.
Notice that the experiment is perfectly balanced: each combination of Dose A, Dose B, and Diet occurs exactly once, so the three predictors are mutually uncorrelated.
What does that mean for the regressions below?
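For readers following along in R, here is a minimal sketch that recreates the dataset from the table above. The column names are chosen to match the lm() output reproduced on the next slides.

# Recreate the mouse data from the table above; column names match
# the lm() output shown on the following slides.
data <- data.frame(
  Mouse      = 1:12,
  Dose.A     = rep(c(100, 200, 300), times = 4),
  Dose.B     = rep(rep(c(25, 50), each = 3), times = 2),
  Diet       = rep(c(0, 1), each = 6),
  Tumor.size = c(45, 56, 25, 15, 17, 10, 30, 28, 20, 10, 5, 3)
)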
Dose of Drug A on Tumor
> reg.a <- lm(Tumor.size ~ Dose.A, data=data)
> summary(reg.a)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  32.50000   12.29041   2.644   0.0246 *
Dose.A       -0.05250    0.05689  -0.923   0.3779
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 16.09 on 10 degrees of freedom
Multiple R-Squared: 0.07847, Adjusted R-squared: -0.01368
F-statistic: 0.8515 on 1 and 10 DF, p-value: 0.3779

Dose of Drug B on Tumor
> reg.b <- lm(Tumor.size ~ Dose.B, data=data)
> summary(reg.b)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  58.0000     9.4956   6.108 0.000114 ***
Dose.B       -0.9600     0.2402  -3.996 0.002533 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.4 on 10 degrees of freedom
Multiple R-Squared: 0.6149, Adjusted R-squared: 0.5764
F-statistic: 15.97 on 1 and 10 DF, p-value: 0.002533

Diet on Tumor
> reg.diet <- lm(Tumor.size ~ Diet, data=data)
> summary(reg.diet)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   28.000      6.296   4.448  0.00124 **
Diet         -12.000      8.903  -1.348  0.20745
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 15.42 on 10 degrees of freedom
Multiple R-Squared: 0.1537, Adjusted R-squared: 0.06911
F-statistic: 1.817 on 1 and 10 DF, p-value: 0.2075

All in the model together
> reg.all <- lm(Tumor.size ~ Dose.A + Dose.B + Diet, data=data)
> summary(reg.all)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  74.50000    8.72108   8.543 2.71e-05 ***
Dose.A       -0.05250    0.02591  -2.027 0.077264 .
Dose.B       -0.96000    0.16921  -5.673 0.000469 ***
Diet        -12.00000    4.23035  -2.837 0.021925 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.327 on 8 degrees of freedom
Multiple R-Squared: 0.8472, Adjusted R-squared: 0.7898
F-statistic: 14.78 on 3 and 8 DF, p-value: 0.001258

Correlation matrix of predictors and outcome
> cor(data[,-1])
               Dose.A     Dose.B       Diet Tumor.size
Dose.A      1.0000000  0.0000000  0.0000000 -0.2801245
Dose.B      0.0000000  1.0000000  0.0000000 -0.7841853
Diet        0.0000000  0.0000000  1.0000000 -0.3920927
Tumor.size -0.2801245 -0.7841853 -0.3920927  1.0000000

Result
For perfectly balanced designs, adjusting does not affect the coefficients.
However, it can affect the significance. Why?
• the residual sum of squares is affected
• if you explain more of the variance in the outcome, less is left to chance/error
• when you additionally adjust for another factor that is related to the outcome, you will likely improve the significance

The other extreme: perfect collinearity

Mouse  Dose A  Dose C  Diet  Tumor size
  1      100     100    0        45
  2      200     300    0        56
  3      300     500    0        25
  4      100     100    0        15
  5      200     300    0        17
  6      300     500    0        10
  7      100     100    1        30
  8      200     300    1        28
  9      300     500    1        20
 10      100     100    1        10
 11      200     300    1         5
 12      300     500    1         3

Here Dose C = 2 × Dose A − 100, an exact linear function of Dose A.

The model has infinitely many solutions
Too much flexibility. What happens?
The fitting algorithm usually gives you some indication of this (a short R sketch at the end of these notes shows R's behavior):
• it will not fit the model and gives an error, or
• it drops one of the predictors
“perfectly collinear” = “perfect confounding”

Effects of Multicollinearity
Most common result:
• two covariates are each independently associated with Y in simple linear regression models
• in an MLR model with both covariates, one or both is insignificant
• the magnitude of the regression coefficients is attenuated
Why? Recall the adjusted variable plot: if the two covariates are related, removing the systematic part of one from Y may leave too little variation for the other to explain.

Effects of Multicollinearity
Other situations:
• neither is significant alone, but both are significant together (somewhat rare)
• both are significant alone, and both retain significance in the model
• the regression coefficient for one of the covariates may change direction
• the magnitude of a coefficient may increase (in absolute value)
It is usually hard to predict exactly what will happen when both are in the model.

Implications in inference
The usual interpretation of a regression coefficient, as the change in the expected response per unit increase in that covariate with all other covariates held constant, is not fully applicable when the covariates are highly correlated: correlated covariates do not vary independently of one another in the data.
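To see these implications concretely, here is a small simulation sketch (not from the original slides; the names x1, x2, y and the correlation of about 0.95 are illustrative choices). Each predictor is strongly significant on its own, but jointly the standard errors inflate.

# Illustrative simulation: two highly correlated predictors, both truly
# related to y. Compare the standard errors in the simple and joint fits.
set.seed(701)
n  <- 100
x1 <- rnorm(n)
x2 <- 0.95 * x1 + sqrt(1 - 0.95^2) * rnorm(n)  # cor(x1, x2) is about 0.95
y  <- 1 + 0.5 * x1 + 0.5 * x2 + rnorm(n)

summary(lm(y ~ x1))$coefficients        # significant alone
summary(lm(y ~ x2))$coefficients        # significant alone
summary(lm(y ~ x1 + x2))$coefficients   # inflated SEs; significance may vanish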
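Returning to the perfect-collinearity table shown earlier: a sketch of what R actually does when Dose C is an exact linear function of Dose A (the code below simply re-keys that table). By default, lm() fits the model but reports NA for the aliased predictor rather than stopping with an error, and alias() identifies the exact linear dependency.

# Re-enter the second table: Dose.C = 2*Dose.A - 100 exactly
data2 <- data.frame(
  Dose.A     = rep(c(100, 200, 300), times = 4),
  Dose.C     = rep(c(100, 300, 500), times = 4),
  Diet       = rep(c(0, 1), each = 6),
  Tumor.size = c(45, 56, 25, 15, 17, 10, 30, 28, 20, 10, 5, 3)
)

reg.c <- lm(Tumor.size ~ Dose.A + Dose.C + Diet, data = data2)
summary(reg.c)   # Dose.C is NA: "1 not defined because of singularities"
alias(reg.c)     # shows Dose.C as a linear combination of the other terms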