
Lecture 8: ANOVA tables and F-tests
BMTRY 701: Biostatistical Methods II

Outline: ANOVA; Total, Regression, and Error Deviations; Definitions; Example: logLOS ~ BEDS; Degrees of Freedom; Mean Squares; Standard ANOVA Table; ANOVA for logLOS ~ BEDS; Inference?; Implications; F-test; Implementing the F-test; More interesting: MLR; General F testing approach; Example of 'nested' models; Testing: Models must be nested!; R; Testing more than two covariates; Testing multiple coefficients simultaneously

ANOVA

Analysis of Variance
Similar in derivation to the ANOVA that generalizes the two-sample t-test
Partitioning of variance into several parts
• that due to the 'model': SSR
• that due to 'error': SSE
The sum of the two parts is the total sum of squares: SST

Total Deviations: $Y_i - \bar{Y}$
[Figure: scatterplot of data$logLOS (vertical axis, 2.0 to 3.0) against data$BEDS (horizontal axis, 0 to 800)]

Regression Deviations: $\hat{Y}_i - \bar{Y}$
[Figure: same scatterplot of data$logLOS against data$BEDS]

Error Deviations: $Y_i - \hat{Y}_i$
[Figure: same scatterplot of data$logLOS against data$BEDS]

Definitions

$$SST = \sum (Y_i - \bar{Y})^2$$
$$SSR = \sum (\hat{Y}_i - \bar{Y})^2$$
$$SSE = \sum (Y_i - \hat{Y}_i)^2$$
$$SST = SSR + SSE$$
$$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$$

Example: logLOS ~ BEDS

    > ybar <- mean(data$logLOS)
    > yhati <- reg$fitted.values
    > sst <- sum((data$logLOS - ybar)^2)
    > ssr <- sum((yhati - ybar)^2)
    > sse <- sum((data$logLOS - yhati)^2)
    >
    > sst
    [1] 3.547454
    > ssr
    [1] 0.6401715
    > sse
    [1] 2.907282
    > sse+ssr
    [1] 3.547454

Degrees of Freedom

Degrees of freedom for SST: n - 1
• one df is lost because it is used to estimate the mean of Y
Degrees of freedom for SSR: 1
• only one df because all estimates are based on the same fitted regression line
Degrees of freedom for SSE: n - 2
• two are lost due to estimating the regression line (slope and intercept)

Mean Squares

"Scaled" version of the Sum of Squares
Mean Square = SS/df
MSR = SSR/1
MSE = SSE/(n-2)
Notes:
• mean squares are not additive! That is, MSR + MSE ≠ SST/(n-1)
• MSE is the same as we saw previously

Standard ANOVA Table

    Source       SS    df     MS
    Regression   SSR   1      MSR
    Error        SSE   n-2    MSE
    Total        SST   n-1

ANOVA for logLOS ~ BEDS

    > anova(reg)
    Analysis of Variance Table

    Response: logLOS
               Df  Sum Sq Mean Sq F value    Pr(>F)
    BEDS        1 0.64017 0.64017  24.442 2.737e-06 ***
    Residuals 111 2.90728 0.02619

Inference?

What is of interest and how do we interpret it?
We'd like to know if BEDS is related to logLOS.
How do we do that using the ANOVA table?
We need to know the expected values of the MSR and MSE:

$$E(MSE) = \sigma^2$$
$$E(MSR) = \sigma^2 + \beta_1^2 \sum (X_i - \bar{X})^2$$

Implications

The mean of the sampling distribution of MSE is σ² regardless of whether or not β1 = 0
If β1 = 0, E(MSE) = E(MSR)
If β1 ≠ 0, E(MSE) < E(MSR)
To test the significance of β1, we can test whether MSR and MSE are of the same magnitude.

F-test

Derived naturally from the arguments just made
Hypotheses:
• H0: β1 = 0
• H1: β1 ≠ 0
Test statistic: F* = MSR/MSE
Based on the earlier argument, we expect F* > 1 if H1 is true.
This implies a one-sided test.

F-test (continued)

The distribution of F* under the null has two sets of degrees of freedom
• numerator degrees of freedom
• denominator degrees of freedom
These correspond to the df shown in the ANOVA table
• numerator df = 1
• denominator df = n-2
The test is based on the distribution of F* under H0:

$$F^* = \frac{MSR}{MSE} \sim F(1, n-2)$$

Implementing the F-test

The decision rule
If F* > F(1-α; 1, n-2), then reject H0
If F* ≤ F(1-α; 1, n-2), then fail to reject H0

ANOVA for logLOS ~ BEDS

    > anova(reg)
    Analysis of Variance Table

    Response: logLOS
               Df  Sum Sq Mean Sq F value    Pr(>F)
    BEDS        1 0.64017 0.64017  24.442 2.737e-06 ***
    Residuals 111 2.90728 0.02619
    > qf(0.95, 1, 111)
    [1] 3.926607
    > 1-pf(24.44,1,111)
    [1] 2.739016e-06

More interesting: MLR

You can test that several coefficients are zero at the same time
Otherwise, the F-test gives the same result as a t-test
That is: for testing the significance of ONE covariate in a linear regression model, an F-test and a t-test give the same result:
• H0: β1 = 0
• H1: β1 ≠ 0

General F testing approach

The previous test seems simple
It is in this case, but it can be generalized to be more useful
Imagine a more general test:
• H0: small model
• Ha: large model
Constraint: the small model must be 'nested' in the large model
That is, the small model must be a 'subset' of the large model

Example of 'nested' models

Model 1: $LOS_i = \beta_0 + \beta_1 \text{INFRISK} + \beta_2 \text{MS} + \beta_3 \text{NURSE} + \beta_4 \text{NURSE}^2 + e_i$
Model 2: $LOS_i = \beta_0 + \beta_1 \text{INFRISK} + \beta_3 \text{NURSE} + \beta_4 \text{NURSE}^2 + e_i$
Model 3: $LOS_i = \beta_0 + \beta_1 \text{INFRISK} + \beta_2 \text{MS} + e_i$

Models 2 and 3 are nested in Model 1
Model 2 is not nested in Model 3
Model 3 is not nested in Model 2

Testing: Models must be nested!

To test Model 1 vs. Model 2
• we are testing that β2 = 0
• H0: β2 = 0 vs. Ha: β2 ≠ 0
• If we fail to reject the null hypothesis, we have no evidence that β2 ≠ 0 and we prefer the simpler Model 2 to Model 1

Model 1: $LOS_i = \beta_0 + \beta_1 \text{INFRISK} + \beta_2 \text{MS} + \beta_3 \text{NURSE} + \beta_4 \text{NURSE}^2 + e_i$
Model 2: $LOS_i = \beta_0 + \beta_1 \text{INFRISK} + \beta_3 \text{NURSE} + \beta_4 \text{NURSE}^2 + e_i$

R

    reg1 <- lm(LOS ~ INFRISK + ms + NURSE + nurse2, data=data)
    reg2 <- lm(LOS ~ INFRISK + NURSE + nurse2, data=data)
    reg3 <- lm(LOS ~ INFRISK + ms, data=data)

    > anova(reg1)
    Analysis of Variance Table

    Response: LOS
               Df  Sum Sq Mean Sq F value    Pr(>F)
    INFRISK     1 116.446 116.446 45.4043 8.115e-10 ***
    ms          1  12.897  12.897  5.0288   0.02697 *
    NURSE       1   1.097   1.097  0.4277   0.51449
    nurse2      1   1.789   1.789  0.6976   0.40543
    Residuals 108 276.981   2.565
    ---

R

    > anova(reg2)
    Analysis of Variance Table

    Response: LOS
               Df  Sum Sq Mean Sq F value    Pr(>F)
    INFRISK     1 116.446 116.446 44.8865 9.507e-10 ***
    NURSE       1   8.212   8.212  3.1653     0.078 .
    nurse2      1   1.782   1.782  0.6870     0.409
    Residuals 109 282.771   2.594
    ---

    > anova(reg1, reg2)
    Analysis of Variance Table

    Model 1: LOS ~ INFRISK + ms + NURSE + nurse2
    Model 2: LOS ~ INFRISK + NURSE + nurse2
      Res.Df     RSS Df Sum of Sq      F Pr(>F)
    1    108 276.981
    2    109 282.771 -1    -5.789 2.2574 0.1359

R

    > summary(reg1)
    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  6.355e+00  5.266e-01  12.068  < 2e-16 ***
    INFRISK      6.289e-01  1.339e-01   4.696 7.86e-06 ***
    ms           7.829e-01  5.211e-01   1.502    0.136
    NURSE        4.136e-03  4.093e-03   1.010    0.315
    nurse2      -5.676e-06  6.796e-06
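
The agreement between the F-test from anova(reg1, reg2) and the t-test for ms in summary(reg1) can be checked by hand. The sketch below uses the standard general linear test statistic; the symbols SSE(R), SSE(F), df_R, and df_F (reduced model reg2, full model reg1) are notation assumed here for illustration, not taken from the slides. The inputs are the rounded values printed by R, so the result matches the reported F = 2.2574 only up to rounding.

```latex
\begin{align*}
F^* &= \frac{\bigl[SSE(R) - SSE(F)\bigr] / (df_R - df_F)}{SSE(F) / df_F} \\
    &= \frac{(282.771 - 276.981) / (109 - 108)}{276.981 / 108} \\
    &= \frac{5.790}{2.5646} \approx 2.26 \\[4pt]
t^2 &= (1.502)^2 \approx 2.256
\end{align*}
```

Under H0: β2 = 0, F* is compared to an F(1, 108) distribution. Because only one coefficient is being tested, F* equals the square of the t statistic for ms, which is why both tests report a p-value of about 0.136.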


MUSC BMTRY 701 - lect9
