Topic 22 - ANOVA (II)

Topic 22 – Inference of Means in a One-Way ANOVA When Variance is Constant for All Treatments

Once we have rejected the null hypothesis that all means are equal, and we have checked the assumptions of the testing procedure, we usually wish to do some specific tests that can elucidate the relationships among the means. Note how this differs from regression. There, we assumed the relationship among the means was linear, so the only test of interest was of the slope. Here we made no such assumption, mainly because X is categorical and so regression would make no sense. Instead, we want to do specific comparisons between the levels of X. These are variously called multiple comparisons tests, contrasts, or tests of linear combinations of means.

A priori Hypotheses: hypotheses about population means that are decided during the planning of the experiment and prior to any data analysis. They are the reason for performing the experiment!

A posteriori Hypotheses: hypotheses generated as a result of looking at the data after the experiment has been performed. Also called data snooping or data dredging. This is almost ALWAYS inappropriate and to be avoided. The only valid reason for doing so is as an exploratory analysis that will guide future experimentation.

Example (a posteriori testing): suppose a 1-way ANOVA is performed and the results are obtained. The analyst looks over the results and decides to test 2 means because they appear to be very different (e.g. the smallest and largest ones). Now, the effect could be due to a real difference in population means, or to random variation in sampling that makes them appear different. Investigating only comparisons for which the effect appears large leads to a true confidence level for a conclusion that is lower than the stated confidence level. In other words, you are more likely to reject H0 (no difference) than the stated level implies. It can be shown that the actual confidence is only 60% (!) when 6 levels are used in an experiment and the statistical analysis always includes testing the difference between the largest and smallest means using a stated 95% confidence. Note also that the treatments compared each time need not be the same ones, since the largest and smallest means could belong to different treatments from one experiment to the next.

There are times when it is possible to do a posteriori testing, BUT the statistical method needs to be modified appropriately to account for the data snooping (see later).

1) Estimation of a Treatment Mean

The population mean for the ith treatment, $\mu_i$, is estimated using the sample mean

$\hat{\mu}_i = \bar{y}_{i\cdot}$

with a standard error of

$SE(\bar{y}_{i\cdot}) = \sqrt{\dfrac{MSE}{n_i}}$

Under our assumptions of normality and random sampling, the (1–α)100% Confidence Interval of the ith population mean is

$\bar{y}_{i\cdot} \pm t_{\alpha/2,\,N-t}\; SE(\bar{y}_{i\cdot})$

where $t_{\alpha/2,\,N-t}$ is the critical value for the upper tail of a t-distribution on N – t df.

Hypothesis testing of a single mean against a constant ($\mu_0$) is done using a t-test, as is usual for a single population mean.

2) Estimation of the Difference Between 2 Means

The unbiased estimator of the difference between 2 population means, $D = \mu_i - \mu_k$, is

$\hat{D}_{ik} = \bar{y}_{i\cdot} - \bar{y}_{k\cdot}$

which has a standard error of

$SE(\hat{D}_{ik}) = \sqrt{MSE\left(\dfrac{1}{n_i} + \dfrac{1}{n_k}\right)}$

assuming the variances are homogeneous (which we did assume and checked, of course!).

Under our assumptions of normality and random sampling, a (1–α)100% Confidence Interval of the difference of two population means ($\mu_i - \mu_k$) is

$\hat{D}_{ik} \pm t_{\alpha/2,\,N-t}\; SE(\hat{D}_{ik})$

where $t_{\alpha/2,\,N-t}$ is the critical value for the upper tail of a t-distribution on N – t degrees of freedom.

Again, hypothesis testing is done using the t-test for two independent samples that we reviewed earlier this semester.

EXAMPLE: Rehabilitation Therapy. A researcher is interested in the relationship between physical fitness in persons prior to knee surgery and the time required in physical therapy after surgery to obtain successful rehabilitation.
Twenty-four male subjects with a similar type of knee surgery during the past year were randomly selected from the patient records at the rehabilitation center. The number of days required for successful rehabilitation and prior physical fitness status were recorded for each patient. The patients were categorized into one of three levels of prior fitness (below average, average, above average). The hypotheses of interest are: 1) the mean time to recovery will differ among the three groups; 2) the above average fitness group will have a shorter recovery period than either the below average or average group; and 3) the average group will have a shorter recovery than the below average group. In other words:

1) H0: $\mu_{below} = \mu_{average} = \mu_{above}$
   HA: at least one mean differs

2) H0: $\mu_{above} = \mu_{average}$
   HA: $\mu_{above} < \mu_{average}$

   H0: $\mu_{above} = \mu_{below}$
   HA: $\mu_{above} < \mu_{below}$

3) H0: $\mu_{average} = \mu_{below}$
   HA: $\mu_{average} < \mu_{below}$

The SAS code and output for analyzing the dataset are:

    data fitness;
      input prior_fit $ recovery;
      datalines;
    below 29
    below 42
    below 38
    below 40
    below 43
    below 40
    below 30
    below 42
    average 30
    average 35
    average 39
    average 28
    average 31
    average 31
    average 29
    average 35
    above 26
    above 32
    above 21
    above 20
    above 23
    above 22
    above 25
    above 23
    ;

    proc boxplot;
      plot recovery*prior_fit;
    quit;

    proc glm data=fitness;
      class prior_fit;
      model recovery = prior_fit;
      lsmeans prior_fit / pdiff; * the pdiff option does pair-wise tests of means;
    quit;

The output from Proc Boxplot:

[Figure: box plots of recovery (vertical axis, 20 to 45) by prior_fit (below, average, above)]

The ANOVA table is:

    Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model      2           792.333     396.166667      20.42    <.0001
    Error     21           407.500      19.404762
    CTotal    23          1199.833

So, we reject the null hypothesis that the means are all equal, i.e. there is sufficient evidence that at least one treatment mean differs from the others. But this is not conclusive until we check the assumptions.
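As a cross-check on the SAS output, the ANOVA table and the interval formulas from sections 1) and 2) can be worked through by hand. The following is a minimal Python sketch, not part of the original SAS analysis. The data are copied from the datalines above; the helper names `mean_ci` and `diff_ci` are hypothetical, and the critical value `T_CRIT = 2.080` is an assumption read from a standard t table for N – t = 21 df at α = 0.05 (two-sided).

```python
from math import sqrt

# Recovery times (days) by prior fitness level, copied from the SAS datalines.
recovery = {
    "below":   [29, 42, 38, 40, 43, 40, 30, 42],
    "average": [30, 35, 39, 28, 31, 31, 29, 35],
    "above":   [26, 32, 21, 20, 23, 22, 25, 23],
}

N = sum(len(y) for y in recovery.values())  # total number of observations
t = len(recovery)                           # number of treatments
means = {g: sum(y) / len(y) for g, y in recovery.items()}
grand_mean = sum(sum(y) for y in recovery.values()) / N

# Sums of squares for the one-way ANOVA table.
ss_model = sum(len(y) * (means[g] - grand_mean) ** 2 for g, y in recovery.items())
ss_error = sum((v - means[g]) ** 2 for g, y in recovery.items() for v in y)
mse = ss_error / (N - t)
f_value = (ss_model / (t - 1)) / mse

# ASSUMPTION: upper 2.5% point of a t-distribution on 21 df, from a t table.
T_CRIT = 2.080


def mean_ci(group):
    """95% CI for one treatment mean: ybar_i +/- t * sqrt(MSE / n_i)."""
    se = sqrt(mse / len(recovery[group]))
    return means[group] - T_CRIT * se, means[group] + T_CRIT * se


def diff_ci(i, k):
    """Estimate, SE and 95% CI for mu_i - mu_k using the pooled MSE."""
    d_hat = means[i] - means[k]
    se = sqrt(mse * (1 / len(recovery[i]) + 1 / len(recovery[k])))
    return d_hat, se, (d_hat - T_CRIT * se, d_hat + T_CRIT * se)


print(f"SS model = {ss_model:.3f}, SS error = {ss_error:.3f}")
print(f"MSE = {mse:.6f}, F = {f_value:.2f}")
for g in recovery:
    lo, hi = mean_ci(g)
    print(f"{g:8s} mean = {means[g]:6.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
for i, k in [("above", "average"), ("above", "below"), ("average", "below")]:
    d_hat, se, (lo, hi) = diff_ci(i, k)
    print(f"{i} - {k}: D = {d_hat:.2f}, SE = {se:.3f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The hypotheses in 2) and 3) are one-sided, so these two-sided intervals are descriptive only and do not replace the t-tests; they are also unadjusted for the number of comparisons made.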
Before analyzing the means let’s quickly check the assumptions (SAS code not shown above):

[Figure: residuals (resid, vertical axis, -9 to 8) plotted against fitted values (yhat, horizontal axis, 24 to 38)]

    Tests for Normality
    Test            Statistic         p Value
    Shapiro-Wilk    W    0.98123      Pr < W    0.9174
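Since the SAS code for the assumption checks is not shown, here is one way the quantities behind the residual plot could be computed. This is a hedged Python sketch using only the standard library, not the author's code: in a one-way ANOVA the fitted value for each observation is its group mean, and the residual is the observation minus that mean. The per-group sample variances are added as a rough look at the constant-variance assumption; the Shapiro-Wilk statistic itself is not recomputed here.

```python
# Recovery times (days) by prior fitness level, copied from the SAS datalines.
recovery = {
    "below":   [29, 42, 38, 40, 43, 40, 30, 42],
    "average": [30, 35, 39, 28, 31, 31, 29, 35],
    "above":   [26, 32, 21, 20, 23, 22, 25, 23],
}

# Fitted value (yhat) for every observation in a group is the group mean.
fitted = {g: sum(y) / len(y) for g, y in recovery.items()}

# (yhat, resid) pairs: the points that make up a residual-vs-fitted plot.
points = [(fitted[g], v - fitted[g]) for g, y in recovery.items() for v in y]
residuals = [r for _, r in points]

# Sample variance of each group, for an informal check of equal spread.
group_var = {
    g: sum((v - fitted[g]) ** 2 for v in y) / (len(y) - 1)
    for g, y in recovery.items()
}

print("fitted values:", sorted(set(fitted.values())))
print("residual range:", min(residuals), "to", max(residuals))
for g, var in group_var.items():
    print(f"{g:8s} sample variance = {var:.2f}")
```

The residual range and the fitted values produced this way can be compared with the axes of the residual plot above.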