Topic 17 - ANOVA (I)

Topic 17 - Inferences for Means from More than 2 Independent Populations

Examples:
1) Suppose we want to compare species diversity of microfauna in three different habitats: desert, caves, and arctic tundra.
   a. Hypotheses and inferences are related to determining whether there are differences in mean diversity among the three habitats and, if so, determining how they differ.
2) Suppose we want to compare the average growth rate of oysters at each of 4 levels of salinity.
   a. The hypothesis might be that average growth rate increases as salinity level increases.

Defn: A Factor is the variable of interest. It separates the experimental units into their respective populations.

Defn: A Treatment is one level of the Factor under study. If more than one factor is of interest, then a treatment is a combination of levels of the factors.

Example: Oyster experiment.
Factor:
Treatments (levels of Factor):
Populations under Study (correspond to the treatments):

Completely Randomized Designs (CRD)

Defn: Completely Randomized Design is an experimental design in which the experimental units are either randomly selected from each of the populations or are randomly assigned to one of the populations.

Defn: Observational Study is one in which we cannot control the type of treatment performed on the experimental units.

Defn: Planned Experiment is one in which the type of treatment is randomly allocated or assigned to each experimental unit.

For example, suppose we are interested in the effect of three types of cover species on water clarity. Two approaches:
1) Find locations with one of the three cover species and measure water clarity at those locations.
2) Construct "ponds" and fill each with one of the three cover species. Leave alone for a while and then measure water clarity.
Which is better and why?

Assumptions of the CRD:
1) Sampling:
   a. For observational studies, random samples are taken from each of the populations of interest.
   b. For planned experiments, the treatments are randomly assigned to the randomly chosen experimental units (the objects on which the experiment is to be performed). Here, the populations refer to conceptual ones in which there is one population for each of the treatments in the experiment.
   c. Samples are independent.
   Example 2: independent sampling here would mean that oysters were randomly selected for the experiment (no clumps of oysters were taken and then separated, oysters were taken from different locations, oysters were not selected by size, etc.) and further that, if the experiment was planned, the oysters were randomly assigned treatment levels.
2) Homogeneous Variance: we shall assume that the populations of interest all have the same variability, i.e. they all have the same variance (this can be relaxed, but that is for a later time).
3) Approximate Normality: we assume that each population is normally distributed (this can be relaxed, but again that is for a later time).

Example 1: three habitats (desert, caves, arctic tundra). The variable of interest is species diversity, Y. The experimental unit might be a sq. km randomly selected from a map showing the spatial extent of the habitat. The population for a habitat is the species diversity values for every possible sq. km in the habitat within the larger geographic region under study. Hence, there are three populations (one for each habitat).
By assumption,
1) Species diversity in each habitat has a normal distribution.
2) The three populations (habitats) have the same variance, that is, $\sigma^2_{\text{desert}} = \sigma^2_{\text{caves}} = \sigma^2_{\text{arctic}} = \sigma^2_{\varepsilon}$.

Our interest is in testing whether the means of these three populations differ; that is, our claim is that either $\mu_{\text{desert}} \neq \mu_{\text{caves}} \neq \mu_{\text{arctic}}$ or at least some subset of the means is not equal.

Hypothesis:
$H_0$: $\mu_{\text{desert}} = \mu_{\text{caves}} = \mu_{\text{arctic}}$
$H_A$: at least one mean differs from the others

In a picture, one possible version of the alternative hypothesis might look like:

The populations under the null hypothesis would look like:

Defn: A One-Way Analysis of Variance (1-way ANOVA or AOV) is the statistical method for testing and comparing means from 2 or more independent populations when the response variable is continuous.

Notation
$t$: the number of populations of interest (also the number of treatments)
$n_i$: sample size for the ith treatment or population, $i = 1, 2, \ldots, t$
$N = \sum_{i=1}^{t} n_i$: the total sample size
$y_{ij}$: observed value for the jth experimental unit sampled from the ith population, $j = 1, 2, \ldots, n_i$ and $i = 1, 2, \ldots, t$
$\bar{y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}$: the mean of the ith sample
$\bar{y}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{t} \sum_{j=1}^{n_i} y_{ij}$: the overall mean of the combined samples
$MSE = \frac{\sum_{i=1}^{t} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2}{N - t}$: the sample estimator of the variance $\sigma^2_{\varepsilon}$, called the Mean Squared Error

Example 8.2: Compare three methods for reducing hostility levels in students known to have a certain level of hostility. A total of twenty-four students were randomly assigned to one of the three methods. Method 1 was assigned to eight students, method 2 to seven students, and method 3 to nine students. After treatment, each student was given a test and the scores were recorded.
The data are:

Method                               1            2            3
                                 y11 = 96     y21 = 77     y31 = 66
                                 y12 = 79     y22 = 76     y32 = 73
                                 y13 = 91     y23 = 74     y33 = 69
                                 y14 = 85     y24 = 73     y34 = 66
                                 y15 = 83     y25 = 78     y35 = 77
                                 y16 = 91     y26 = 71     y36 = 73
                                 y17 = 82     y27 = 80     y37 = 71
                                 y18 = 87                  y38 = 70
                                                           y39 = 74
Sum                                 694          529          639
Sample mean $\bar{y}_{i\cdot}$     86.75        75.57        71.00
Sample size                       n1 = 8       n2 = 7       n3 = 9

Experimental unit:
Treatments:
Factor:
t =
N =

$\bar{y}_{\cdot\cdot} = \frac{694 + 529 + 639}{8 + 7 + 9} = 77.583$

Model: $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij} = \mu_i + \varepsilon_{ij}$, where
• $\mu$ is the overall (grand) mean,
• $\mu_i$ is the ith treatment mean,
• $\alpha_i$ ($= \mu_i - \mu$) is the deviation of the ith treatment mean from the overall mean, and
• $\varepsilon_{ij}$ ($= Y_{ij} - \mu_i$) is called the error term, i.e. it is the deviation of the jth observation, $Y_{ij}$, from the ith treatment mean.
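As a check on the arithmetic in Example 8.2, the following minimal Python sketch recomputes the treatment sums, the sample means, the overall mean, and the Mean Squared Error directly from the scores. It is only an illustration of the formulas in the Notation section, not part of the original notes. The variable names (`scores`, `group_mean`, `effect_hat`, and so on) are hypothetical, and using the sample quantities $\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}$ as estimates of the $\alpha_i$ in the model is an assumption made here for illustration.

```python
# Scores from Example 8.2, keyed by method (treatment) number.
scores = {
    1: [96, 79, 91, 85, 83, 91, 82, 87],
    2: [77, 76, 74, 73, 78, 71, 80],
    3: [66, 73, 69, 66, 77, 73, 71, 70, 74],
}

t = len(scores)                                  # number of treatments
n = {i: len(ys) for i, ys in scores.items()}     # n_i, sample size per treatment
N = sum(n.values())                              # total sample size

# Treatment sums and sample means (y-bar_i.)
group_sum = {i: sum(ys) for i, ys in scores.items()}
group_mean = {i: group_sum[i] / n[i] for i in scores}

# Overall mean of the combined samples (y-bar_..)
grand_mean = sum(group_sum.values()) / N

# MSE: squared deviations of each score from its own treatment mean,
# summed over all treatments and divided by N - t.
sse = sum((y - group_mean[i]) ** 2 for i, ys in scores.items() for y in ys)
mse = sse / (N - t)

# Assumed plug-in estimates of the treatment deviations alpha_i = mu_i - mu.
effect_hat = {i: group_mean[i] - grand_mean for i in scores}

for i in scores:
    print(f"method {i}: n = {n[i]}, sum = {group_sum[i]}, "
          f"mean = {group_mean[i]:.2f}, effect estimate = {effect_hat[i]:+.2f}")
print(f"N = {N}, overall mean = {grand_mean:.3f}, MSE = {mse:.4f}")
```

The sums and means printed by this sketch can be compared line by line with the Sum and Sample mean rows of the data table, and the overall mean with the value 77.583 given above.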