Course: Analysis of Variance
Topic: Intro to ANOVA

INTRODUCTION TO ANALYSIS OF VARIANCE

Analysis of variance (ANOVA) is one of the most frequently used data analysis techniques. ANOVA can be used to test whether two or more populations differ in terms of an interval or ratio scale variable. When we have only two groups, ANOVA and the t test produce the same results (i.e., $F = t^2$). ANOVA is very flexible: it enables the researcher to test for group differences while controlling for the effects of a continuous variable (Analysis of Covariance), and it can simultaneously test the main and interactive effects of two or more variables (Factorial ANOVA).

We will spend the semester discussing the details of ANOVA. Before jumping into the details, however, we will begin with a conceptual overview. There are actually two approaches to conceptualizing ANOVA. The first approach is the traditional variance partitioning strategy, which you likely covered in an undergraduate statistics course. The second approach is the model comparison perspective, which is used by our text and is considered more flexible in that it treats ANOVA as a variant of the general linear model (GLM). The GLM can be used for ANOVA in addition to a number of other statistical procedures, such as multiple regression. Keep in mind that the variance partitioning and model comparison approaches are two different perspectives on the same technique, and consequently both approaches produce the same results. We'll begin with an overview of both approaches, and then we'll examine how to operationalize these abstract ideas with computer software.

THE VARIANCE PARTITIONING APPROACH

Let's start with an example to facilitate our discussion. Pretend we are interested in testing the efficacy of two treatments for depression (Smiling Therapy and Exercise Therapy). The table below presents the post-treatment depression scores (higher numbers indicate greater depression) for nine depressed patients who were randomly assigned to one of three
treatments (no treatment, smiling therapy, exercise therapy).

Treatment           Scores     Mean    SD
No Therapy          7, 7, 6    6.67    0.58
Smiling Therapy     2, 1, 1    1.33    0.58
Exercise Therapy    3, 1, 2    2.00    1.00

The variance partitioning approach to ANOVA partitions the variability in the dependent variable (e.g., depression) into two additive components: between-treatment variation and within-treatment variation. The between-treatment variation accounts for variance among groups (i.e., it appears as if the treatments differ in terms of their average depression scores). The within-treatment variation accounts for variance within each group (i.e., not all of the scores within each group are the same).

Variation among groups may arise from treatment effects (i.e., the populations from which the samples were derived differ) and error (i.e., random variation). The error can be thought of as any non-systematic source of variation, such as differences between participants, variables associated with the dependent variable that varied randomly across treatments, and non-systematic differences in the manner in which the participants were treated. Variation within groups can be attributed only to error (i.e., individual differences, randomly varying variables, and non-systematic treatment of participants).

Notice that the between and within sources of variation differ only in regard to the treatment effect. Consequently, we can assess the extent to which there is a treatment effect (i.e., differences among populations) with a ratio of between-treatment to within-treatment variation:

$$F = \frac{\text{between-treatment variation}}{\text{within-treatment variation}} = \frac{\text{Treatment Effect} + \text{Error}}{\text{Error}}$$

If there is no treatment effect (i.e., the populations do not differ), we expect the F ratio to be approximately 1. To the extent that there is a treatment effect, we expect the F ratio to be greater than 1.

The variance partitioning approach calculates the F ratio by comparing variance between groups and variance within groups. Recall that variance is simply the squared
standard deviation, and for a sample the formula for variance is

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{SS}{df}$$

Recall that the numerator of the formula, SS (sum of squares), is the sum of each score's squared deviation from the mean, and the denominator divides the numerator by one less than the number of observations. The denominator is also referred to as degrees of freedom (df). Degrees of freedom reflect the amount of independent information. For example, if we have 5 scores with a mean of 30, there are 4 df. That is, 4 scores are free to take on any value, while the value of the 5th score depends on the values of the other scores to produce a mean of 30.

In ANOVA terminology, variance is referred to as Mean Square (MS): variance is the squared standard deviation, and the standard deviation reflects the mean or average variation (i.e., mean variation squared). So the F value is a ratio of MS between and MS within, and each mean square is a ratio of SS and df:

$$F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between} / df_{between}}{SS_{within} / df_{within}}$$

Given the numerous pieces of information, the F is typically summarized in an ANOVA table. The following ANOVA table contains the formulas for calculating each component.

Source     SS                                          df         MS                             F
Between    $\sum_j n_j (\bar{Y}_j - \bar{Y})^2$         $k - 1$    $SS_{between} / df_{between}$    $MS_{between} / MS_{within}$
Within     $\sum_j \sum_i (Y_{ij} - \bar{Y}_j)^2$       $N - k$    $SS_{within} / df_{within}$
Total      $\sum_j \sum_i (Y_{ij} - \bar{Y})^2$         $N - 1$

$k$ = number of groups
$n_j$ = number of scores in the jth group
$N$ = total number of scores (i.e., $\sum_j n_j$)
$\bar{Y}$ = grand mean (i.e., mean of all the Y scores)
$\bar{Y}_j$ = mean of the jth group
$Y_{ij}$ = score of the ith person in the jth group

Given that the between-treatment variation accounts for variation among the treatment groups, SSbetween is computed (see table) as the sum of the squared deviations of each group mean from the grand mean, each weighted (i.e., multiplied) by the number of persons in that group. Likewise, because SSwithin accounts for variation within the treatment groups, it is computed (see table) as the sum of squared deviations of each score from its group's mean. Finally, it should be
noted that SSbetween and SSwithin are additive components of SStotal:

$$SS_{total} = SS_{between} + SS_{within}$$

SStotal reflects the total variation in the dependent variable and is computed (see table) as the sum of squared deviations of each score from the grand mean (i.e., the mean of all the scores). The additive relationship of SSbetween and SSwithin reflects the variance partitioning strategy of ANOVA and indicates that variation in the dependent variable can be divided into variation attributable to the independent variable
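
To make the ANOVA table formulas concrete, here is a minimal Python sketch that applies them to the nine depression scores from the example. The function name `one_way_anova`, the dictionary layout, and the result keys are illustrative choices, not part of the course materials, and the code uses only the standard library rather than any particular statistics package.

```python
# Post-treatment depression scores from the example table.
depression_scores = {
    "No Therapy": [7, 7, 6],
    "Smiling Therapy": [2, 1, 1],
    "Exercise Therapy": [3, 1, 2],
}


def one_way_anova(groups):
    """Compute one-way ANOVA table entries (hypothetical helper, for illustration)."""
    all_scores = [y for scores in groups.values() for y in scores]
    k = len(groups)                 # number of groups
    n_total = len(all_scores)       # N, total number of scores
    grand_mean = sum(all_scores) / n_total
    group_means = {name: sum(s) / len(s) for name, s in groups.items()}

    # SS between: squared deviation of each group mean from the grand mean,
    # weighted by the number of scores in that group.
    ss_between = sum(
        len(s) * (group_means[name] - grand_mean) ** 2
        for name, s in groups.items()
    )
    # SS within: squared deviation of each score from its own group mean.
    ss_within = sum(
        (y - group_means[name]) ** 2
        for name, s in groups.items()
        for y in s
    )
    # SS total: squared deviation of each score from the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova(depression_scores)
print(f"SS between = {result['ss_between']:.2f} (df = {result['df_between']})")
print(f"SS within  = {result['ss_within']:.2f} (df = {result['df_within']})")
print(f"SS total   = {result['ss_total']:.2f}")
print(f"F          = {result['F']:.2f}")
```

For these nine scores the sketch gives SS between of about 50.67 and SS within of about 3.33, which sum to the SS total of 54, and an F of 45.6 on 2 and 6 degrees of freedom.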