Cal Poly STAT 218 - Comparing Several Means: Analysis of Variance

Stat 218 - Day 28
Comparing Several Means: Analysis of Variance

Today we return to analyzing quantitative variables. Several weeks ago we studied two-sample t-tests for comparing means between two groups. This week we’ll extend that procedure to comparing means between several groups. This resulting procedure is called analysis of variance, abbreviated ANOVA.

Example: Discrimination against handicapped

In a study of whether physical handicaps affect perceptions of employment qualifications, researchers showed videotapes of an employer interviewing an applicant, but both men were actors following a set script. In the videotapes the “applicant” appeared with different handicaps (none, amputee, crutches, hearing, wheelchair). A group of seventy students were randomly assigned to watch a videotape and evaluate the applicant’s qualifications on a ten-point scale. The study aimed to assess whether the data provide evidence that the mean qualification scores differ significantly among the various handicaps presented.

(a) Is this an observational study or an experiment? Explain.

(b) Identify the explanatory and the response variable. For each one, indicate whether it is quantitative or categorical.

ANOVA applies to situations like this where the explanatory variable is categorical and the response variable is quantitative. If the categorical explanatory variable was also binary, then we could use a two-sample t-test to analyze the data. But in this situation we have five categories to compare, and we need a test that will compare all five simultaneously. Before we get to the test, though, we start with graphical and numerical summaries of the data.

(c) Examine dotplots and boxplots of the qualification ratings among the five groups (HandicapApply.mtw). Then examine summary statistics. Do the groups appear to differ with regard to qualification ratings?
Explain.

Now some notation:
• $I$ = number of groups
• $\mu_i$ = population mean for group i
• $n_i$ = sample size for group i
• $\bar{y}_{i\cdot}$ = sample mean for group i
• $s_i$ = sample standard deviation for group i
• n* = grand (overall) sample size
• $\bar{y}_{\cdot\cdot}$ = grand (overall) mean for all groups combined
• $y_{ij}$ = jth observation in group i

We want to test the null hypothesis that all population means are equal ($H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$). The key idea of ANOVA is to compare variability between groups to variability within groups. We’ll measure this variability through sums of squares and mean squares.

• Total sum of squares: SS(total) = $\sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2$
• Total degrees of freedom: df(total) = n* - 1

(d) Calculate SS(total) for the handicap study.

• Within-group sum of squares: SS(within) = $\sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2 = \sum_{i=1}^{I} (n_i - 1) s_i^2$
• Within-group degrees of freedom: df(within) = n* - I
• Within-group mean square: MS(within) = SS(within) / df(within)

(e) Calculate SS(within), df(within), and MS(within) for the handicap study.

ANOVA combines information on variability within all of the groups to create a pooled estimate of the standard deviation within a group:
• $s_{pooled} = \sqrt{MS(within)}$

(f) Calculate the pooled standard deviation for the handicap study.

• Between-group sum of squares: SS(between) = $\sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$
• Between-group degrees of freedom: df(between) = I - 1
• Between-group mean square: MS(between) = SS(between) / df(between)

(g) Calculate SS(between), df(between), and MS(between) for the handicap study.

• SS(total) = SS(between) + SS(within)

(h) Verify that this relationship holds for the handicap study.

To organize all of these calculations, we construct an ANOVA table.

(i) Create an ANOVA table for the handicap study. Then use Minitab to verify this table (Stat > ANOVA > Oneway).

• Test statistic: $F_s$ = MS(between) / MS(within)

(j) Key question: Do large or small values of $F_s$ provide evidence against the null hypothesis that all population means are equal? Explain.
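The sums of squares, degrees of freedom, and mean squares defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `anova_quantities` and the three small groups of numbers are made up for the example and are not the handicap study data.

```python
from math import sqrt


def anova_quantities(groups):
    """Compute one-way ANOVA quantities for a list of groups (lists of numbers)."""
    n_star = sum(len(g) for g in groups)             # grand sample size n*
    num_groups = len(groups)                         # I
    grand_mean = sum(sum(g) for g in groups) / n_star
    group_means = [sum(g) / len(g) for g in groups]

    # SS(total): squared deviations of every observation from the grand mean
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    # SS(within): squared deviations of each observation from its own group mean
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )
    # SS(between): squared deviations of group means from the grand mean,
    # weighted by group size
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )

    df_between = num_groups - 1
    df_within = n_star - num_groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_total": n_star - 1,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "s_pooled": sqrt(ms_within),
    }


# Made-up toy data (NOT the handicap study): three groups of three ratings
toy_groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
toy_result = anova_quantities(toy_groups)
for name, value in toy_result.items():
    print(name, value)
```

For this toy data SS(between) = 24 and SS(within) = 6, which add up to SS(total) = 30, so the decomposition can be checked by hand before applying the same steps to the real data.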
(k) Calculate the value of the test statistic Fs for the handicap study.

Is this value large? To answer this, we need to compare it to a reference distribution in order to calculate the P-value.

• P-value: obtained from the F-distribution (Table 10) with numerator df = df(between) and denominator df = df(within)

(l) Determine the P-value as accurately as possible.

(m) Would you reject the null hypothesis at the α = 0.05 significance level? Explain.

(n) Summarize the conclusion that you would draw from this study.

(o) What would change in the ANOVA table if the sample means were further apart? How would the P-value change? How would your conclusion change? Explain.

(p) What would change in the ANOVA table if the qualification ratings in each group were further apart? How would the P-value change? How would your conclusion change?
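As a rough software alternative to looking up the P-value in an F table, the upper-tail area of the F-distribution can be approximated numerically. The sketch below is one possible approach, not the handout's method: the function name `f_upper_tail` is hypothetical, the mean squares and degrees of freedom are made-up illustration values rather than results from the handicap study, and the integral is approximated with Simpson's rule instead of a statistics library.

```python
from math import exp, lgamma


def f_upper_tail(f_stat, df_num, df_den, steps=2000):
    """Approximate P(F >= f_stat) for an F(df_num, df_den) distribution.

    Uses the identity P(F >= f) = I_x(df_den/2, df_num/2) with
    x = df_den / (df_den + df_num * f), where I_x is the regularized
    incomplete beta function, evaluated here by Simpson's rule.
    Assumes f_stat > 0, df_den >= 2, and an even number of steps.
    """
    a, b = df_den / 2, df_num / 2
    x = df_den / (df_den + df_num * f_stat)
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Simpson's rule on [0, x]: weights 1, 4, 2, 4, ..., 4, 1
    total = integrand(0.0) + integrand(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    integral = total * h / 3

    # Normalizing constant: the complete beta function B(a, b)
    beta = exp(lgamma(a) + lgamma(b) - lgamma(a + b))
    return integral / beta


# Made-up illustration values (NOT from the handicap study)
ms_between, ms_within = 12.0, 1.0
f_stat = ms_between / ms_within
p_value = f_upper_tail(f_stat, df_num=2, df_den=6)
print(f_stat, round(p_value, 4))  # prints 12.0 0.008
```

With 2 numerator degrees of freedom the tail area has the closed form (1 + 2F/df_den)^(-df_den/2), which gives 5^(-3) = 0.008 here and offers a quick check on the approximation. A small P-value like this one corresponds to a large F statistic relative to the reference distribution.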

