UW-Madison STAT 371 - STAT 371 Lecture Notes

Analysis of Variance
Bret Larget
Department of Statistics
University of Wisconsin - Madison
November 18, 2004
Statistics 371, Fall 2004

Analysis of Variance
• Analysis of variance (ANOVA) is a statistical procedure for analyzing data that may be treated as multiple independent samples with a single quantitative measurement for each sampled individual.
• ANOVA is a generalization of the methods we saw earlier in the course for two independent samples.
• The bucket of balls model is that we have $I$ different buckets of balls, each of which contains numbered balls.
• The population means and standard deviations of the numbers in each bucket are $\mu_i$ and $\sigma_i$ respectively for $i = 1, \ldots, I$.
• In ANOVA, we often assume that all of the population standard deviations are equal.

Cuckoo Birds
• Cuckoo birds have a behavior in which they lay their eggs in other birds' nests.
• The other birds then raise and care for the newly hatched cuckoos.
• Cuckoos return year after year to the same territory and lay their eggs in the nests of a particular host species.
• Furthermore, cuckoos appear to mate only within their territory.
• Therefore, geographical sub-species are developed, each with a dominant foster-parent species.
• A general question is, are the eggs of the different sub-species distinct so that they are adapted to a particular foster-parent species?
• Specifically, we can ask, are the mean lengths of the cuckoo eggs the same in the different sub-species?

Display of Cuckoo Bird Egg Lengths
Here is a plot of egg lengths (mm) of cuckoo bird eggs categorized by the species of the host bird.
[Figure: egg lengths from 20 to 25 mm by host species (birdSpecies): HedgeSparrow, MeadowPipet, PiedWagtail, Robin, TreePipet, Wren]

A Dotplot of the Data
[Figure: dotplot of egg lengths from 20 to 25 mm for the same six host species]

The Big Picture
• ANOVA is a statistical procedure where we test the null hypothesis that all population means are equal versus the alternative hypothesis that they are not all equal.
• The test statistic is a ratio of the variability among sample means over the variability within samples.
• When this ratio is large, this indicates evidence against the null hypothesis.
• The test statistic will have a different form than what we have previously seen. The null distribution is an F distribution, named after Ronald Fisher.
• An ANOVA table is an accounting method for computing the test statistic.
• We introduce a lot of new notation on the way. . . .

Notation
This notation is used to describe calculations of variability within samples and variability among samples, although for historical reasons of poor grammar, the term between samples is more commonly used.
$y_{ij}$ = the $j$th observation in the $i$th group
$I$ = the number of groups
$n_i$ = the $i$th sample size
$\bar{y}_{i\cdot}$ = the mean of the $i$th sample
$n_* = \sum_{i=1}^{I} n_i$ = the total number of observations
$\bar{y}_{\cdot\cdot} = \dfrac{\sum_{i=1}^{I} \sum_{j=1}^{n_i} y_{ij}}{n_*}$ = the grand mean

Sums of Squares within Groups
We measure variability by sums of squared deviations. The sum of squares within groups, or SS(within), is a combined measure of the variability within all groups.
$$\mathrm{SS(within)} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2 = \sum_{i=1}^{I} (n_i - 1) s_i^2$$
Notice that this measure of variability is a weighted sum of the sample variances where the weights are the degrees of freedom for each respective sample.

Degrees of Freedom
• The degrees of freedom within samples is simply the sum of degrees of freedom for each sample.
• This is equal to the total number of observations minus the number of groups.
$$\mathrm{df(within)} = \sum_{i=1}^{I} (n_i - 1) = n_* - I$$

Mean Square Within
• In ANOVA, a mean square will be the ratio of a sum of squares over the corresponding degrees of freedom.
$$\mathrm{MS(within)} = \frac{\mathrm{SS(within)}}{\mathrm{df(within)}} = \frac{(n_1 - 1) s_1^2 + \cdots + (n_I - 1) s_I^2}{n_* - I}$$
• In other words, the mean square within is a weighted average of the sample variances where the weights are the degrees of freedom within each sample.
• The mean square within is the estimate of the common variance for all the $I$ populations, and its square root is the estimate of the common standard deviation.
$$s_{\mathrm{pooled}} = \sqrt{\mathrm{MS(within)}}$$

Sums of Squares Between (Among) Means
• We measure variability by sums of squared deviations. The sum of squares between groups, or SS(between), is a measure of the variability among sample means.
$$\mathrm{SS(between)} = \sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$$
• Notice that this measure of variability is a weighted sum of the squared deviations of the sample means from the grand mean, weighted by sample size.

Degrees of Freedom
• The degrees of freedom between samples is simply the number of groups minus one.
$$\mathrm{df(between)} = I - 1$$

Mean Square Between
In ANOVA, a mean square will be the ratio of a sum of squares over the corresponding degrees of freedom.
$$\mathrm{MS(between)} = \frac{\mathrm{SS(between)}}{\mathrm{df(between)}} = \frac{\sum_{i=1}^{I} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2}{I - 1}$$

Total Sum of Squares
• If we treated all observations as coming from a single population (which would be the case if all population means were equal and all population standard deviations were equal as well), then it would make sense to measure deviations from the grand mean.
• This is the total sum of squares.
$$\mathrm{SS(total)} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2$$
• It turns out that the total sum of squares can be decomposed into the sum of squares within and the sum of squares between.
$$\mathrm{SS(total)} = \mathrm{SS(within)} + \mathrm{SS(between)}$$
• Similarly, the total degrees of freedom would be $n_* - 1$.
• There is a similar decomposition.
$$\mathrm{df(total)} = \mathrm{df(within)} + \mathrm{df(between)}$$
$$n_* - 1 = (n_* - I) + (I - 1)$$

The F Statistic
• The F statistic is the ratio of the mean square between over the mean square within.
$$F = \frac{\mathrm{MS(between)}}{\mathrm{MS(within)}}$$
• If the populations are normal, the population means are all equal, the standard deviations are all equal, and all observations are independent, then the F statistic has an F distribution with $I - 1$ and $n_* - I$ degrees of freedom.
• An F distribution takes only positive values and is skewed right like the chi-square distribution, but it has two separate degrees of freedom, the numerator degrees of freedom and the denominator degrees of freedom.
• If $X_1$ and $X_2$ are independent $\chi^2$ random variables with $k_1$ and $k_2$ degrees of freedom respectively, then
$$F = \frac{X_1 / k_1}{X_2 / k_2}$$
has an F distribution with $k_1$ and $k_2$ degrees of freedom.

ANOVA Table for the Cuckoo Example
> fit =
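
The quantities defined in these notes (sums of squares, degrees of freedom, mean squares, and the F statistic) can be computed directly from their definitions. The following is a minimal sketch in Python, not the code from the lecture: the function name `anova_table` and the three small groups of numbers are hypothetical and chosen only so the arithmetic is easy to check by hand. They are not the cuckoo egg measurements.

```python
import math


def anova_table(groups):
    """Compute one-way ANOVA quantities from their definitions.

    `groups` maps a group label to a list of observations (hypothetical input format).
    """
    samples = list(groups.values())
    num_groups = len(samples)                      # I
    n_total = sum(len(s) for s in samples)         # n_*, total number of observations
    group_means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n_total

    # SS(within): squared deviations of each observation from its own group mean
    ss_within = sum(
        (y - m) ** 2 for s, m in zip(samples, group_means) for y in s
    )
    # SS(between): squared deviations of group means from the grand mean,
    # weighted by sample size
    ss_between = sum(
        len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, group_means)
    )
    # SS(total): squared deviations of each observation from the grand mean
    ss_total = sum((y - grand_mean) ** 2 for s in samples for y in s)

    df_within = n_total - num_groups
    df_between = num_groups - 1
    ms_within = ss_within / df_within
    ms_between = ss_between / df_between

    return {
        "ss_within": ss_within,
        "ss_between": ss_between,
        "ss_total": ss_total,
        "df_within": df_within,
        "df_between": df_between,
        "ms_within": ms_within,
        "ms_between": ms_between,
        "s_pooled": math.sqrt(ms_within),
        "F": ms_between / ms_within,
    }


# Made-up toy data, for illustration only
toy = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
table = anova_table(toy)
for name, value in table.items():
    print(f"{name:>11}: {value:.4f}")
```

For this toy data, SS(within) and SS(between) add up to SS(total), and the degrees of freedom add up to $n_* - 1$, matching the decompositions stated above. Turning F into a p-value would additionally require the F distribution with df(between) and df(within) degrees of freedom, which this sketch leaves out.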

