UW-Madison STAT 371 - two-sample-handout

This preview shows page 1-2-3-4 out of 13 pages.

Two Independent Samples

Bret Larget
Departments of Botany and of Statistics
University of Wisconsin–Madison
Statistics 371
25th October 2005

Comparing Two Groups

- Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test.
- The basic structure of the confidence interval is the same as in the previous chapter: an estimate plus or minus a multiple of a standard error.
- Hypothesis testing will introduce several new concepts.

Setting

- Model two populations as buckets of numbered balls.
- The population means are $\mu_1$ and $\mu_2$, respectively.
- The population standard deviations are $\sigma_1$ and $\sigma_2$, respectively.
- We are interested in estimating $\mu_1 - \mu_2$ and in testing the hypothesis that $\mu_1 = \mu_2$.

[Figure: two populations. Population 1 has mean $\mu_1$ and sd $\sigma_1$ and yields the sample $y_1^{(1)}, \ldots, y_{n_1}^{(1)}$ with sample mean $\bar{y}_1$ and sample sd $s_1$. Population 2 has mean $\mu_2$ and sd $\sigma_2$ and yields the sample $y_1^{(2)}, \ldots, y_{n_2}^{(2)}$ with sample mean $\bar{y}_2$ and sample sd $s_2$.]

Standard Error of $\bar{y}_1 - \bar{y}_2$

- The standard error of the difference in two sample means is an empirical measure of how far the difference in sample means will typically be from the difference in the respective population means.

$$\mathrm{SE}(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

- An alternative formula is

$$\mathrm{SE}(\bar{y}_1 - \bar{y}_2) = \sqrt{\left(\mathrm{SE}(\bar{y}_1)\right)^2 + \left(\mathrm{SE}(\bar{y}_2)\right)^2}$$

- This formula reminds us of how to find the length of the hypotenuse of a right triangle.
- (Variances add, but standard deviations don't.)

Pooled Standard Error

- If we wish to assume that the two population standard deviations are equal, $\sigma_1 = \sigma_2$, then it makes sense to use data from both samples to estimate the common population standard deviation.
- We estimate the common population variance with a weighted average of the sample variances, weighted by the degrees of freedom.

$$s_{\text{pooled}}^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

- The pooled standard error is then as below.

$$\mathrm{SE}_{\text{pooled}} = s_{\text{pooled}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Sampling Distributions

The sampling distribution of the difference in sample means has these characteristics.

- Mean: $\mu_1 - \mu_2$
- SD: $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
- Shape: Exactly normal if both populations are normal, approximately normal if populations are not normal but both sample sizes are sufficiently large.

Theory for Confidence Interval

- The recipe for constructing a confidence interval for a single population mean is based on facts about the sampling distribution of the statistic

$$T = \frac{\bar{Y} - \mu}{\mathrm{SE}(\bar{Y})}.$$

- Similarly, the theory for confidence intervals for $\mu_1 - \mu_2$ is based on the sampling distribution of the statistic

$$T = \frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\mathrm{SE}(\bar{Y}_1 - \bar{Y}_2)}$$

  where we standardize by subtracting the mean and dividing by the standard deviation of the sampling distribution.

Theory (cont.)

- If both populations are normal and if we know the population standard deviations, then

$$P\left\{ -1.96 \le \frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \le 1.96 \right\} = 0.95$$

  where we can choose $z$ other than 1.96 for different confidence levels.
- This statement is true because the expression in the middle has a standard normal distribution.

Theory (cont.)

- But in practice, we don't know the population standard deviations.
- If we substitute in sample estimates instead, we get this.

$$P\left\{ -t \le \frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \le t \right\} = 0.95$$

- We need to choose different end points to account for the additional randomness in the denominator.
- It turns out that the sampling distribution of the statistic above is approximately a t distribution where the degrees of freedom should be estimated from the data as well.

Theory (cont.)

- Algebraic manipulation leads to the following expression.

$$P\left\{ (\bar{Y}_1 - \bar{Y}_2) - t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{Y}_1 - \bar{Y}_2) + t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \right\} = 0.95$$

- We use a t multiplier so that the area between $-t$ and $t$ under a t distribution with the estimated degrees of freedom will be 0.95.

Confidence Interval for $\mu_1 - \mu_2$

- The confidence interval for differences in population means has the same structure as that for a single population mean.

  (Estimate) ± (t Multiplier) × SE

- The only difference is that for this more complicated setting, we have more complicated formulas for the standard error and the degrees of freedom.
- Here is the df formula.

$$\mathrm{df} = \frac{\left(\mathrm{SE}_1^2 + \mathrm{SE}_2^2\right)^2}{\mathrm{SE}_1^4/(n_1 - 1) + \mathrm{SE}_2^4/(n_2 - 1)}$$

  where $\mathrm{SE}_i = s_i/\sqrt{n_i}$ for $i = 1, 2$.
- As a check, the value is often close to $n_1 + n_2 - 2$. (This will be exact if $s_1 = s_2$ and if $n_1 = n_2$.)
- The value from the messy formula will always be between the smaller of $n_1 - 1$ and $n_2 - 1$ on the low end and $n_1 + n_2 - 2$ on the high end.

Example: Exercise 7.12

- In this example, subjects with high blood pressure are randomly allocated to two treatments.
- The biofeedback group receives relaxation training aided by biofeedback and meditation over eight weeks.
- The control group does not.
- Reduction in systolic blood pressure is tabulated here.

              Biofeedback   Control
  n                    99        93
  $\bar{y}$          13.8       4.0
  SE                 1.34      1.30

Example (cont.)

- Both the simple and the messy formulas give 190 degrees of freedom. The table, after rounding down to 140 degrees of freedom, gives a multiplier of 1.977, whereas R gives 1.973.
- A calculator or R can compute the margin of error.

    > se = sqrt(1.34^2 + 1.3^2)
    > tmult = qt(0.975, 190)
    > me = round(tmult * se, 1)
    > se
    [1] 1.866976
    > tmult
    [1] 1.972528
    > me
    [1] 3.7

We are 95% confident that the mean reduction in systolic blood pressure due to the biofeedback treatment in a population of individuals similar to those in this study would be between 6.1 and 13.5 mm more than the mean reduction in the same population undergoing the control treatment.

Example Using R: Exercise 7.21

- This exercise examines the growth of bean plants under red and green light.
- Read in the data.

    > ex7.21 = read.table("lights.txt", header = T)

- Examine the structure of the data.

    > str(ex7.21)
    'data.frame':   42 obs. of  2 variables:
     $ height: num  8.4 8.4 10 8.8 7.1 9.4 8.8 4.3 9 8.4 ...
     $ color : Factor w/ 2 levels "green","red": 2 2 2 2 2 2 2 2 2 2 ...

Example (cont.)

- Examine side-by-side boxplots.

    > attach(ex7.21)
    > boxplot(split(height, color))

[Figure: side-by-side boxplots of height for the green and red groups; the height axis runs from 5 to 10.]

Example (cont.)

- Carry out t-test.

    > t.test(height ~ color)

            Welch Two Sample t-test

    data:  height by color
    t = 1.1432, df = 38.019, p-value = 0.2601
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.4479687  1.6103216
    sample estimates:
    mean in group green   mean in group red
               8.940000            8.358824

Example Assuming Equal Variances

- For the same data, were we to assume that the population variances were equal, the degrees of freedom, the standard error, and the confidence interval would all be slightly different.

    > t.test(height ~ color, var.equal = T)

            Two Sample t-test

    data:  height by color
    t = 1.1064, df =
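
As a cross-check on the Exercise 7.12 calculation earlier in this handout, the following is a minimal R sketch that starts from the tabulated summary statistics and applies the standard error formula, the degrees of freedom formula, and the (Estimate) ± (t Multiplier) × SE recipe. The variable names (`se_diff`, `df_welch`, `ci`, and so on) are illustrative choices and do not come from the handout.

```r
# Summary statistics as tabulated for Exercise 7.12
n1 <- 99; ybar1 <- 13.8; se1 <- 1.34   # biofeedback group
n2 <- 93; ybar2 <- 4.0;  se2 <- 1.30   # control group

# Standard error of the difference: variances add, standard deviations don't
se_diff <- sqrt(se1^2 + se2^2)

# Degrees of freedom estimated from the data (the "messy" formula)
df_welch <- (se1^2 + se2^2)^2 / (se1^4 / (n1 - 1) + se2^4 / (n2 - 1))

# t multiplier that leaves area 0.95 between -t and t
tmult <- qt(0.975, df_welch)

# 95% confidence interval: estimate plus or minus multiplier times SE
ci <- (ybar1 - ybar2) + c(-1, 1) * tmult * se_diff

print(round(c(se = se_diff, df = df_welch, t = tmult), 3))
print(round(ci, 1))
```

Run as written, this should agree, up to rounding, with the values quoted in the example: roughly 190 degrees of freedom, a multiplier near 1.973, and an interval from about 6.1 to 13.5. Passing the computed `df_welch` to `qt` rather than the rounded value 190 is a design choice; non-integer degrees of freedom are accepted by `qt`.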

