Duke STA 101 - Inference when considering two populations - D754531

School name Duke University

Course Sta 101- Data Analy/stat Infer

Pages 16

11/5/09

Not in FPP

Inference when considering two populations

Inference for the difference of two parameters
• Often we are interested in comparing the population average or the population proportion/percentage for two groups
• We can do these types of comparisons using CIs and hypothesis tests
• General ideas and equations don't change
  • CI: estimate ± multiplier × SE
  • Test statistic: (observed – expected)/SE

Inference for P1 – P2
• Let's just jump right into an example

CI for P1 – P2
• Estimate ± multiplier × SE
• Multiplier comes from the z-table
• Everything else we know about confidence intervals is the same
  • Interpretation
  • What 95% confidence means

$$\hat{p}_1 - \hat{p}_2 \pm \text{multiplier} \times \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Inference for difference of two population means
• Two possibilities in collecting data on two variables here
  • Design 1: units are matched in pairs
    • Use "matched pairs inference"
  • Design 2: units are not matched in pairs
    • Use "two sample inference"

Typical study designs
• Matched pairs
  • A) two treatments given to each unit
  • B) units paired before treatments are assigned, then treatments are assigned randomly within pairs
• Two samples
  • A) some units assigned to get only treatment a, and other units assigned to get only treatment b. Assignment is completely at random
  • B) units in two different groups compared on some survey variable

Inference in µ1 – µ2: matched pairs
• General idea with a matched pairs design is to compute the difference for each pair of observations and treat the differences as a single variable
• Measure y1 and y2 on each unit. Then for each unit compute
  • d = y1 – y2
• Then find a confidence interval for the difference
  • difference estimate ± multiplier × SE
  • average of differences ± t-table value × SD of differences/√n

Inference in µ1 – µ2: matched pairs
• Do people perform better on tests when smelling flowers versus smelling nothing?
• Hirsch and Johnston (1996) asked 21 subjects to work a maze while wearing a mask. The mask was either unscented or carried a floral scent. Each subject worked both mazes. The order of the masks was randomized to ensure a fair comparison of the two treatments. The response is the difference in completion times for the unscented and scented masks.
• Example: Person 1 completed the maze in 30.60 seconds while wearing the unscented mask, and in 37.97 seconds while wearing the scented mask.
• So, this person's data value is –7.37 (30.60 – 37.97).

JMP output for odor example
• The differences appear to follow the normal curve. There are no outliers
• Sample average difference is 0.96, suggesting people do better with the scented mask

Conclusions from odors example
• The 95% CI ranges from -4.76 to 6.67, which is too wide a range to determine whether floral odors help or hurt performance for these mazes. In other words, the data suggest that any effect of scented masks is small enough that we cannot estimate it with reasonable accuracy using these 21 subjects. We should collect more data to estimate the effect of the odor more precisely.
• We also note that this study was very specific. The results may not be easily generalized to other populations, other tests, or other treatments.

Inference in µ1 – µ2: two samples
• Pygmalion study
  • Researchers gave an IQ test to elementary school kids.
  • They randomly picked six kids and told teachers the test predicts these kids have high potential for accelerated growth.
  • They randomly picked six different kids and told teachers the test predicts these kids have no potential for growth.
  • At the end of the school year, they gave the IQ test again to all students.
  • They recorded the change in IQ scores of each student.
• Let's see what they found…

EDA for Pygmalion study
• It looks like being labeled "accelerated" leads to larger improvements than being labeled "no growth"
• Let's make a 99% CI to confirm this

Sample means and SDs

Level        Number  Mean   SD     SE
accelerated  6       15.17  4.708  1.92
none         6       6.17   3.656  1.49

Sample difference is 9.00. The SE of this difference: $\sqrt{1.92^2 + 1.49^2} \approx 2.43$

Pygmalion confidence interval
• 99% CI for difference in mean scores (accel – none): 9.00 ± t-table value × 2.43

Conclusions from the Pygmalion study
• The 99% CI ranges from 1.20 to 16.80, which lies entirely above zero. The data provide evidence that students labeled "accelerated" have higher average improvements in IQ than students labeled "no growth." We are 99% confident the difference in averages is between 1.2 and 16.8 IQ points.

Degrees of Freedom
• Use the Welch-Satterthwaite degrees of freedom formula

$$\text{df} = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$

National supported work demonstration experiment
• Recall the NSWD randomly assigned 1600 women to receive job training or not to receive it
• Response is salary in the year after the training (1979).

Sample means and SDs

Group     Number  Mean  SD    SE
training  600     4670  5536  226
none      585     3819  5030  208

The sample difference is $851. The SE of this difference is: $\sqrt{226^2 + 208^2} \approx 307$

Conclusions from NSW
• 95% CI for the difference in average wages: (851 – 1.96 × 307, 851 + 1.96 × 307) = (249, 1453).
• Therefore, we conclude that the training increases wages on average relative to no training, with the average increase plausibly between 250 and 1450 dollars.

Hypothesis tests for difference of two parameters
• The main ideas of hypothesis tests remain the same
  • 1) specify hypotheses
  • 2) compute test statistic: (observed – expected)/SE
  • 3) calculate p-value
  • 4) make conclusions

Hypothesis test for p1 – p2
• Herson (1971) examined whether men or women are more likely to suffer from nightmares.
He asked a random sample of 160 men and 192 women whether they experienced nightmares "often" (at least once a month) or "seldom" (less than once a month).
• In the sample, 55 men (34.4%) and 60 women (31.3%) said they suffered nightmares often. Is this 3.1% difference sufficient evidence
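
The standard-error, interval, and test-statistic arithmetic used in the examples above can be sketched in Python. This is a minimal illustration using only the standard library, not the course's own software output. The function names (`se_diff`, `welch_df`, `z_interval`, `two_prop_z`) are hypothetical. The pooled standard error in `two_prop_z` is an assumption: the preview ends before showing how the nightmare test is carried out, and pooling under the null hypothesis p1 = p2 is one common choice.

```python
import math
from statistics import NormalDist


def se_diff(se1, se2):
    """SE of a difference between two independent estimates."""
    return math.sqrt(se1 ** 2 + se2 ** 2)


def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def z_interval(estimate, se, level=0.95):
    """estimate +/- multiplier * SE, with the multiplier from the normal curve."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


def two_prop_z(x1, n1, x2, n2):
    """(observed - expected) / SE for H0: p1 = p2.

    Assumption: uses the pooled proportion in the SE.
    Returns the z statistic and a two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Pygmalion study: SE of the difference and approximate degrees of freedom.
# The 99% multiplier itself would come from a t-table at this df.
pyg_se = se_diff(1.92, 1.49)
pyg_df = welch_df(4.708, 6, 3.656, 6)

# NSW experiment: large samples, so a z multiplier is used as in the notes.
nsw_se = se_diff(226, 208)
nsw_ci = z_interval(851, nsw_se, level=0.95)

# Nightmare survey: 55 of 160 men and 60 of 192 women answered "often".
nightmare_z, nightmare_p = two_prop_z(55, 160, 60, 192)

print(f"Pygmalion SE = {pyg_se:.2f}, Welch df = {pyg_df:.1f}")
print(f"NSW SE = {nsw_se:.0f}, 95% CI = ({nsw_ci[0]:.0f}, {nsw_ci[1]:.0f})")
print(f"Nightmares: z = {nightmare_z:.2f}, two-sided p = {nightmare_p:.2f}")
```

The NSW interval reproduces the (249, 1453) range quoted in the notes. The Pygmalion degrees of freedom come out non-integer, which is typical of the Welch-Satterthwaite approximation.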