Duke STA 101 - Inference when considering two populations - D2400075

School name Duke University

Course Sta 101- Data Analy/stat Infer

Pages 41

This preview shows page 1-2-3-19-20-39-40-41 out of 41 pages.

Slide titles:
Inference when considering two populations
Inference for the difference of two parameters
Inference for P1 – P2
CI for P1 – P2
Slide 5
Inference for difference of two population means μ1 – μ2
Typical study designs
Matched pairs vs two samples
Inference in μ1 – μ2: matched pairs
Slide 10
JMP output for odor example
Slide 12
Conclusions from odors example
Inference in μ1 – μ2: two samples
EDA for pygmalion study
Sample means and SD's
Pygmalion confidence interval
Conclusions from the pygmalion study
Degrees of Freedom
Slide 20
Hypothesis tests for difference of two parameters
Hypothesis test for p1 – p2
Slide 23
Slide 24
Slide 25
Slide 26
Slide 27
Slide 28
Hypothesis test for μ1 – μ2: matched pairs
Conclusions about odor
Slide 31
Inference in μ1 – μ2: Two independent samples
Slide 33
Matched pairs analysis
Conclusions from previous example
Matched pairs cont.
Determining a sample size
Determining sample size
Determine sample size for differences in % and average
Slide 40
Determining sample sizes for differences in % and avg.
Not completely in FPP but good stuff anyway

Inference when considering two populations

Inference for the difference of two parameters
Often we are interested in comparing the population average or the population proportion/percentage for two groups.
We can do these types of comparisons using CIs and hypothesis tests.
General ideas and equations don't change:
  CI: estimate ± multiplier × SE
  Test statistic: (observed – expected)/SE

Inference for P1 – P2
Let's just jump right into an example.

CI for P1 – P2
Estimate ± multiplier × SE
Multiplier comes from the z-table.
Everything else we know about confidence intervals is the same:
  Interpretation
  What does 95% confidence mean?
$\hat{p}_1 - \hat{p}_2 \pm \text{multiplier} \times \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

Inference for difference of two population means μ1 – μ2
There are two possibilities for collecting data on two variables here:
  Design 1: Units are matched in pairs. Use "matched pairs inference".
  Design 2: Units are not matched in pairs. Use "two sample inference".

Typical study designs
Matched pairs
  A) Two treatments are given to each unit.
  B) Units are paired before treatments are assigned, then treatments are assigned randomly within pairs.
Two samples
  A) Some units are assigned to get only treatment a, and other units are assigned to get only treatment b. Assignment is completely at random.
  B) Units in two different groups are compared on some survey variable.

Matched pairs vs two samples
Data collected in two independent samples:
  There is no matching, so creating values of some "difference" is meaningless.
  A "matched pairs" analysis is mathematically wrong and gives incorrect CIs and p-values.
Data collected in matched pairs:
  Matching, when effective, reduces the SE.
  A two sample analysis artificially inflates the SE, resulting in excessively wide CIs and unreliable p-values.
An example towards the end of these slides will demonstrate this.

Inference in μ1 – μ2: matched pairs
The general idea with a matched pairs design is to compute the difference for each pair of observations and treat the differences as a single variable.
Measure y1 and y2 on each unit. Then for each unit compute d = y1 – y2.
Then find a confidence interval for the difference:
  difference estimate ± multiplier × SE
  average of differences ± t-table value × SD of differences/√n

Inference in μ1 – μ2: matched pairs
Do people perform better on tests when smelling flowers versus smelling nothing?
Hirsch and Johnston (1996) asked 21 subjects to work a maze while wearing a mask. The mask was either unscented or carried a floral scent. Each subject worked both mazes. The order of the masks was randomized to ensure a fair comparison of the two treatments. The response is the difference in completion times for the unscented and scented masks.
Example: Person 1 completed the maze in 30.60 seconds while wearing the unscented mask, and in 37.97 seconds while wearing the scented mask. So, this person's data value is –7.37 (30.60 – 37.97).

JMP output for odor example
The differences appear to follow the normal curve.
There are no outliersSample average difference is 0.96, suggesting people do better with scented mask.01.05.10.25.50.75.90.95.99-2-10123Normal Quantile Plot-30 -20 -10 0 10 20 30MeanStd DevStd Err Meanupper 95% Meanlow er 95% MeanN0.956666712.5478822.73817236.6683939-4.755061 21MomentsHypothesized ValueActual EstimatedfStd Dev 00.95667 2012.5479Test StatisticProb > |t|Prob > tProb < t 0.3494 0.7305 0.3652 0.6348t TestTest Mean=valueDifferenceDistributionsConclusions from odors exampleThe 95% CI ranges from -4.76 to 6.67, which is too wide a range to determine whether floral odors help or hurt performance for these mazes. In other words, the data suggest that any effect of scented masks is small enough that we cannot estimate it with reasonable accuracy using these 21 subjects. We should collect more data to estimate the effect of the odor more precisely.We also note that this study was very specific. The results may not be easily generalized to other populations, other tests, or other treatments.Inference in μ1 – μ2: two samplesPygmalion studyResearchers gave IQ test to elementary school kids.They randomly picked six kids and told teachers the test predicts these kids have high potential for accelerated growth.They randomly picked different six kids and told teachers the test predicts these kids have no potential for growth. At end of school year, they gave IQ test again to all students. They recorded the change in IQ scores of each student.Let’s see what they found…EDA for pygmalion studyIt looks like being labeled “accelerated” leads to larger improvements than being labeled “no growth”Let’s make a 99% CI to confirm thisImprovement05101520accelerated noneGrow th GroupSample means and SD’s Level Number Mean SD SE accelerated 6 15.17 4.708 1.92 none 6 6.17 3.656 1.49Sample difference is 9.00. 
The SE of this difference:
$SE = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}} = \sqrt{SE_1^2 + SE_2^2} = 2.43$

Pygmalion confidence interval
99% CI for difference in mean scores (accel – none):
Estimate ± multiplier × SE
Estimate is mean1 – mean2.
Multiplier comes from the t-table (we will talk about df in a sec.)
SE of difference is from the previous slide.
$\bar{x}_1 - \bar{x}_2 \pm$ …
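As a rough check on the arithmetic for the Pygmalion example, the two-sample SE can be recomputed from the group SDs and sample sizes in the table of sample means and SDs. The sketch below is an illustration only: `two_sample_se` and `interval` are hypothetical helper names, and the t-table multiplier is left as an argument because it depends on the degrees of freedom, which this excerpt does not give.

```python
import math


def two_sample_se(sd1, n1, sd2, n2):
    """SE of the difference of two independent sample means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def interval(estimate, multiplier, se):
    """Generic 'estimate +/- multiplier * SE' interval.

    The multiplier must be looked up by the caller (here, from the t-table).
    """
    return estimate - multiplier * se, estimate + multiplier * se


# Values from the Pygmalion table: accelerated (n=6) and none (n=6).
se_accel = 4.708 / math.sqrt(6)               # per-group SE, about 1.92
se_none = 3.656 / math.sqrt(6)                # per-group SE, about 1.49
se_diff = two_sample_se(4.708, 6, 3.656, 6)   # about 2.43
estimate = 15.17 - 6.17                       # sample difference, 9.00

print(f"SE (accelerated): {se_accel:.2f}")
print(f"SE (none): {se_none:.2f}")
print(f"SE of difference: {se_diff:.2f}")
print(f"Estimate: {estimate:.2f}")
```

Once a multiplier has been read from the t-table, `interval(estimate, multiplier, se_diff)` returns the lower and upper limits.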
