Inference for Proportions
Inference for a Single Proportion (Section 8.1)

Analysis of categorical data
Objective: One- and two-sample analysis of data on categorical variables
  Data are counts or percents. Ex: percent of Virginians who favor Issue X
  Parameters are population proportions. Ex: percent of all Virginians
  Estimates are sample proportions. Ex: percent of Virginians in an opinion poll

Sampling framework for proportions
  Population proportion: the proportion of “successes” in the population. Denote by $p$
  Sample success count: the number of “successes” in the sample. Denote by $X$
  Sample proportion: the proportion of “successes” in the sample. Denote by $\hat{p} = X/n$

Properties of the sample proportion
Suppose the sample is selected by SRS. Some properties of $\hat{p}$ are:
  Mean: $\mu_{\hat{p}} = p$
  Standard deviation: $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$
  Approximate Normality: if the sample size, $n$, is large then $\hat{p}$ is approximately $N\left(p, \sqrt{p(1-p)/n}\right)$

One-sample z test for proportions
  Assumptions: Large SRS
  Hypotheses: $H_0: p = p_0$ versus a one- or two-sided $H_a$
  Test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
  P-value: $P(Z \le z)$ for $H_a: p < p_0$
           $P(Z \le -z)$ for $H_a: p > p_0$
           $2P(Z \le -|z|)$ for $H_a: p \ne p_0$
  ROT: Valid if both [formula missing] and [formula missing]

Example: Coin flipping
Are coin flips really fair? While a POW in WWII, John Kerrich flipped a coin 10,000 times and observed 5067 heads.
  Hypotheses: $H_0: p = 0.5$ versus $H_a: p \ne 0.5$
  Summary statistic: $\hat{p} = 5067/10000 = 0.5067$
  Test statistic: $z = \dfrac{0.5067 - 0.5}{\sqrt{0.5(1-0.5)/10000}} = 1.34$

Example: Coin flipping (continued)
  Test statistic: $z = 1.34$
  P-value: $2P(Z \le -1.34) = 0.180$
  Decision: Do not reject $H_0$ at significance level $\alpha = 0.05$; the data are consistent with fair coin flips
  ROT: satisfied

One-sample z confidence interval for proportions
  Assumptions: Large SRS
  Target parameter: $p$
  CI formula: For confidence level $C$, the interval is $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$, where $z^*$ is such that $P(Z \le -z^*) = (1 - C)/2$
  ROT: Valid if [formula missing] and [formula missing]

Example: Work-stress impact
How much does work-stress affect one’s personal life?
Each worker in a survey of n = 100 restaurant workers was asked “Does work-stress have a negative impact on your personal life?” The count of “Yes” responses is X = 68.
  Data: SRS of size n = 100 (large)
  Summary statistic: $\hat{p} = 68/100 = 0.68$

Example: Work-stress impact (continued)
  95% CI: $P(Z \le -1.96) = 0.025 \Rightarrow z^* = 1.96$, and the CI is $0.68 \pm 1.96\sqrt{0.68(0.32)/100} = 0.68 \pm 0.09 = (0.59, 0.77)$
  Conclude that between 59% and 77% of all restaurant workers would answer “Yes, work-stress has a negative impact on my personal life”
  ROT: satisfied

Standard deviations
Neither the test nor the CI procedure directly uses $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$, because $p$ is unknown.
  The test substitutes its “null” value, $\sqrt{p_0(1-p_0)/n}$
  The CI substitutes the analogous standard error, $\sqrt{\hat{p}(1-\hat{p})/n}$

Accuracy
Both the test and CI rely on rules of thumb for accuracy:
  The test is valid if [formula missing] and [formula missing]
  The CI is valid if [formula missing] and [formula missing]
Alternative procedures are available:
  Use specialized methods for alternative tests
  An alternative CI is…

One-sample plus-four confidence interval
  Assumptions: Large SRS
  Target parameter: $p$
  CI formula: For confidence level $C$, the interval is $\tilde{p} \pm z^* \sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$, where $\tilde{p} = (X+2)/(n+4)$ is the Wilson estimate of $p$ and $z^*$ is such that $P(Z \le -z^*) = (1 - C)/2$
  ROT: Valid if $n \ge 10$

Selecting a sample size
A sample size may be chosen to target a desired margin of error.
  Margin of error: $m = z^* \sqrt{p^*(1-p^*)/n}$, which gives $n = (z^*/m)^2\, p^*(1-p^*)$
  Desired confidence level, $C$, provides $z^*$
  Margin of error, $m$
  $p^*$ is an educated guess of the true $p$
  A conservative setting is $p^* = 1/2$, which yields a larger $n$ than any other $p^*$

Example: Work-stress impact (continued)
In a new study of work-stress impact, what sample size is needed for a margin of error no more than m = 0.05, with 95% confidence?
(i.e., $C = 0.95 \Rightarrow z^* = 1.96$)
  Set $p^* = 0.75$ (from national study): $n = (1.96/0.05)^2 (0.75)(0.25) = 288.1$, so use $n = 289$
  Set $p^* = 0.50$ (conservative): $n = (1.96/0.05)^2 (0.50)(0.50) = 384.2$, so use $n = 385$

Inference for Proportions
Comparing Two Proportions (Section 8.2)

Categorical data in comparative experiments
  Use specialized methods for matched pairs experiments
  Here, focus on the two-sample setup:

                          Population 1    Population 2
  Population proportion   $p_1$           $p_2$
  Sample size             $n_1$           $n_2$
  Sample success count    $X_1$           $X_2$
  (the two samples are independent)

Properties of the sample proportion
Start by estimating $p_1 - p_2$ with $\hat{p}_1 - \hat{p}_2$, where $\hat{p}_1 = X_1/n_1$ and $\hat{p}_2 = X_2/n_2$.
  Mean: $p_1 - p_2$
  Standard deviation: $\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
  When $p = p_1 = p_2$, this is $\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

Two-sample z test for proportions
  Assumptions: Large, independent SRSs drawn from distinct populations
  Hypotheses: $H_0: p_1 = p_2$ versus a one- or two-sided $H_a$
  Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$
  P-value: $P(Z \le z)$ for $H_a: p_1 < p_2$
           $P(Z \le -z)$ for $H_a: p_1 > p_2$
           $2P(Z \le -|z|)$ for $H_a: p_1 \ne p_2$
  ROT: Valid if [formula missing]

Example: Gender and garment labels
Do the genders respond differently to “No Sweat” garment labels? In a study of buying behavior, $n_1 = 296$ women and $n_2 = 251$ men were interviewed and classified as likely or unlikely to be influenced by the presence of a “No Sweat” garment label.
Among them:
  $X_1 = 63$ women are “likely” $\Rightarrow \hat{p}_1 = 63/296 = 0.2128$
  $X_2 = 27$ men are “likely” $\Rightarrow \hat{p}_2 = 27/251 = 0.1076$
Want to test for gender differences: $H_0: p_1 = p_2$ versus $H_a: p_1 \ne p_2$

Example: Gender and garment labels (continued)
  Data: Independent SRSs of sizes $n_1 = 296$ and $n_2 = 251$ from distinct populations
  Hypotheses: $H_0: p_1 = p_2$ versus $H_a: p_1 \ne p_2$
  Summary statistics: $\hat{p}_1 = 0.2128$ and $\hat{p}_2 = 0.1076$
  Test statistic: $\hat{p} = \dfrac{63 + 27}{296 + 251} = 0.1645 \Rightarrow \sqrt{0.1645(0.8355)\left(\frac{1}{296} + \frac{1}{251}\right)} = 0.0318 \Rightarrow z = \dfrac{0.2128 - 0.1076}{0.0318} = 3.31$

Example: Gender and garment labels (continued)
  Test statistic: $z = 3.31$
  P-value: $2P(Z \le -3.31) = 0.001$
  Decision: Reject $H_0$ at significance level $\alpha = 0.05$, and conclude that genders respond differently to “No Sweat” garment labels
  ROT: satisfied

Two-sample z confidence interval for proportions
  Assumptions: Large, independent SRSs drawn from distinct populations
  Target parameter: $p_1 - p_2$
  CI formula: For confidence level $C$, the interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, where $z^*$ is such that $P(Z \le -z^*) = (1 - C)/2$
  ROT: Valid if [formula missing]

Example: Gender and garment labels (continued)
How big is the gender difference in responses to “No Sweat” garment labels? Answer with a confidence interval for $p_1 - p_2$.
  95% CI: $P(Z \le -1.96) = 0.025 \Rightarrow z^* = 1.96$, and the CI is $(0.2128 - 0.1076) \pm 1.96\sqrt{\dfrac{0.2128(0.7872)}{296} + \dfrac{0.1076(0.8924)}{251}}$

Example: Gender and garment labels (continued)
  95% CI: 0.11 ± 0.06 = (0.04, 0.17), with endpoints computed before rounding
  Conclude a gender difference of between 4 and 17 percentage points in the percent likely to be influenced by the presence of a “No Sweat” garment label
  ROT: satisfied

Standard deviations
Neither the test nor the CI procedure directly uses the standard deviation of $\hat{p}_1 - \hat{p}_2$, because $p_1$ and $p_2$ are unknown.
  The test substitutes the std. error of its “null” value, a.k.a. “pooled”
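
As a check on the two-sample arithmetic in the garment-label example, the test and interval can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function names `two_sample_z_test` and `two_sample_z_interval` are hypothetical, and the standard library's `statistics.NormalDist` stands in for a Normal table. It assumes the pooled standard error for the test and the unpooled standard error for the interval, as in the procedures above.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution, used in place of Table A


def two_sample_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 (hypothetical helper).

    Uses the pooled estimate p_hat = (X1 + X2) / (n1 + n2) in the
    standard error, as the test procedure above does.
    Returns (z, p_value).
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_pooled
    return z, 2 * Z.cdf(-abs(z))


def two_sample_z_interval(x1, n1, x2, n2, conf=0.95):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_star = Z.inv_cdf(1 - (1 - conf) / 2)  # P(Z <= -z*) = (1 - C)/2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Garment-label data from the example: 63 of 296 women, 27 of 251 men.
z, p_value = two_sample_z_test(63, 296, 27, 251)
low, high = two_sample_z_interval(63, 296, 27, 251)
print(f"z = {z:.2f}, P-value = {p_value:.3f}")
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

The unrounded interval is roughly (0.045, 0.166), which is why the slide reports the endpoints as 4% and 17% even though 0.11 - 0.06 would suggest 0.05.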
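
The one-sample procedures of Section 8.1 (z test, large-sample interval, plus-four interval, and sample-size calculation) can be checked numerically as well. The following is a minimal sketch under stated assumptions: all function names are hypothetical, `statistics.NormalDist` replaces the Normal table (so z* is 1.95996 rather than the rounded 1.96), and the sample size is rounded up to the next whole number.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution


def z_star(conf):
    """Critical value z* with P(Z <= -z*) = (1 - C)/2."""
    return Z.inv_cdf(1 - (1 - conf) / 2)


def one_sample_z_test(x, n, p0):
    """Two-sided z test of H0: p = p0; returns (z, p_value)."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # "null" standard deviation
    return z, 2 * Z.cdf(-abs(z))


def z_interval(x, n, conf=0.95):
    """Large-sample CI: p_hat +/- z* sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    m = z_star(conf) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - m, p_hat + m


def plus_four_interval(x, n, conf=0.95):
    """Plus-four CI built on the Wilson estimate (X + 2) / (n + 4)."""
    p_tilde = (x + 2) / (n + 4)
    m = z_star(conf) * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - m, p_tilde + m


def sample_size(m, p_guess=0.5, conf=0.95):
    """Smallest n with margin of error at most m, given a guess p*."""
    return ceil((z_star(conf) / m) ** 2 * p_guess * (1 - p_guess))


# Coin flipping: 5067 heads in 10,000 flips, H0: p = 0.5.
z, p_value = one_sample_z_test(5067, 10_000, 0.5)
print(f"z = {z:.2f}, P-value = {p_value:.3f}")

# Work-stress survey: X = 68 "Yes" responses out of n = 100.
print("z interval:        ", z_interval(68, 100))
print("plus-four interval:", plus_four_interval(68, 100))

# Sample size for m = 0.05 at 95% confidence.
print("n with p* = 0.75:", sample_size(0.05, 0.75))
print("n with p* = 0.50:", sample_size(0.05, 0.50))
```

For the work-stress data the plus-four interval sits slightly closer to 1/2 than the large-sample interval, since the Wilson estimate adds two successes and two failures before computing the proportion.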