Independent Samples Comparing Lecture 41 Proportions Section 11 5 Fri Apr 14 2006 Comparing Proportions We now wish to compare proportions between two populations Normally we would be measuring proportions for the same attribute For example we could measure the proportion of NC residents living below the poverty level and the proportion of VA residents living below the poverty level Examples The gender gap the proportion of men who vote Republican vs the proportion of women who vote Republican The proportion of teenagers who smoked marijuana in 1995 vs the proportion of teenagers who smoked marijuana in 2000 Examples The proportion of patients who recovered given treatment A vs the proportion of patients who recovered given treatment B Treatment A could be a placebo Comparing proportions To estimate the difference between population proportions p1 and p2 we need the sample proportions p1 and p2 The difference p1 p2 is an estimator of the difference p1 p2 Hypothesis Testing See Example 11 8 p 721 Perceptions of the U S Canadian versus French p1 proportion of Canadians who feel positive about the U S p2 proportion of French who feel positive about the U S Hypothesis Testing The hypotheses H0 p1 p2 0 i e p1 p2 H1 p1 p2 0 i e p1 p2 The significance level is 0 05 What is the test statistic That depends on the sampling distribution of p1 p2 The Sampling Distribution of p 1 p 2 If the sample sizes are large enough then p1 is N p1 1 where p 1 p1 1 1 n1 Similarly p2 is N p2 2 where p2 1 p2 2 n2 The Sampling Distribution of p1 p2 The sample sizes will be large enough if n1p1 5 and n1 1 p1 5 and n2p2 5 and n2 1 p2 5 The Sampling Distribution of p 1 p 2 Therefore p1 p 2 is N p1 p2 12 2 2 where 2 1 2 2 2 1 2 p1 1 p1 p2 1 p2 n1 n2 2 p1 1 p1 p2 1 p2 n1 n2 The Test Statistic Therefore the test statistic would be p 1 p 2 0 Z p1 1 p1 p2 1 p2 n1 n2 if we knew the values of p1 and p2 We could estimate them with p1 and p2 But there may be a better way Pooled Estimate of p In hypothesis testing for the difference between proportions typically the null hypothesis is H0 p1 p2 Under that assumption p1 and p2 are both estimators of a common value call it p Pooled Estimate of p Rather than use either p1 or p2 alone to estimate p we will use a pooled estimate The pooled estimate is the proportion that we would get if we pooled the two samples together Pooled Estimate of p The Batting Average Formula x1 p 1 x1 n1 p 1 n1 x2 p 2 x2 n2 p 2 n2 x1 x2 n1 p 1 n2 p 2 p n1 n2 n1 n2 The Standard Deviation of p1 p2 This leads to a better estimator of the standard deviation of p1 p2 2 1 2 2 2 1 1 p 1 p n1 n2 1 2 2 1 1 p 1 p n1 n2 Caution If the null hypothesis does not say H0 p1 p2 then we should not use the pooled estimate p but should use the unpooled estimate p 1 1 p 1 p 2 1 p 2 p 1 p 2 n1 n2 The Test Statistic So the test statistic is Z p 1 p 2 0 1 1 p 1 p n1 n2 The Value of the Test Statistic Compute p 366 611 977 p 0 2741 1017 2547 3564 Now compute z 0 3599 0 2399 0 12 Z 7 253 1 0 01655 1 0 2741 0 7259 1017 2547 The p value etc Compute the p value P Z 7 253 2 059 10 13 Reject H0 The data indicate that a greater proportion of Canadians than French have a positive feeling about the U S Exercise 11 34 Sample 1 361 Wallace cars reveal that 270 have the sticker Sample 2 178 Humphrey cars reveal that 154 have the sticker Do these data indicate that p1 p2 Example State the hypotheses H0 p1 p2 H1 p1 p2 State the level of significance 0 05 Example Write the test statistic z p 1 p 2 0 1 1 p 1 p n1 n2 Example Compute p1 p2 and p 270 p 1 0 7479 361 154 p 2 0 8652 178 270 154 424 p 0 7866 361 178 539 Example Now we can compute z 0 7479 0 8652 0 1173 Z 3 126 1 0 03752 1 0 7866 1 0 7866 361 178 Example Compute the p value p value 2 normalcdf E99 3 126 2 0 0008861 0 001772 The decision is to reject H0 State the conclusion The data indicate that the proportion of Wallace cars that have the sticker is different from the proportion of Humphrey cars that have the sticker TI 83 Testing Hypotheses Concerning p1 p2 Press STAT TESTS 2PropZTest Enter x1 n1 x2 n2 Choose the correct alternative hypothesis Select Calculate and press ENTER TI 83 Testing Hypotheses Concerning p1 p2 In the window the following appear The title The alternative hypothesis The value of the test statistic z The p value p1 p2 The pooled estimate p n1 n2 Example Work Exercise 11 34 using the TI 83 Confidence Intervals for p1 p2 The formula for a confidence interval for p1 p2 is p 1 p 2 z p1 1 p1 p2 1 p2 n1 n2 Caution Note that we do not use the pooled estimate for p TI 83 Confidence Intervals for p1 p2 Press STAT TESTS 2 PropZInt Enter x1 n1 x2 n2 The confidence level Select Calculate and press ENTER TI 83 Confidence Intervals for p1 p2 In the window the following appear The title The confidence interval p1 p2 n1 n2 Example Find a 95 confidence interval using the data in Exercise 11 34
View Full Document