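Two-Sample Tests for Binomial Proportions

Suppose the variable under study is not continuous but is instead classified into categories.

Examples:
1. Proportion of students who are male at SPHTM compared with the SOM
2. Proportion of families with incomes below the US poverty line in Louisiana, compared with Mississippi, compared with Texas
3. Women who are disease free at baseline. After 10 years, compare the incidence of cervical cancer for those who use oral contraceptives with those who do not use oral contraceptives.

For the first example:
1. Let $p_1$ = probability that a student at SPHTM is male
2. Let $p_2$ = probability that a student at SOM is male

The question is whether the probability of being male is the same in the two groups.

$H_0: p_1 = p_2 = p$
$H_1: p_1 \neq p_2$

We will analyze by two approaches:
1. Normal-theory methods
2. Contingency tables

We will show that the two approaches are equivalent and yield the same p-values.

Normal-Theory Method

The significance test is based on $\hat{p}_1 - \hat{p}_2$. Samples must be large enough so that the normal approximation to the binomial distribution is valid:

$$n_1 \hat{p}\hat{q} \ge 5 \quad \text{and} \quad n_2 \hat{p}\hat{q} \ge 5$$

where, under $H_0$,

$$\hat{p}_1 \sim N\!\left(p, \frac{pq}{n_1}\right), \qquad \hat{p}_2 \sim N\!\left(p, \frac{pq}{n_2}\right), \qquad \hat{p}_1 - \hat{p}_2 \sim N\!\left(0, \frac{pq}{n_1} + \frac{pq}{n_2}\right) \text{ or } N\!\left(0, pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$$

Therefore,

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim N(0, 1)$$

Note: $p$ and $q$ are unknown and must be estimated. The best estimator for $p$ is a weighted average of the sample proportions $\hat{p}_1$ and $\hat{p}_2$:

$$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$$

where $x_1$ = observed number of events in the first sample
and $x_2$ = observed number of events in the second sample.

Continuity Correction:

If $\hat{p}_1 \ge \hat{p}_2$, subtract $\left(\frac{1}{2n_1} + \frac{1}{2n_2}\right)$ from the numerator.
If $\hat{p}_1 < \hat{p}_2$, add $\left(\frac{1}{2n_1} + \frac{1}{2n_2}\right)$ to the numerator.

Or, to simplify, rewrite the numerator as

$$|\hat{p}_1 - \hat{p}_2| - \left(\frac{1}{2n_1} + \frac{1}{2n_2}\right)$$

and reject $H_0$ only for large positive values of $z$.

Two-Sample Test for Binomial Proportions with Continuity Correction

$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$

$$z = \frac{|\hat{p}_1 - \hat{p}_2| - \left(\frac{1}{2n_1} + \frac{1}{2n_2}\right)}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where

$$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} \quad \text{and} \quad \hat{q} = 1 - \hat{p}$$

For a two-tailed test at level $\alpha$: if $z > z_{1-\alpha/2}$, reject $H_0$. If $z \le z_{1-\alpha/2}$, accept $H_0$.

p-value: $p = 2[1 - \Phi(z)]$

Example: Suppose we sample 100 students in the SPHTM and 40 are male. Suppose we sample 60 students in the SOM and 30 are male.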
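Use a two-tailed test to test whether the proportions of students who are male differ between the two schools. Conduct the test at the 5% level of significance.

$$\hat{p}_1 = \frac{40}{100} = 0.400, \qquad \hat{p}_2 = \frac{30}{60} = 0.500$$

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{40 + 30}{100 + 60} = 0.4375, \qquad \hat{q} = 1 - \hat{p} = 1 - 0.4375 = 0.5625$$

Note:

$$n_1\hat{p}\hat{q} = 100 \times 0.4375 \times 0.5625 = 24.609 \ge 5, \qquad n_2\hat{p}\hat{q} = 60 \times 0.4375 \times 0.5625 = 14.766 \ge 5$$

Reject $H_0$ if $z > z_{1-\alpha/2} = z_{.975} = 1.96$.

Note: We need only the upper tail since we are using a modified test statistic that uses the absolute value $|\hat{p}_1 - \hat{p}_2|$.

$$z = \frac{|\hat{p}_1 - \hat{p}_2| - \left(\frac{1}{2n_1} + \frac{1}{2n_2}\right)}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{|0.40 - 0.50| - \left(\frac{1}{200} + \frac{1}{120}\right)}{\sqrt{0.4375 \times 0.5625 \left(\frac{1}{100} + \frac{1}{60}\right)}} = \frac{0.1 - 0.01333}{\sqrt{0.0065625}} = \frac{0.08667}{0.08101} = 1.07$$

Since $z = 1.07 < 1.96$, do not reject $H_0$.

p-value $= 2[1 - \Phi(1.07)] = 2(0.1423) = 0.2846$

We conclude that there is no significant difference in the proportion of males between the two schools.

This is a test to compare two proportions in two populations. It is also called a test of homogeneity.

EXAMPLE: In an air pollution study, a random sample of 200 households was selected from each of two communities. A respondent in each household was asked whether or not anyone in the household was bothered by air pollution. In Community I, 43 respondents indicated someone in the household was bothered by air pollution. In Community II, 81 respondents indicated someone in the household was bothered by air pollution. Can the researchers conclude that the two proportions of households with someone bothered by air pollution differ significantly, or is the difference simply due to chance? Conduct the test at the 5% level of significance.

$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$

$$\hat{p}_1 = \frac{43}{200} = 0.215, \qquad \hat{p}_2 = \frac{81}{200} = 0.405$$

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{43 + 81}{200 + 200} = \frac{124}{400} = 0.31, \qquad \hat{q} = 1 - \hat{p} = 1 - 0.31 = 0.69$$

Note:

$$n_1\hat{p}\hat{q} = n_2\hat{p}\hat{q} = 200 \times 0.31 \times 0.69 = 42.78 \ge 5$$

Reject $H_0$ if $z > z_{1-\alpha/2} = z_{.975} = 1.96$.

Reminder: We need only the upper tail since we are using a modified test statistic that uses the absolute value $|\hat{p}_1 - \hat{p}_2|$.

$$z = \frac{|\hat{p}_1 - \hat{p}_2| - \left(\frac{1}{2n_1} + \frac{1}{2n_2}\right)}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{|0.215 - 0.405| - \left(\frac{1}{2 \times 200} + \frac{1}{2 \times 200}\right)}{\sqrt{0.31 \times 0.69 \left(\frac{1}{200} + \frac{1}{200}\right)}} = \frac{0.19 - 0.005}{\sqrt{0.002139}} = \frac{0.185}{0.046249} = 4.00$$

Since $z = 4.00 > 1.96$, reject $H_0$.

p-value $= 2[1 - \Phi(4.00)] < 2(0.0001) = 0.0002$

We can conclude that the proportion of households with someone bothered by air pollution is not the same in the two communities.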
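The continuity-corrected test used in both worked examples can be checked numerically. The following is a minimal sketch in Python using only the standard library. The function name `two_sample_proportion_test` and its argument names are hypothetical, chosen for illustration. Capping the p-value at 1 is an added safeguard not in the notes, for the case where the correction exceeds the observed difference.

```python
import math


def two_sample_proportion_test(x1, n1, x2, n2):
    """Two-tailed z-test of H0: p1 = p2 with continuity correction.

    x1, x2 are the observed numbers of events; n1, n2 are the sample sizes.
    Returns (z, p_value). Function and argument names are illustrative.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2

    # Pooled estimate of p under H0 (weighted average of sample proportions).
    p_hat = (x1 + x2) / (n1 + n2)
    q_hat = 1 - p_hat

    # Validity check for the normal approximation to the binomial.
    if n1 * p_hat * q_hat < 5 or n2 * p_hat * q_hat < 5:
        raise ValueError("normal approximation not valid: n*p*q < 5")

    # Continuity correction subtracted from |p1_hat - p2_hat|.
    correction = 1 / (2 * n1) + 1 / (2 * n2)
    std_error = math.sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
    z = (abs(p1_hat - p2_hat) - correction) / std_error

    # Standard normal CDF via the error function.
    phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))

    # Two-tailed p-value; capped at 1 (an assumption added here) in case
    # the correction makes z negative.
    p_value = min(1.0, 2 * (1 - phi))
    return z, p_value


# Students example: 40 of 100 at SPHTM, 30 of 60 at SOM.
z_students, p_students = two_sample_proportion_test(40, 100, 30, 60)
print(f"students: z = {z_students:.2f}, p-value = {p_students:.4f}")

# Air pollution example: 43 of 200 in Community I, 81 of 200 in Community II.
z_air, p_air = two_sample_proportion_test(43, 200, 81, 200)
print(f"air pollution: z = {z_air:.2f}, p-value = {p_air:.6f}")
```

Up to rounding, the two calls should agree with the hand calculations: z near 1.07 for the students example (do not reject H0 at the 5% level) and z near 4.00 for the air pollution example (reject H0).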