7. Comparing Two Groups

Slides:
• Types of variables and samples
• se for difference between two estimates (independent samples)
• CI comparing two proportions
• Example: College Alcohol Study conducted by Harvard School of Public Health (http://www.hsph.harvard.edu/cas/)
• Comments about CIs for difference between two population proportions
• Quantitative Responses: Comparing Means
• Example: GSS data on “number of close friends”
• Significance Tests for $\mu_2 - \mu_1$
• Equivalence of CI and Significance Test
• Alternative inference comparing means assumes equal population standard deviations
• Test of $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \ne \mu_2$
• How does software get df for “unequal variance” method?
• Some comments about comparing means
• Comparing Means with Dependent Samples
• Some comments
• Final Comment

Goal: Use CI and/or significance test to compare
• means (quantitative variable)
• proportions (categorical variable)

                         Group 1    Group 2    Estimate
Population mean          $\mu_1$    $\mu_2$    $\bar{y}_2 - \bar{y}_1$
Population proportion    $\pi_1$    $\pi_2$    $\hat{\pi}_2 - \hat{\pi}_1$

We conduct inference about the difference between the means or the difference between the proportions (order irrelevant).

Types of variables and samples
• The outcome variable on which comparisons are made is the response variable.
• The variable that defines the groups to be compared is the explanatory variable.
Example: Reaction time is the response variable. Experimental group is the explanatory variable (a categorical variable with categories cell-phone, control). Or, could express experimental group as “cell-phone use” with categories (yes, no).
• Different methods apply for
  – dependent samples: natural matching between each subject in one sample and a subject in the other sample, such as in “longitudinal studies,” which observe subjects repeatedly over time
  – independent samples: different samples, no matching, as in “cross-sectional studies”
Example: We later consider a separate part of the experiment in which the same subjects formed the control group at one time and the cell-phone group at another time.

se for difference between two estimates (independent samples)
• The sampling distribution of the difference between two estimates is approximately normal (large $n_1$ and $n_2$) and has estimated standard error
  $se = \sqrt{(se_1)^2 + (se_2)^2}$
Example: Data on “Response times” has
  32 using cell phone with mean 585.2, s = 89.6
  32 in control group with mean 533.7, s = 65.3
What is the se for the difference between means of 585.2 – 533.7 = 51.4? (The rounded means shown give 51.5; the slide reports 51.4.)
  $se_1 = s_1/\sqrt{n_1} = 89.6/\sqrt{32} =$ ____
  $se_2 = s_2/\sqrt{n_2} = 65.3/\sqrt{32} =$ ____
  $se = \sqrt{(se_1)^2 + (se_2)^2} =$ ____
(Note this is larger than each separate se. Why?)
So, the estimated difference of 51.4 has a margin of error of about 2( ____ ) = ____. The 95% CI is about 51.4 ± ____, or ( ____ , ____ ).
(Good idea to re-do the analysis without the outlier, to check its influence.)

CI comparing two proportions
• Recall the se for a sample proportion used in a CI is
  $se = \sqrt{\hat{\pi}(1 - \hat{\pi})/n}$
• So, the se for the difference between sample proportions for two independent samples is
  $se = \sqrt{(se_1)^2 + (se_2)^2} = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}$
• A CI for the difference between population proportions is
  $(\hat{\pi}_2 - \hat{\pi}_1) \pm z \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}}$
As usual, z depends on the confidence level: 1.96 for 95% confidence.

Example: College Alcohol Study conducted by Harvard School of Public Health (http://www.hsph.harvard.edu/cas/)
Trends over time in percentage of binge drinking (consumption of 5 or more drinks in a row for men and 4 or more for women, at least once in past two weeks) or activities influenced by it?
“Have you engaged in unplanned sexual activities because of drinking alcohol?”
  1993: 19.2% yes of n = 12,708
  2001: 21.3% yes of n = 8783
What is the 95% CI for the change in the proportion saying “yes”?
• Estimated change in proportion saying “yes” is 0.213 – 0.192 = 0.021.
  $se = \sqrt{\frac{\hat{\pi}_1(1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2(1 - \hat{\pi}_2)}{n_2}} =$ ____
95% CI for change in population proportion is 0.021 ± 1.96( ____ ) = 0.021 ± ____
We can be 95% confident that …

Comments about CIs for difference between two population proportions
• If the 95% CI for $\pi_2 - \pi_1$ is (0.01, 0.03), then the 95% CI for $\pi_1 - \pi_2$ is ( ____ , ____ ). It is arbitrary what we call Group 1 and Group 2 and the order of comparing proportions.
• When 0 is not in the CI, we can conclude that one population proportion is higher than the other. (e.g., if all values are positive when we take Group 2 – Group 1, then conclude the population proportion is higher for Group 2 than Group 1)
• When 0 is in the CI, it is plausible that the population proportions are identical.
Example: Suppose the 95% CI for change in population proportion (2001 – 1993) is (-0.01, 0.03).
“95% confident that population proportion saying yes was between ____ smaller and ____ larger in 2001 than in 1993.”
• There is a significance test of $H_0: \pi_1 = \pi_2$ that the population proportions are identical (i.e., difference $\pi_1 - \pi_2 = 0$), using test statistic
  z = (difference between sample proportions)/se
For unplanned sex in 1993 and 2001, z = diff./se = 0.021/0.0056 = ____
Two-sided P-value = ____
This seems to be statistical significance without practical significance!
Details about the test are on pp. 189-190 of the text; use $se_0$, which pools the data to get a better estimate under $H_0$. (We study this test as a special case of the “chi-squared test” in the next chapter, which deals with possibly many groups, many outcome categories.)
• The theory behind the CI uses the fact that sample proportions (and their differences) have approximately normal sampling distributions for large n’s, by the Central Limit Theorem, assuming randomization.
• In practice, the formula works ok if there are at least 10 outcomes of each type for each sample.

Quantitative Responses: Comparing Means
• Parameter: $\mu_2 - \mu_1$
• Estimator: $\bar{y}_2 - \bar{y}_1$
• Estimated standard error: $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  – Sampling dist.: Approximately normal (large n’s, by CLT)
  – CI for independent random samples from two normal population distributions has form
    $(\bar{y}_2 - \bar{y}_1) \pm t(se)$, which is $(\bar{y}_2 - \bar{y}_1) \pm t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  – Formula for df for t-score is complex (later). If both sample sizes are at least 30, can just use z-score.

Example: GSS data on “number of close friends”
Use gender as the explanatory variable:
  486 females with mean 8.3, s = 15.6
  354 males with mean 8.9, s = 15.5
  $se_1 = s_1/\sqrt{n_1} =$ ____
  $se_2 = s_2/\sqrt{n_2} =$ ____
  $se = \sqrt{(se_1)^2 + (se_2)^2} =$ ____
Estimated difference of 8.9 – 8.3 = 0.6 has a margin of error of 1.96( ____ ) = ____, and the 95% CI is 0.6 ± ____, or ( ____ , ____ ).

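The significance test for two proportions described above (z = difference between sample proportions divided by se, with a two-sided P-value) can be sketched in the same way. The function names are hypothetical. The slides mention a pooled standard error se0 without giving its formula; the `pooled=True` branch assumes the standard pooled form, sqrt(p(1 - p)(1/n1 + 1/n2)) with p the overall sample proportion, so treat that branch as an assumption rather than the text's exact definition.

```python
import math

def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) for standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))

def z_test_props(p1, n1, p2, n2, pooled=False):
    """z statistic and two-sided P-value for H0: pi1 = pi2."""
    diff = p2 - p1
    if pooled:
        # Assumed standard pooled se0 (not spelled out on the slides).
        p = (n1 * p1 + n2 * p2) / (n1 + n2)
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        # Unpooled se, the same one used for the CI.
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = diff / se
    return z, two_sided_p(z)

# College Alcohol Study: 1993 (Group 1) versus 2001 (Group 2).
for pooled in (False, True):
    z, p = z_test_props(0.192, 12708, 0.213, 8783, pooled=pooled)
    print(f"pooled={pooled}: z = {z:.2f}, two-sided P-value = {p:.5f}")
```

With samples this large, the pooled and unpooled versions give nearly the same z, and the tiny P-value alongside a difference of only about 0.02 illustrates the slide's point about statistical versus practical significance.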

UF STATISTICS 101 - Comparing Two Groups
