UVA STAT 2120 - Topic_09 (1)

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Inference for Proportions

Inference for a Single Proportion (Section 8.1)

Analysis of categorical data

Objective: One- and two-sample analysis of data on categorical variables
- Data are counts or percents. Ex: percent of Virginians who favor Issue X
- Parameters are population proportions. Ex: percent of all Virginians
- Estimates are sample proportions. Ex: percent of Virginians in an opinion poll

Sampling framework for proportions
- Population proportion: the proportion of "successes" in the population. Denote by $p$
- Sample success count: the number of "successes" in the sample. Denote by $X$
- Sample proportion: the proportion of "successes" in the sample. Denote by $\hat{p} = X/n$

Properties of the sample proportion

Suppose the sample is selected by SRS. Some properties of $\hat{p}$ are:
- Mean: $\mu_{\hat{p}} = p$
- Standard deviation: $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$
- Approximate Normality: if the sample size, $n$, is large, then $\hat{p}$ is approximately $N\left(p, \sqrt{p(1-p)/n}\right)$

One-sample z test for proportions
- Assumptions: Large SRS
- Hypotheses: $H_0: p = p_0$ versus a one- or two-sided $H_a$
- Test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
- P-value: $P(Z \le z)$ for $H_a: p < p_0$; $P(Z \le -z)$ for $H_a: p > p_0$; $2P(Z \le -|z|)$ for $H_a: p \ne p_0$
- ROT (rule of thumb): Valid if both […] and […]

Example: Coin flipping

Are coin flips really fair? While a POW in WWII, John Kerrich flipped a coin 10,000 times and observed 5067 heads.
- Hypotheses: $H_0: p = 0.5$ versus $H_a: p \ne 0.5$
- Summary statistic: $\hat{p} = 5067/10000 = 0.5067$
- Test statistic: $z = \dfrac{0.5067 - 0.5}{\sqrt{(0.5)(0.5)/10000}} = 1.34$

Example: Coin flipping (continued)
- Test statistic: $z = 1.34$
- P-value: $2P(Z \le -1.34) = 0.180$
- Decision: Do not reject $H_0$ at significance level $\alpha = 0.05$; the data are consistent with fair coin flips
- ROT: ✓

One-sample z confidence interval for proportions
- Assumptions: Large SRS
- Target parameter: $p$
- CI formula: For confidence level $C$, the interval is $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$, where $z^*$ is such that $P(Z \le -z^*) = (1 - C)/2$
- ROT: Valid if […] and […]

Example: Work-stress impact

How much does work-stress affect one's personal life? Each of $n = 100$ restaurant workers in a survey was asked "Does work-stress have a negative impact on your personal life?" The count of "Yes" responses is $X = 68$.
- Data: SRS of size $n = 100$ (large)
- Summary statistic: $\hat{p} = 68/100 = 0.68$

Example: Work-stress impact (continued)
- 95% CI: $P(Z \le -1.96) = 0.025 \Rightarrow z^* = 1.96$, and the CI is $0.68 \pm 1.96\sqrt{(0.68)(0.32)/100} = 0.68 \pm 0.09 = (0.59, 0.77)$
- Conclude that between 59% and 77% of all restaurant workers would answer "Yes, work-stress has a negative impact on my personal life"
- ROT: ✓

Standard deviations

Neither the test nor the CI procedure directly uses $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$, because $p$ is unknown.
- The test substitutes its "null" value, $\sqrt{p_0(1-p_0)/n}$
- The CI substitutes the analogous standard error, $SE_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n}$

Accuracy

Both the test and CI rely on rules of thumb for accuracy:
- The test is valid if […] and […]
- The CI is valid if […] and […]

Alternative procedures are available:
- Use specialized methods for alternative tests
- An alternative CI is…

One-sample plus-four confidence interval
- Assumptions: Large SRS
- Target parameter: $p$
- CI formula: For confidence level $C$, the interval is $\tilde{p} \pm z^* \sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$, where $\tilde{p} = (X+2)/(n+4)$ is the Wilson estimate of $p$ and $z^*$ is such that $P(Z \le -z^*) = (1 - C)/2$
- ROT: Valid if $n \ge 10$

Selecting a sample size

A sample size may be chosen to target a desired margin of error: $n = \left(\dfrac{z^*}{m}\right)^2 p^*(1-p^*)$
- Desired confidence level, $C$, provides $z^*$
- Margin of error, $m$
- $p^*$ is an educated guess of the true $p$
- A conservative setting is $p^* = 1/2$, which yields a larger $n$ than any other $p^*$

Example: Work-stress impact (continued)

In a new study of work-stress impact, what sample size is needed for a margin of error no more than $m = 0.05$, with 95% confidence? (i.e., $C = 0.95 \Rightarrow z^* = 1.96$)
- Set $p^* = 0.75$ (from national study): $n = (1.96/0.05)^2 (0.75)(0.25) = 288.1$, so use $n = 289$
- Set $p^* = 0.50$ (conservative): $n = (1.96/0.05)^2 (0.50)(0.50) = 384.2$, so use $n = 385$

Comparing Two Proportions (Section 8.2)

Categorical data in comparative experiments
- Use specialized methods for matched pairs experiments
- Here, focus on the two-sample setup:

Population 1 (proportion $p_1$) ⇒ sample of size $n_1$ with success count $X_1$
Population 2 (proportion $p_2$) ⇒ sample of size $n_2$ with success count $X_2$
The two samples are independent.

Properties of the sample proportion

Start by estimating $p_1 - p_2$ with $\hat{p}_1 - \hat{p}_2$, where $\hat{p}_1 = X_1/n_1$ and $\hat{p}_2 = X_2/n_2$.
- Mean: $p_1 - p_2$
- Standard deviation: $\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
- When $p = p_1 = p_2$, this is $\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

Two-sample z test for proportions
- Assumptions: Large, independent SRSs drawn from distinct populations
- Hypotheses: $H_0: p_1 = p_2$ versus a one- or two-sided $H_a$
- Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$
- P-value: $P(Z \le z)$ for $H_a: p_1 < p_2$; $P(Z \le -z)$ for $H_a: p_1 > p_2$; $2P(Z \le -|z|)$ for $H_a: p_1 \ne p_2$
- ROT: Valid if […]

Example: Gender and garment labels

Do the genders respond differently to "No Sweat" garment labels? In a study of buying behavior, $n_1 = 296$ women and $n_2 = 251$ men were interviewed and classified as likely or unlikely to be influenced by the presence of a "No Sweat" garment label. Among them:
- $X_1 = 63$ women are "likely" ⇒ $\hat{p}_1 = 63/296 = 0.213$
- $X_2 = 27$ men are "likely" ⇒ $\hat{p}_2 = 27/251 = 0.108$

Want to test for gender differences: $H_0: p_1 = p_2$ versus $H_a: p_1 \ne p_2$

Example: Gender and garment labels (continued)
- Data: Independent SRSs of sizes $n_1 = 296$ and $n_2 = 251$ from distinct populations
- Hypotheses: $H_0: p_1 = p_2$ versus $H_a: p_1 \ne p_2$
- Summary statistics: $\hat{p}_1 = 0.213$, $\hat{p}_2 = 0.108$, and pooled $\hat{p} = (63 + 27)/(296 + 251) = 0.165$
- Test statistic: $z = \dfrac{0.213 - 0.108}{\sqrt{(0.165)(0.835)\left(\frac{1}{296} + \frac{1}{251}\right)}} = 3.31$

Example: Gender and garment labels (continued)
- Test statistic: $z = 3.31$
- P-value: $2P(Z \le -3.31) = 0.001$
- Decision: Reject $H_0$ at significance level $\alpha = 0.05$, and conclude that genders respond differently to "No Sweat" garment labels
- ROT: ✓

Two-sample z confidence interval for proportions
- Assumptions: Large, independent SRSs drawn from distinct populations
- Target parameter: $p_1 - p_2$
- CI formula: For confidence level $C$, the interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, where $z^*$ is such that $P(Z \le -z^*) = (1 - C)/2$
- ROT: Valid if […]

Example: Gender and garment labels (continued)

How big is the gender difference in responses to "No Sweat" garment labels? Answer with a confidence interval for $p_1 - p_2$.
- 95% CI: $P(Z \le -1.96) = 0.025 \Rightarrow z^* = 1.96$, and the CI is $(0.213 - 0.108) \pm 1.96\sqrt{\dfrac{(0.213)(0.787)}{296} + \dfrac{(0.108)(0.892)}{251}}$

Example: Gender and garment labels (continued)
- 95% CI: $0.11 \pm 0.06 = (0.04, 0.17)$
- Conclude a gender difference of between 4 and 17 percentage points in the percent likely to be influenced by the presence of a "No Sweat" garment label
- ROT: ✓

Standard deviations

Neither the test nor the CI procedure directly uses the standard deviation of $\hat{p}_1 - \hat{p}_2$, because $p_1$ and $p_2$ are unknown.
- The test substitutes the std. error of its "null" value, a.k.a. "pooled"
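
The worked examples in these notes (coin flipping, work-stress impact, garment labels) can be checked numerically. The following is a minimal Python sketch that uses only the standard library. The function names (`one_sample_z_test`, `one_sample_ci`, `plus_four_ci`, `sample_size`, `two_sample_z_test`, `two_sample_ci`) are illustrative and do not come from the course materials. The sketch assumes the standard large-sample formulas for the test statistics and intervals, reports two-sided P-values only, and uses the exact Normal quantile (about 1.95996) rather than the rounded 1.96.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution


def z_star(conf):
    """Critical value z* with P(Z <= -z*) = (1 - conf) / 2."""
    return -Z.inv_cdf((1 - conf) / 2)


def one_sample_z_test(x, n, p0):
    """Return (z, two-sided P-value) for H0: p = p0."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # null standard deviation
    return z, 2 * Z.cdf(-abs(z))


def one_sample_ci(x, n, conf=0.95):
    """Large-sample z interval for p."""
    p_hat = x / n
    m = z_star(conf) * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - m, p_hat + m


def plus_four_ci(x, n, conf=0.95):
    """Plus-four interval based on the Wilson estimate (x + 2) / (n + 4)."""
    p_tilde = (x + 2) / (n + 4)
    m = z_star(conf) * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - m, p_tilde + m


def sample_size(m, p_guess=0.5, conf=0.95):
    """Smallest n whose margin of error is at most m, given a guess p*."""
    return ceil((z_star(conf) / m) ** 2 * p_guess * (1 - p_guess))


def two_sample_z_test(x1, n1, x2, n2):
    """Return (z, two-sided P-value) for H0: p1 = p2, using the pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * Z.cdf(-abs(z))


def two_sample_ci(x1, n1, x2, n2, conf=0.95):
    """Large-sample z interval for p1 - p2 (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    m = z_star(conf) * se
    return (p1 - p2) - m, (p1 - p2) + m


# Coin flipping: z is about 1.34, two-sided P-value about 0.180
print(one_sample_z_test(5067, 10000, 0.5))
# Work-stress impact: interval is about (0.59, 0.77)
print(one_sample_ci(68, 100))
# Plus-four alternative for the same data
print(plus_four_ci(68, 100))
# Sample sizes for m = 0.05: 289 with p* = 0.75, 385 with p* = 0.50
print(sample_size(0.05, 0.75), sample_size(0.05, 0.50))
# Garment labels: z is about 3.31, two-sided P-value about 0.001
print(two_sample_z_test(63, 296, 27, 251))
# Garment labels: interval is about (0.04, 0.17)
print(two_sample_ci(63, 296, 27, 251))
```

The test functions use the null (or pooled) standard deviation while the interval functions use the standard error built from the sample proportions, mirroring the distinction drawn under "Standard deviations". One-sided P-values would replace `2 * Z.cdf(-abs(z))` with `Z.cdf(z)` or `Z.cdf(-z)`, depending on the direction of the alternative.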

