UW-Madison STAT 371 - Chapter 10 Analysis of Categorical Data

Chapter 10 Analysis of Categorical Data
Fall 2011

Introduction

Case Study
Example
In Costa Rica, the vampire bat Desmodus rotundus feeds on the blood of domestic cattle. If the bats respond to a hormonal signal, cows in estrous (in heat) may be bitten with a different probability than cows not in estrous.

                      In estrous   Not in estrous   Total
Bitten by a bat               15                6      21
Not bitten by a bat            7              322     329
Total                         22              328     350

The proportion of bitten cows among those in estrous is 15/22 = .682, while the proportion of bitten cows among those not in estrous is 6/328 = .018.

Case Study
Example
In an experiment, fish are placed in a large tank for a period of time and some are eaten by large birds of prey. The fish are categorized by their level of parasitic infection: uninfected, lightly infected, or highly infected. It is to the parasites' advantage to be in a fish that is eaten, as this provides an opportunity to infect the bird in the parasites' next stage of life. The observed proportions of fish eaten are quite different among the categories.

            Uninfected   Lightly Infected   Highly Infected   Total
Eaten                1                 10                37      48
Not eaten           49                 35                 9      93
Total               50                 45                46     141

The proportions of eaten fish are, respectively, 1/50 = .02, 10/45 = .222, and 37/46 = .804.

Overview
We will study these data from two different points of view:
1. Two sample problem, $p_1$ versus $p_2$
   1. $p_1 - p_2$
   2. Relative Risk, Odds Ratio
2. Chi-square Test
   1. Goodness of fit test
   2. Test for association between two categorical variables

Part I. Two Sample Problem

10.7 Confidence Interval for $p_1 - p_2$

Setup
1. Population picture - board
2. Sampled data:

            Sample 1      Sample 2
Condition   $y_1$         $y_2$
Not         $n_1 - y_1$   $n_2 - y_2$
Total       $n_1$         $n_2$

Recall: One Sample Confidence Interval
We pretend we have 4 more observations (i.e. sample size is $n + 4$) and that out of those 4 extra observations, there are 2 successes and 2 failures (i.e. # successes is $y + 2$).

$$\tilde{p} = \frac{y + 2}{n + 4} \quad \text{and} \quad SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$

A 95% confidence interval for $p$ is $\tilde{p} \pm 1.96 \, SE_{\tilde{p}}$.

Confidence Interval for $p_1 - p_2$
Again, we pretend that we have
1. 4 more observations, split between the two samples: $n_1 + 2$ and $n_2 + 2$.
2. 2 successes among the 4 extra observations, split between the two samples: $y_1 + 1$ and $y_2 + 1$.

A confidence interval for $p_1 - p_2$ is

$$\tilde{p}_1 - \tilde{p}_2 \pm z_{\alpha/2} \, SE_{\tilde{p}_1 - \tilde{p}_2}$$

where

$$\tilde{p}_1 = \frac{y_1 + 1}{n_1 + 2}, \qquad \tilde{p}_2 = \frac{y_2 + 1}{n_2 + 2}$$

$$SE_{\tilde{p}_1 - \tilde{p}_2} = \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2 + 2}}$$

Comment
This confidence interval formula is used for all confidence levels, not just 95%.

Example - on the board
Find a 95% confidence interval for the difference in probabilities of being bitten by a vampire bat between cows in estrous and those not.

Example - Interpretation
In the study setting in Costa Rica, we are 95% confident that the probability that a cow in estrous is bitten by a vampire bat is larger than the probability of a cow not in estrous being bitten by an amount between .456 and .835.

10.9 Relative Risk and the Odds Ratio

Introduction
$p_1 - p_2$ provides information about the magnitude of the difference between $p_1$ and $p_2$.
There are other ways to compare these values, e.g. the ratio.

Relative Risk
Relative risk is a ratio of two probabilities, both of the same event, but under different conditions:

$$\frac{p_1}{p_2}$$

For example, if the probability of a low birthweight baby given that the mother is a smoker is twice as high as if the mother is a nonsmoker, the relative risk of low birthweight for smokers relative to nonsmokers is 2.

Estimate the relative risk
This is simply the ratio of the estimated proportions:

$$\frac{\hat{p}_1}{\hat{p}_2}$$

Example - Bat Bites
From the bat bite table, $\hat{p}_1$ = 15/22 = .682 and $\hat{p}_2$ = 6/328 = .018.
The estimated relative risk is

$$\frac{\hat{p}_1}{\hat{p}_2} = \frac{.682}{.018} \approx 37.9$$

(using the rounded proportions; the unrounded counts give about 37.3). Thus, we estimate that the risk of being bitten is more than 37 times greater for cows in estrous versus cows not in estrous.

The Odds Ratio
Odds ratios are another way to compare probabilities.
If the probability of an event $E$ is $\Pr\{E\}$, the odds of event $E$ are

$$\frac{\Pr\{E\}}{1 - \Pr\{E\}}$$

The odds ratio of two events, often denoted $\theta$, is the ratio of the odds.
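The adjusted interval for $p_1 - p_2$, the relative risk estimate, and the odds of a single event can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names `diff_ci`, `relative_risk`, and `odds` are hypothetical, and the z quantile is taken from the standard library rather than rounded to 1.96.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(y1, n1, y2, n2, level=0.95):
    """Adjusted CI for p1 - p2: add 1 success and 1 failure to each sample."""
    p1 = (y1 + 1) / (n1 + 2)
    p2 = (y2 + 1) / (n2 + 2)
    se = sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    diff = p1 - p2
    return diff - z * se, diff + z * se


def relative_risk(y1, n1, y2, n2):
    """Ratio of the two sample proportions (no adjustment)."""
    return (y1 / n1) / (y2 / n2)


def odds(p):
    """Odds of an event that has probability p."""
    return p / (1 - p)


# Bat bite data: 15 of 22 cows in estrous bitten, 6 of 328 not in estrous bitten.
low, high = diff_ci(15, 22, 6, 328)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
print(f"relative risk: {relative_risk(15, 22, 6, 328):.2f}")
print(f"odds of a bite, in estrous: {odds(15 / 22):.3f}")
```

For the bat data the interval should agree, to three decimals, with the .456 to .835 range quoted in the interpretation above. Passing a different `level` changes only the z quantile, which is the sense in which the same formula serves all confidence levels.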
So, the odds ratio for events with probabilities $p_1$ and $p_2$ is

$$\theta = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \frac{p_1(1 - p_2)}{(1 - p_1)p_2}$$

Comparing Relative Risk and Odds Ratios
Relative risk and odds ratios are not identical, but are similar to one another. The exact relationship is this:

$$\text{odds ratio} = \frac{p_1}{p_2} \times \frac{1 - p_2}{1 - p_1} = \text{relative risk} \times \frac{1 - p_2}{1 - p_1}$$

These will be very close when $p_1$ and $p_2$ are both small.

Example - Bat Bites
$\hat{p}_1$ = 15/22 = .682, $\hat{p}_2$ = 6/328 = .018
The estimated odds of being bitten are

$$\frac{\hat{p}_1}{1 - \hat{p}_1} = \frac{15}{7} \approx 2.143 \text{ among cows in estrous,}$$

$$\frac{\hat{p}_2}{1 - \hat{p}_2} = \frac{6}{322} \approx .0186 \text{ among cows not in estrous.}$$

The estimated odds ratio is

$$\hat{\theta} = \frac{15/7}{6/322} = 115$$

Thus, we estimate that the odds of being bitten are 115 times greater for cows in estrous versus cows not in estrous.

Confidence Interval for the Odds Ratio
The sampling distribution of $\hat{\theta}$ is not normal.
But the sampling distribution of $\log(\hat{\theta})$ is approximately normal.
So we first compute a confidence interval for $\log(\theta)$,
and then transform back (exponentiate) to get a confidence interval for $\theta$.

Confidence Interval for $\log(\theta)$

$$\log(\hat{\theta}) \pm z_{\alpha/2} \, SE_{\log(\hat{\theta})}$$

$$SE_{\log(\hat{\theta})} = \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}$$

The 2 × 2 table is given by

$n_{11}$   $n_{12}$
$n_{21}$   $n_{22}$

Steps for the Confidence Interval for $\theta$, odds ratio
1. Calculate $\log(\hat{\theta})$.
2. Construct a confidence interval for $\log(\theta)$ using the formula $\log(\hat{\theta}) \pm z_{\alpha/2} \, SE_{\log(\hat{\theta})}$.
3. Exponentiate the endpoints to get a confidence interval for $\theta$.

Comments
log means log base e.
Your textbook does not use ln.

Example - on the board
Find a 95% confidence interval for the odds ratio of being bitten by a vampire bat between cows in estrous and those not.

Example - Interpretation
Under the study conditions in Costa Rica, we are 95% confident that the odds that a cow in estrous is bitten by a vampire bat are between 34.392 and 384.536 times higher than for cows not in estrous.

Part II. Chi-square tests

Introduction to Chi-square Tests

The $\chi^2$ Test
The data we observe are the categories of the individuals, summarized by a count of individuals in each category (recall the case studies).
The $\chi^2$ test statistic is a measure of discrepancy between the observed category counts and what is expected if the null hypothesis is true.

$$X^2 = \sum_{i \in \text{categories}} \frac{(O_i - E_i)^2}{E_i}$$

where the sum goes over the categories, $O_i$ is the
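Returning to the odds ratio from Part I, the three steps for its confidence interval can be sketched in Python. This is an illustrative sketch rather than code from the course; the function name `odds_ratio_ci` is hypothetical, and it assumes the 2 × 2 table is laid out as in the bat example, with the event in the first row and the two groups in the columns.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(n11, n12, n21, n22, level=0.95):
    """Estimate the odds ratio and its CI from a 2 x 2 table of counts.

    Assumed layout: row 1 = event occurred, row 2 = event did not occur,
    column 1 = group 1, column 2 = group 2.
    """
    # Step 1: point estimate and its natural log.
    theta_hat = (n11 * n22) / (n12 * n21)
    log_theta = log(theta_hat)
    # Step 2: interval on the log scale.
    se_log = sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    # Step 3: exponentiate the endpoints to return to the odds-ratio scale.
    return theta_hat, exp(log_theta - z * se_log), exp(log_theta + z * se_log)


# Bat bite table: bitten 15 / 6, not bitten 7 / 322.
theta_hat, low, high = odds_ratio_ci(15, 6, 7, 322)
print(f"odds ratio: {theta_hat:.1f}, 95% CI: ({low:.1f}, {high:.1f})")
```

The endpoints should land very close to the 34.392 and 384.536 quoted in the interpretation; any difference in the last digits comes from using the exact normal quantile instead of 1.96. Note that the interval is not symmetric about 115, a consequence of building it on the log scale.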

