
Lecture 23
Brian Caffo
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
Johns Hopkins University
November 15, 2007

Table of contents
1 Table of contents
2 Outline
3 Simpson's paradox
4 Berkeley data
5 Confounding
6 Weighting
7 Mantel/Haenszel estimator

Outline
1 Simpson's paradox
2 Weighting
3 CMH estimate
4 CMH test

Simpson's (perceived) paradox

                        Death penalty
Victim   Defendant    yes    no   % yes
White    White         53   414    11.3
         Black         11    37    22.9
Black    White          0    16     0.0
         Black          4   139     2.8
Total    White         53   430    11.0
         Black         15   176     7.9
White    Total         64   451    12.4
Black    Total          4   155     2.5

(1) From Agresti, Categorical Data Analysis, second edition

Discussion
• Marginally, white defendants received the death penalty a greater percentage of the time than black defendants (11.0% versus 7.9%)
• Within each victim race (white victims and black victims), black defendants received the death penalty a greater percentage of the time than white defendants
• Simpson's paradox refers to the fact that marginal and conditional associations can be opposing
• The death penalty was enacted more often for the murder of a white victim than a black victim. Whites tend to kill whites, hence the larger marginal association.

Example
• Wikipedia's entry on Simpson's paradox gives an example comparing two players' batting averages

            First Half     Second Half     Whole Season
Player 1    4/10 (.40)     25/100 (.25)    29/110 (.26)
Player 2    35/100 (.35)   2/10 (.20)      37/110 (.34)

• Player 1 has a better batting average than Player 2 in both the first and second half of the season, yet has a worse batting average overall
• Consider the number of at-bats

Berkeley admissions data
• The Berkeley admissions data is a well-known data set regarding Simpson's paradox

    ?UCBAdmissions
    data(UCBAdmissions)
    # Collapse over department: Admit by Gender counts
    apply(UCBAdmissions, c(1, 2), sum)
              Gender
    Admit      Male Female
      Admitted 1198    557
      Rejected 1493   1278
               .445   .304  <- Acceptance rate

Acceptance rate by department

    # For each department, x is the 2 x 2 Admit by Gender table, stored column-wise:
    # x[1:2] are the male counts and x[3:4] the female counts (admitted, rejected)
    > apply(UCBAdmissions, 3,
            function(x) c(x[1] / sum(x[1 : 2]),
                          x[3] / sum(x[3 : 4])))
    Dept     M     F
    A     0.62  0.82
    B     0.63  0.68
    C     0.37  0.34
    D     0.33  0.35
    E     0.28  0.24
    F     0.06  0.07
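
The batting-average example above can be checked numerically. The following is a minimal sketch in R that uses only the hits and at-bats from that table; the object names (`hits`, `at_bats`, `by_half`, `overall`) are illustrative and are not from the lecture.

```r
# Hits and at-bats for each player in each half of the season
# (numbers taken from the batting-average table; object names are illustrative)
hits <- matrix(c(4, 25,
                 35, 2),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("Player 1", "Player 2"),
                               c("First half", "Second half")))
at_bats <- matrix(c(10, 100,
                    100, 10),
                  nrow = 2, byrow = TRUE,
                  dimnames = dimnames(hits))

# Conditional (within-half) batting averages
by_half <- hits / at_bats

# Marginal (whole-season) batting averages: pool hits and at-bats over halves
overall <- rowSums(hits) / rowSums(at_bats)

print(round(by_half, 2))
print(round(overall, 2))
```

Each whole-season average is a weighted average of that player's two half-season averages, with weights proportional to at-bats. Player 1's weight sits mostly on the weaker second half and Player 2's on the stronger first half, which is how the ordering can reverse.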
Why? The application rates by department

    # Collapse over admission status: number of applicants by Gender and Dept
    > apply(UCBAdmissions, c(2, 3), sum)
            Dept
    Gender     A   B   C   D   E   F
      Male   825 560 325 417 191 373
      Female 108  25 593 375 393 341

Discussion
• Mathematically, Simpson's paradox is not paradoxical; all three of the following can hold at once:
    $a/b < c/d$
    $e/f < g/h$
    $(a + e)/(b + f) > (c + g)/(d + h)$
• More statistically, it says that the apparent relationship between two variables can change in the light or absence of a third

Confounding
• Variables that are correlated with both the explanatory and response variables can distort the estimated effect
• Victim's race was correlated with defendant's race and death penalty
• One strategy to adjust for confounding variables is to stratify by the confounder and then combine the strata-specific estimates
• Requires appropriately weighting the strata-specific estimates
• Unnecessary stratification reduces precision

Aside: weighting
• Suppose that you have two unbiased scales, one with variance 1 lb² and one with variance 9 lb²
• Confronted with weights from both scales, would you give both measurements equal credence?
• Suppose that $X_1 \sim N(\mu, \sigma_1^2)$ and $X_2 \sim N(\mu, \sigma_2^2)$ where $\sigma_1$ and $\sigma_2$ are both known
• log-likelihood for $\mu$ (up to an additive constant)
    $-(x_1 - \mu)^2 / 2\sigma_1^2 - (x_2 - \mu)^2 / 2\sigma_2^2$

Continued
• Derivative wrt $\mu$ set equal to 0
    $(x_1 - \mu)/\sigma_1^2 + (x_2 - \mu)/\sigma_2^2 = 0$
• Answer
    $\mu = \frac{x_1 r_1 + x_2 r_2}{r_1 + r_2} = x_1 p + x_2 (1 - p)$
  where $r_i = 1/\sigma_i^2$ and $p = r_1/(r_1 + r_2)$
• Note, if $X_1$ has very low variance, its term dominates the estimate of $\mu$
• General principle: instead of averaging over several unbiased estimates, take an average weighted according to inverse variances
• For our example $\sigma_1^2 = 1$, $\sigma_2^2 = 9$ so $p = .9$
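
The inverse-variance weighting in the aside can be sketched in a few lines of R. This is an illustration under stated assumptions: the function name `ivw_mean` is hypothetical, and the two scale readings (150 and 156) are made-up values chosen only to show the arithmetic; the variances 1 and 9 are the ones from the example.

```r
# Inverse-variance weighted average of unbiased measurements.
# `ivw_mean` is a hypothetical helper name, not from the lecture.
ivw_mean <- function(x, vars) {
  r <- 1 / vars            # r_i = 1 / sigma_i^2
  sum(r * x) / sum(r)      # (x1 r1 + x2 r2) / (r1 + r2)
}

vars <- c(1, 9)            # variances of the two scales, as in the example
r <- 1 / vars
p <- r[1] / sum(r)         # weight on the first (more precise) scale

x <- c(150, 156)           # made-up readings, for illustration only
estimate <- ivw_mean(x, vars)

print(p)
print(estimate)
print(x[1] * p + x[2] * (1 - p))   # same value via the x1 p + x2 (1 - p) form
```

With these variances the first scale receives weight $p = .9$, so the combined estimate sits much closer to the first reading than the simple average would.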
Mantel/Haenszel estimator
• Let $n_{ijk}$ be entry $i, j$ of table $k$
• The $k$th sample odds ratio is
    $\hat\theta_k = \frac{n_{11k} n_{22k}}{n_{12k} n_{21k}}$
• The Mantel Haenszel estimator is of the form
    $\hat\theta = \frac{\sum_k r_k \hat\theta_k}{\sum_k r_k}$
• The weights are
    $r_k = \frac{n_{12k} n_{21k}}{n_{++k}}$
• The estimator simplifies to
    $\hat\theta_{MH} = \frac{\sum_k n_{11k} n_{22k} / n_{++k}}{\sum_k n_{12k} n_{21k} / n_{++k}}$
• SE of the log is given in Agresti (page 235) or Rosner (page 656)

                                  Center
         1        2        3        4        5        6        7        8
       S   F    S   F    S   F    S   F    S   F    S   F    S   F    S   F
T     11  25   16   4   14   5    2  14    6  11    1  10    1   4    4   2
C     10  27   22  10    7  12    1  16    0  12    0  10    1   8    6   1
n       73       52       38       33       29       21       14       13

S - Success, F - failure
T - Active Drug, C - placebo (2)

$\hat\theta_{MH} = \frac{(11 \times 27)/73 + (16 \times 10)/52 + \ldots + (4 \times 1)/13}{(10 \times 25)/73 + (4 \times 22)/52 + \ldots + (6 \times 2)/13} = 2.13$

Also $\log \hat\theta_{MH} = .758$ and $\widehat{SE}_{\log \hat\theta_{MH}} = .303$

(2) Data from Agresti, Categorical Data Analysis, second edition

CMH test
• $H_0: \theta_1 = \ldots = \theta_k = 1$ versus $H_a: \theta_1 = \ldots = \theta_k \neq 1$
• The CMH test applies to other alternatives, but is most powerful for the $H_a$ given above
• Same as
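
The Mantel/Haenszel estimate for the eight-center table above can be reproduced directly from the simplified formula. The following R sketch enters the counts from that table by hand; the object names (`counts`, `tab`, `n`, `theta_mh`) are illustrative and are not from the lecture.

```r
# Counts for the 8 centers, entered column-wise within each 2 x 2 table:
# (T success, C success, T failure, C failure) for centers 1 through 8
counts <- c(11, 10, 25, 27,
            16, 22,  4, 10,
            14,  7,  5, 12,
             2,  1, 14, 16,
             6,  0, 11, 12,
             1,  0, 10, 10,
             1,  1,  4,  8,
             4,  6,  2,  1)
tab <- array(counts, dim = c(2, 2, 8),
             dimnames = list(Treatment = c("Drug", "Placebo"),
                             Response  = c("Success", "Failure"),
                             Center    = as.character(1:8)))

# Stratum totals n_{++k}
n <- apply(tab, 3, sum)

# Simplified Mantel/Haenszel form:
# sum_k n11k n22k / n++k  divided by  sum_k n12k n21k / n++k
theta_mh <- sum(tab[1, 1, ] * tab[2, 2, ] / n) /
            sum(tab[1, 2, ] * tab[2, 1, ] / n)

print(n)
print(round(theta_mh, 2))
print(round(log(theta_mh), 3))
```

Centers 5 and 6 have a zero cell, so their stratum-specific odds ratios are undefined, yet they still contribute to the numerator sum. This is one practical reason to use the simplified form rather than averaging the $\hat\theta_k$ directly.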



Bloomberg School BIO 651 - lecture 23
