CMU STA 36402-36608 - Handout


4/6/2010    36-402/608 ADA-II    H. Seltman

Breakout #20 Comments

These data come from The Sleuth, chapters 18 and 19.

# Randomized trial of vitamin C for preventing colds
vit = matrix(c(335,302,76,105), nrow=2,
             dimnames=list(c("Placebo","Vitamin C"), c("Cold", "No Cold")))
source("http://www.stat.cmu.edu/~hseltman/files/cta.R")
cta(vit)
# $table
#           Cold No Cold   n      phat         SE      CIlo      CIhi
# Placebo    335      76 411 0.8150852 0.01914990 0.7775514 0.8526190
# Vitamin C  302     105 407 0.7420147 0.02168735 0.6995075 0.7845219
# Total      637     181 818 0.7787286 0.02902781 0.7218341 0.8356231
#
# $binDiff
#        diff      SEdiff           Z     p.value        CIlo        CIhi
# -0.07307042  0.02902781 -2.51725577  0.01155033 -0.01636372 -0.12977711
#
# $OR
#       OR     ORlo     ORih    p.value
# 1.532546 1.097770 2.139517 0.01214262
#
# $miscTests
#    p.chisq   p.Fisher
# 0.01497328 0.01444212

Question 1: Explain all of the numbers, including null hypotheses for the tests. Also, when is the Total CI useful?

The first two lines of the “table” section give estimates of p and the 95% CI for those estimates separately (not assuming equality). The Total line gives the pooled estimates, which should be used if and only if we retain the null hypothesis of equal probabilities.

The “binDiff” section tests the difference of (independent) binomial proportions (H0: the two probabilities are equal) and gives the CI for the difference. We are 95% confident that the probability of a cold is 1.6 to 13.0 percentage points lower for vitamin C than for Placebo. The best estimate of 7.3 percentage points fewer colds seems like a fairly small effect.

The OR of 1.53 represents the effect in a different way: the ratio of cold years to non-cold years you might experience with controls is 4.4:1 and with vitamin C is 2.9:1, and the ratio of these odds is 1.5.
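
The odds and odds ratio quoted above can be recomputed by hand from the 2x2 table. The following is a minimal sketch in base R. The Wald interval on the log-odds-ratio scale is an assumption about the method (it was not read from cta.R), and the names `odds`, `or`, `se_log_or` and `ci` are illustrative only.

```r
# Vitamin C trial counts, as given in the handout
vit <- matrix(c(335, 302, 76, 105), nrow = 2,
              dimnames = list(c("Placebo", "Vitamin C"), c("Cold", "No Cold")))

# Odds of a cold within each treatment group (cold : no cold)
odds <- vit[, "Cold"] / vit[, "No Cold"]

# Odds ratio, placebo relative to vitamin C
or <- unname(odds["Placebo"] / odds["Vitamin C"])

# Wald interval on the log scale (assumed method, not taken from cta.R):
# SE(log OR) = sqrt of the sum of reciprocal cell counts
se_log_or <- sqrt(sum(1 / vit))
ci <- exp(log(or) + c(-1, 1) * qnorm(0.975) * se_log_or)

print(round(odds, 2))
print(round(c(OR = or, lo = ci[1], hi = ci[2]), 3))
```

Under these assumptions the odds come out near 4.4 and 2.9, and the interval should land close to the ORlo and ORih values printed by cta(vit).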
The fact that the estimated odds of getting a cold are 1.5 times as large for control as for vitamin C is often loosely and inappropriately expressed as “you are 1.5 times as likely to get a cold when not taking vitamin C”.

The Z-test for OR=1 (p=0.012), the chi-square test for independence (p=0.015) and the Fisher test (p=0.014) give similar results. They can disagree moderately for small samples, and it is NOT clear that any one is superior (unless the sampling scheme really does fix BOTH margins, in which case Fisher is better for small sample sizes).

# Retrospective Study of Lung Cancer and Smoking
# Subjects chosen to study: 86 lung cancer patients and 86 controls.
ca = matrix(c(83,3,72,14), nrow=2,
            dimnames=list(c("Smoker","Nonsmoker"), c("Cancer", "Control")))
cta(ca)
#           Cancer Control   n      phat         SE         CIlo      CIhi
# Smoker        83      72 155 0.5354839 0.04005971  0.456966849 0.6140009
# Nonsmoker      3      14  17 0.1764706 0.09245944 -0.004749916 0.3576911
# Total         86      86 172 0.5000000 0.12774500  0.249619796 0.7503802
#          diff        SEdiff             Z       p.value          CIlo
# -0.3590132827  0.1277450022 -2.8103900477  0.0003667988 -0.1615144375
#          CIhi
# -0.5565121280
#       OR     ORlo      ORih    p.value
# 5.379630 1.486341 19.470912 0.01035070
cta(t(ca))
#         Smoker Nonsmoker   n      phat         SE      CIlo      CIhi
# Cancer      83         3  86 0.9651163 0.01978573 0.9263363 1.0038963
# Control     72        14  86 0.8372093 0.03980912 0.7591834 0.9152352
# Total      155        17 172 0.9011628 0.04551218 0.8119589 0.9903667
#         diff       SEdiff            Z      p.value         CIlo         CIhi
# -0.127906977  0.045512180 -2.810390048  0.004011857 -0.040775308 -0.215038646
#       OR     ORlo      ORih    p.value
# 5.379630 1.486341 19.470912 0.01035070

Question 2: What do you conclude about smoking and lung cancer? What do you conclude about selection of outcome vs. explanatory variable in this setting?

Smoking is associated with lung cancer, with an estimated odds ratio of getting cancer of 5.4 (95% CI = [1.5, 19.5]) comparing smokers to non-smokers. Causation cannot be established in this type of study.
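
Both cta() outputs above report the same OR even though the rows and columns are swapped. A short sketch of why: the cross-product ratio is unchanged by transposing the table. The helper `odds_ratio` below is hypothetical (it is not part of cta.R), and its Wald interval on the log scale is an assumed method.

```r
# Lung cancer case-control counts, as given in the handout
ca <- matrix(c(83, 3, 72, 14), nrow = 2,
             dimnames = list(c("Smoker", "Nonsmoker"), c("Cancer", "Control")))

# Hypothetical helper (not from cta.R): cross-product odds ratio
# with a Wald interval computed on the log scale
odds_ratio <- function(tab, level = 0.95) {
  est <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
  se  <- sqrt(sum(1 / tab))            # SE of log(OR)
  z   <- qnorm(1 - (1 - level) / 2)
  c(OR = est, lo = exp(log(est) - z * se), hi = exp(log(est) + z * se))
}

or_rows <- odds_ratio(ca)      # smoking status as the row variable
or_cols <- odds_ratio(t(ca))   # cancer status as the row variable

print(round(rbind(or_rows, or_cols), 2))
```

The two rows of the printed matrix should be identical, because transposing only swaps the two off-diagonal cells in the denominator.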
The p-value for H0: OR = 1 is 0.010.

The OR is the same regardless of which variable we treat as explanatory and which as outcome. The row proportions differ between the two orientations, and they are not used in the analysis of retrospective data.

# Same data, but with the Control column doubled
cta(cbind(Cancer=ca[,1], Control=2*ca[,2]))
#           Cancer Control   n      phat         SE         CIlo      CIhi
# Smoker        83     144 227 0.3656388 0.03196550  0.302986386 0.4282911
# Nonsmoker      3      28  31 0.0967742 0.05310032 -0.007302425 0.2008508
# Total         86     172 258 0.3333333 0.09026301  0.156417830 0.5102488
#         diff       SEdiff            Z      p.value         CIlo         CIhi
# -2.68865e-01  9.02630e-02 -2.97868e+00  1.43804e-05 -1.47385e-01 -3.90344e-01
#       OR     ORlo      ORih     p.value
# 5.379630 1.586736 18.238959 0.006910168

Question 3: What are the observed pitfalls of retrospective research?

Just by changing the completely arbitrary choice of how many people to study in each group, the estimated “cancer rates difference” changes from 36 to 27 percentage points lower. This estimate is totally dependent on an arbitrary study design choice (in retrospective studies), so a difference in rates cannot be estimated with this design. Only the OR is meaningful.

This study (McCleskey vs. Zant) compares death penalty rates for black defendants in Georgia in the 1980s for 6 different (ordered) aggravation severity levels.
The goal is to test whether the death penalty is applied differently depending on the race of the person killed.

dp = array(c(2,1,60,181, 2,1,15,21, 6,2,7,9, 9,2,3,4, 9,4,0,3, 17,4,0,0),
           dim=c(2,2,6),
           dimnames=list(victim=c("White","Black"), DeathPen=c("Yes","No"), aggravation=1:6))
dp
# , , aggravation = 1          , , aggravation = 2
#        DeathPen                     DeathPen
# victim  Yes  No              victim  Yes  No
#   White   2  60                White   2  15
#   Black   1 181                Black   1  21
# , , aggravation = 3          , , aggravation = 4
#        DeathPen                     DeathPen
# victim  Yes  No              victim  Yes  No
#   White   6   7                White   9   3
#   Black   2   9                Black   2   4
# , , aggravation = 5          , , aggravation = 6
#        DeathPen                     DeathPen
# victim  Yes  No              victim  Yes  No
#   White   9   0                White  17   0
#   Black   4   3                Black   4   0

# Original data (collapsed over aggravation rather than incorporating it):
cta(cbind(Yes=c(sum(dp[1,1,]),sum(dp[2,1,])),
          No=c(sum(dp[1,2,]),sum(dp[2,2,]))))
#        Yes  No   n       phat         SE       CIlo       CIhi
# Group1  45  85 130 0.34615385 0.04172542 0.26437203 0.42793566
# Group2  14 218 232 0.06034483 0.01563365 0.02970288 0.09098677
# Total   59 303 362 0.16298343 0.04046480 0.08367242 0.24229443
#         diff       SEdiff            Z      p.value         CIlo         CIhi
# -2.85809e-01  4.04648e-02 -7.06315e+00  1.41467e-10 -1.98475e-01 -3.73143e-01
#           OR         ORlo         ORih      p.value
# 8.243697e+00 4.303302e+00 1.579219e+01 2.015553e-10
#      p.chisq     p.Fisher
# 4.683839e-12 5.090836e-12

Question 4: Ignoring aggravation level, what is the conclusion? How might this be misleading?

With a tiny p-value (<1e-11), we reject the null hypothesis that getting the death penalty is independent of the victim’s race (for black defendants in Georgia in the 1980s). If whites are more often killed under aggravated circumstances (e.g., in the commission of a robbery), then this aggravation could be confounded with victim’s race, and could
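
One way to look at the confounding concern raised in Question 4 is to compare the collapsed table with the per-level breakdown. The sketch below uses only base R and the counts from the handout. The names `collapsed`, `rate_by_level`, `cases` and `level_share` are illustrative, and this is an exploratory summary under those assumptions, not the handout's own analysis.

```r
# Death penalty counts by victim race, verdict, and aggravation level (from the handout)
dp <- array(c(2,1,60,181, 2,1,15,21, 6,2,7,9, 9,2,3,4, 9,4,0,3, 17,4,0,0),
            dim = c(2, 2, 6),
            dimnames = list(victim = c("White", "Black"),
                            DeathPen = c("Yes", "No"),
                            aggravation = 1:6))

# Collapse over aggravation by summing across the third dimension
collapsed <- apply(dp, c(1, 2), sum)

# Proportion sentenced to death, by victim race, within each aggravation level
rate_by_level <- dp[, "Yes", ] / (dp[, "Yes", ] + dp[, "No", ])

# How each victim-race group's cases are spread across aggravation levels
cases <- apply(dp, c(1, 3), sum)
level_share <- prop.table(cases, margin = 1)

print(collapsed)
print(round(rate_by_level, 2))
print(round(level_share, 2))
```

The collapsed table reproduces the 45/85 and 14/218 counts passed to cta() above. If the rows of `level_share` differ markedly, aggravation level is unevenly distributed across victim race, which is the condition under which the collapsed comparison can mislead.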

