9/16/09

Not in FPP

Exploratory data analysis with two qualitative variables

Main tools
  Contingency tables
  Conditional, marginal, and joint frequencies

Motivating example: Surviving the Titanic

Was there class discrimination in survival of the wreck of the Titanic?

“It has been suggested before the Enquiry that the third-class passengers had been unfairly treated, that their access to the boat deck had been impeded; and that when they reached the deck the first and second-class passengers were given precedence in getting places in the boats.”
Lord Mersey, 1912

Titanic: Class by survival

Titanic: Marginal frequencies
  % Dead = 1513/2224 = 0.68
  % Alive = 711/2224 = 0.32
  % in first class = 325/2224 = 0.15
  % in second class = 285/2224 = 0.13
  % in third class = 706/2224 = 0.32
  % crew = 908/2224 = 0.41

Titanic: Conditional frequencies
  % (Alive | 1st) = 203/325 = 0.625
  % (Alive | 2nd) = 118/285 = 0.414
  % (Alive | 3rd) = 178/706 = 0.252
  % (Alive | Crew) = 212/908 = 0.233

Based on these frequencies, does there appear to be class discrimination?

Titanic: Class by person type

              1st Class  2nd Class  3rd Class  Crew  Total
  Children            6         24         79     0    109
  Women             144         93        165    23    425
  Men               175        168        462   885   1690
  Total             325        285        706   908   2224

Titanic: Percentage of men in each class
  % (Man | 1st) = 175/325 = 0.54
  % (Man | 2nd) = 168/285 = 0.59
  % (Man | 3rd) = 462/706 = 0.65
  % (Man | Crew) = 885/908 = 0.97

There are larger percentages of men in third class and crew.

Surviving the Titanic
  A reason for class differences in survival:
    Larger percentages of men died.
    3rd class consisted of mostly men.
    Hence, a larger percentage of 3rd class passengers died.
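The marginal and conditional frequencies on the Titanic slides are all ratios of counts, so they can be recomputed directly. The sketch below is only an illustration: the counts are the ones quoted on the slides, while the dictionary layout and the helper names `marginal` and `conditional` are assumptions introduced here, not part of the original deck.

```python
# Counts quoted on the slides: survivors, men, and totals per class.
alive = {"1st": 203, "2nd": 118, "3rd": 178, "Crew": 212}
men = {"1st": 175, "2nd": 168, "3rd": 462, "Crew": 885}
total = {"1st": 325, "2nd": 285, "3rd": 706, "Crew": 908}


def marginal(counts):
    """Share of each category in the whole sample (hypothetical helper)."""
    n = sum(counts.values())
    return {k: v / n for k, v in counts.items()}


def conditional(part, whole):
    """Frequency of an outcome within each group: part[k] / whole[k]."""
    return {k: part[k] / whole[k] for k in whole}


n = sum(total.values())                      # everyone on board
p_alive = sum(alive.values()) / n            # marginal frequency of survival
class_share = marginal(total)                # marginal frequency of each class
alive_given_class = conditional(alive, total)
man_given_class = conditional(men, total)

for k in total:
    print(f"{k:>4}: share={class_share[k]:.3f}  "
          f"alive|class={alive_given_class[k]:.3f}  "
          f"man|class={man_given_class[k]:.3f}")
print(f"overall alive = {p_alive:.3f}")
```

Printing the survival rate next to the share of men in each class makes it easy to see the two patterns the slides compare.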
Be alert for effects of other variables when considering relationships.

Relative risk and odds ratios

Motivating example
  Physicians’ health study (1989): randomized experiment with 22071 male physicians at least 40 years old
  About half the subjects were assigned to take aspirin every other day
  The other half were assigned to take a placebo, a dummy pill that looked and tasted like aspirin

Physicians’ health study
Here is the number of people in each cell:

             Heart attack  No heart attack
  Aspirin             104            10933
  Placebo             189            10845

Relative risk

        y1   y2
  x1     a    b
  x2     c    d

  Risk of y1 for level x1 = a/(a+b)
  Risk of y1 for level x2 = c/(c+d)

\[ \text{Relative risk} = \frac{a/(a+b)}{c/(c+d)} \]

Relative risk for physicians’ health study
Relative risk of a heart attack when taking aspirin versus when taking a placebo equals

\[ RR = \frac{104/(104+10933)}{189/(189+10845)} = 0.55 \]

People who take aspirin are 0.55 times as likely to have a heart attack as people who take a placebo.

Odds ratios

        y1   y2
  x1     a    b
  x2     c    d

  Odds of y1 for level x1 = a/b
  Odds of y1 for level x2 = c/d

\[ \text{Odds ratio} = \frac{a/b}{c/d} \]

Odds ratios for physicians’ health study
Relative risk of a heart attack when taking aspirin versus taking a placebo is

\[ RR = \frac{104/(104+10933)}{189/(189+10845)} = 0.55 \]

Odds of having a heart attack when taking aspirin over odds of a heart attack when taking a placebo (odds ratio):

\[ OR = \frac{104/10933}{189/10845} = 0.546 \]

Interpreting odds ratios and relative risks
  When the variables X and Y are independent:
    odds ratio = 1
    relative risk = 1
  When subjects with level x1 are more likely to have y1 than subjects with level x2:
    odds ratio > 1
    relative risk > 1
  When subjects with level x1 are less likely to have y1 than subjects with level x2:
    odds ratio < 1
    relative risk < 1

Odds in the news
William Safire, February 7, 2002
Odds against being Democratic candidate for presidency in 2004:
  Gore 2:1
  Lieberman 5:1
  Daschle 4:1
  Gephardt 15:1
  Biden 5:1
  Edwards 9:1
  Kerry 4:1
  Leahy 6:1
  Dodd 4:1
  Feingold 8:1

Relative risk vs absolute risk
  % smokers who get lung cancer: 8% (conservative guess here)
  Relative risk of lung cancer for smokers: 800%
  Getting lung cancer is not commonplace, even for smokers. But smokers’ chances of getting lung cancer are much, much higher than non-smokers’ chances.

Simpson’s paradox
  When a third variable seemingly reverses the association between two other variables

Hot hand
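The relative risk and odds ratio for the physicians’ health study can be recomputed from the four cell counts of the 2x2 table. This is a minimal sketch under the slides' own definitions (risk = a/(a+b), odds = a/b); the function names `relative_risk` and `odds_ratio` are hypothetical helpers introduced here for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk of y1 at level x1 divided by risk of y1 at level x2."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of y1 at level x1 divided by odds of y1 at level x2."""
    return (a / b) / (c / d)


# Cell counts from the physicians' health study table:
# rows are aspirin / placebo, columns are heart attack / no heart attack.
a, b = 104, 10933   # aspirin
c, d = 189, 10845   # placebo

rr = relative_risk(a, b, c, d)
odds = odds_ratio(a, b, c, d)

print(f"relative risk = {rr:.3f}")
print(f"odds ratio    = {odds:.3f}")
```

Because heart attacks are rare in both groups, a+b is close to b and c+d is close to d, which is why the two measures nearly coincide here.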