UI STAT 2010 - Statistical Methods and Computing


22S:105 Statistical Methods and Computing

Contingency Tables and the Chi-Square Test
Introduction to ANOVA

Lecture 21
Apr. 14, 2008

Kate Cowles
374 SH, 335-0727
kcowles@stat.uiowa.edu


The Chi-square test for differences among more than 2 proportions

We are interested in the independent-samples case.

Example: A study investigated the accuracy of death certificates by comparing the results of 575 autopsies to the causes of death listed on the certificates.

Two hospitals participated in the study:
- community hospital (labeled A)
- university hospital (labeled B)

Three possible cases:
- death certificate confirmed accurate
- death certificate contained inaccuracies but did not require correction of underlying cause of death
- death certificate incorrect and required recoding of underlying cause of death

Question of interest: Are there differences between the two hospitals with respect to practices in completing death certificates?

One way to address the question: Test the null hypothesis that the proportion of death certificates coming from Hospital A is the same within each category of death certificate status.


Another multiple comparisons problem

    H0: pc = pi = pr
    Ha: pc ≠ pi or pc ≠ pr or pi ≠ pr

Here pc, pi, and pr are the population proportions of certificates coming from Hospital A in the confirmed, inaccurate, and recoded categories.

We will first test whether there are any significant differences. Only if we reject H0 in the overall test will we do pairwise tests to find out which population proportions are different.


Results

                         Hospital A   Hospital B   Total
    Confirmed accurate       157          268       425
    Inacc., no recoding       18           44        62
    Incorrect, recoding       54           34        88
    Total                    229          346       575

The overall sample proportion of death certificates from Hospital A is 229/575 = 0.398.

If H0 is true, we would expect this same proportion of Hospital A certificates in all three categories.


Observed and expected counts

                              Observed                  Expected
                         Hospital A   Hospital B   Hospital A   Hospital B
    Confirmed accurate       157          268         169.3        255.7
    Inacc., no recoding       18           44          24.7         37.3
    Incorrect, recoding       54           34          35.0         53.0

The Chi-square statistic is X² = 21.62.

There are r = 3 rows and c = 2 columns, so the degrees of freedom is (r - 1)(c - 1) = 2 × 1 = 2.

According to Table E, the 0.05 cutoff under a Chi-square distribution with 2 d.f. is 5.99.

We can reject H0 because 21.62 > 5.99. The p-value is < 0.001.

We conclude that the proportions of death certificates from Hospital A are not the same for the three different categories of certificate status.


This Chi-square test in SAS

    options linesize = 72 ;

    data dthcert ;
    input hosp $ status $ count ;
    datalines ;
    A C 157
    A I 18
    A R 54
    B C 268
    B I 44
    B R 34
    ;

    proc freq data = dthcert ;
    tables status * hosp / expected ;
    weight count ;
    run ;

    proc freq data = dthcert ;
    tables status * hosp / chisq ;
    weight count ;
    run ;

Output of the first PROC FREQ (with expected counts):

    TABLE OF STATUS BY HOSP

    STATUS      HOSP

    Frequency
    Expected
    Percent
    Row Pct
    Col Pct         A          B       Total

    C             157        268        425
               169.26     255.74
                27.30      46.61      73.91
                36.94      63.06
                68.56      77.46

    I              18         44         62
               24.692     37.308
                 3.13       7.65      10.78
                29.03      70.97
                 7.86      12.72

    R              54         34         88
               35.047     52.953
                 9.39       5.91      15.30
                61.36      38.64
                23.58       9.83

    Total         229        346        575
                39.83      60.17     100.00

Output of the second PROC FREQ (with the Chi-square test):

    TABLE OF STATUS BY HOSP

    STATUS      HOSP

    Frequency
    Percent
    Row Pct
    Col Pct         A          B       Total

    C             157        268        425
                27.30      46.61      73.91
                36.94      63.06
                68.56      77.46

    I              18         44         62
                 3.13       7.65      10.78
                29.03      70.97
                 7.86      12.72

    R              54         34         88
                 9.39       5.91      15.30
                61.36      38.64
                23.58       9.83

    Total         229        346        575
                39.83      60.17     100.00

    STATISTICS FOR TABLE OF STATUS BY HOSP

    Statistic                      DF     Value     Prob
    Chi-Square                      2    21.523    0.001
    Likelihood Ratio Chi-Square     2    21.189    0.001
    Mantel-Haenszel Chi-Square      1    12.864    0.001
    Phi Coefficient                       0.193
    Contingency Coefficient               0.190
    Cramer's V                            0.193

    Sample Size = 575


The sample proportions are:

                         Hospital A   Hospital B   Total   Proportion from A
    Confirmed accurate       157          268       425         0.369
    Inacc., no recoding       18           44        62         0.290
    Incorrect, recoding       54           34        88         0.614
    Total                    229          346       575

More advanced methods provide tests and confidence intervals to formalize analysis of which population proportions are significantly different.


Comparing more than two population means

Example: Does the presence of pets or friends affect responses to stress? (Allen, Blascovich, Tomaka, and Kelsey, 1988, Journal of Personality and Social Psychology)

Subjects: 45 women who described themselves as dog lovers, randomly assigned to three groups to do a stressful task:
1. alone
2. with a good friend present
3. with their dog present

Subjects' mean heart rate during the task was one measure of the effect of stress.

Goal: to compare population means under three different treatments. This is a three independent sample problem.

Call the population mean heart rates μ1 for when pets are present, μ2 for when friends are present, and μ3 for when women perform the task alone. Then

    H0: μ1 = μ2 = μ3
    Ha: μ1 ≠ μ2 or μ1 ≠ μ3 or μ2 ≠ μ3

Ha is neither one-sided nor two-sided.


SAS descriptive statistics

    Analysis Variable : BEATS

    GROUP    N          Mean       Std Dev       Minimum        Maximum
    C       15    82.5240667     9.2415747    62.6460000     99.0460000
    F       15    91.3251333     8.3411341    76.9080000    102.1540000
    P       15    73.4830667     9.9698202    58.6920000     97.5380000

Here C = alone, F = friend present, P = pet present.

To infer about the three population means, we might use the two independent sample t test 3 times:
- Test H0: μ1 = μ2 to see if mean heart rate when pet is present differs from mean when friend is present.
- Test H0: μ1 = μ3 to see if mean heart rate when pet is present differs from mean when alone.
- Test H0: μ2 = μ3 to see if mean heart rate when friend is present differs from mean when alone.


Problem with this approach

- 3 p-values for 3 different tests don't tell us how likely it is that three sample means are spread apart as far as these are.
- It might be that x̄1 = 73.48 and x̄2 = 91.33 are significantly different if we look at just 2 groups, but not significantly different if we know they are the smallest and largest means in 3 groups.
- As more and more groups are considered, we expect the gap between the smallest and largest sample mean to get larger. Imagine comparing the heights of the shortest and tallest person in larger and larger groups of people.
- The probability of a Type I error for the whole set of t tests will be much bigger than the α level set for each one.


Multiple comparisons procedures in statistics

Issue: how to do many comparisons at once with some overall measure of confidence in all our conclusions.

Two steps:
- overall test of whether there is good evidence of any differences among the parameters we wish to compare
- follow-up analysis to decide which of the parameters differ …
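
Two calculations from this lecture can be cross-checked outside SAS. The following is a minimal sketch in Python (chosen here only for portability; the lecture itself uses SAS), with hypothetical function names that do not come from the lecture. It recomputes the expected counts and the Chi-square statistic for the death-certificate table, and it illustrates how the chance of at least one Type I error grows across three tests. The second calculation assumes the tests are independent, which is a simplification: pairwise t tests that share groups are not independent, so the figure is only a rough guide.

```python
import math

def expected_counts(observed):
    """Expected counts under H0: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Pearson X^2: sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

def familywise_error(alpha, k):
    """P(at least one Type I error) in k tests, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** k

# Rows: confirmed accurate, inaccurate (no recoding), incorrect (recoding)
# Columns: Hospital A, Hospital B
observed = [[157, 268], [18, 44], [54, 34]]

x2 = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
# For df = 2 only, the Chi-square upper tail has the closed form exp(-x/2).
p_value = math.exp(-x2 / 2)

print(f"X^2 = {x2:.3f}, df = {df}, p = {p_value:.2g}")
print(f"familywise error, 3 tests at 0.05: {familywise_error(0.05, 3):.3f}")
```

With unrounded expected counts the statistic comes out at about 21.523, in agreement with the SAS output; the slightly larger hand-calculated value of 21.62 presumably reflects expected counts rounded to one decimal place. Under the independence assumption, three tests at the 0.05 level give roughly a 0.143 chance of at least one false rejection.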

