
252chisql 10/04/07 (Open this document in Outline view)

E. CHI-SQUARED AND RELATED TESTS

These tests are generalizations of the one-sample and two-sample tests of proportions. A test of Goodness of Fit is necessary when a single sample is to be divided into more than two categories. A Test of Homogeneity is needed when one wants to compare more than one sample. A test of Independence is used to see if two variables or categorizations are related, but is formally identical to a test of homogeneity.

1. Tests of Homogeneity and Independence

Two possible null hypotheses apply here. The observed data is indicated by O, the expected data by E.

$H_0$: Cities are Homogeneous by income groups

O               City 1   City 2   City 3   City 4   Total   p_r
Upper Income      10       15       15       10       50    50/150 = 1/3
Middle Income      5       10       15       10       40    40/150 = 4/15
Lower Income      15       15       10       20       60    60/150 = 2/5
Sample Size       30       40       40       40      150    1
p_c              1/5      4/15     4/15     4/15       1

$H_0$: Sick Days are Independent of Age

O (Days)           0        1        2       3+     Total   p_r
Age 15-25         10       15       15       10       50    50/150 = 1/3
Age 26-49          5       10       15       10       40    40/150 = 4/15
Age 50 up         15       15       10       20       60    60/150 = 2/5
Total             30       40       40       40      150    1
p_c              1/5      4/15     4/15     4/15       1

The numbers are obviously identical in these two cases. In each case the expected values are computed the same way. There are $r$ rows, $c$ columns and $rc$ cells. Each cell gets $E = p_c \, p_r \, n = p_r \times (\text{column total})$. For example, for the upper left corner the expected value is $\frac{1}{5} \cdot \frac{1}{3} \cdot 150 = \frac{1}{3}(30) = 10$.

E               Column 1   Column 2   Column 3   Column 4   Total   p_r
Row 1              10       13 1/3     13 1/3     13 1/3      50    1/3
Row 2               8       10 2/3     10 2/3     10 2/3      40    4/15
Row 3              12       16         16         16          60    2/5
Total              30       40         40         40         150    1
p_c               1/5      4/15       4/15       4/15          1

The formula for the chi-squared statistic is

$$\chi^2 = \sum \frac{(O - E)^2}{E} \qquad \text{or} \qquad \chi^2 = \sum \frac{O^2}{E} - n .$$

The first of these two formulas is shown below. For an explanation of the equivalence of these two formulas, the reason why the degrees of freedom are as given below, and to relate the chi-squared test to a z test of proportions, see 252chisqnote.

      E       O      E - O    (E - O)^2   (E - O)^2 / E
  10.0000    10     0.0000      0.0000       0.00000
   8.0000     5     3.0000      9.0000       1.12500
  12.0000    15    -3.0000      9.0000       0.75000
  13.3333    15    -1.6667      2.7778       0.20833
  10.6667    10     0.6667      0.4445       0.04167
  16.0000    15     1.0000      1.0000       0.06250
  13.3333    15    -1.6667      2.7779       0.20834
  10.6667    15    -4.3333     18.7775       1.76038
  16.0000    10     6.0000     36.0000       2.25000
  13.3333    10     3.3333     11.1109       0.83332
  10.6667    10     0.6667      0.4445       0.04167
  16.0000    20    -4.0000     16.0000       1.00000
 150.0000   150     0.0000                   8.28121

The cells are listed column by column, and since $(E - O)^2 = (O - E)^2$ the last column is the summand of the first formula.

The degrees of freedom for this application are $(r - 1)(c - 1) = (3 - 1)(4 - 1) = (2)(3) = 6$. The most common test is a one-tailed test, on the grounds that the larger the discrepancy between O and E, the larger $\sum \frac{(O - E)^2}{E}$ will be. If our significance level is 5%, compare $\sum \frac{(O - E)^2}{E}$ to $\chi^{2\,(6)}_{.05} = 12.5916$. Since our value of this sum, 8.28121, is less than the table chi-squared, do not reject the null hypothesis.

Note: Rule of thumb for E. All values of E should be above 5, and we generally combine cells to make this so. However, a value as low as 2 is acceptable in E if (i) our computed $\chi^2$ turns out to be less than the table value $\chi^2_\alpha$, or (ii) the particular value of E makes a very small contribution to $\sum \frac{(O - E)^2}{E}$ relative to the value of the total.

Note: Marascuilo Procedure. The Marascuilo procedure says that, for 2 by $c$ tests, if (i) equality is rejected and (ii) $|p_a - p_b| > \sqrt{\chi^2} \, s_{\Delta p}$, where $a$ and $b$ represent 2 groups, the chi-squared has $c - 1$ degrees of freedom and the standard deviation is

$$s_{\Delta p} = \sqrt{\frac{p_a q_a}{n_a} + \frac{p_b q_b}{n_b}} ,$$

you can say that you have a significant difference between $p_a$ and $p_b$. This is equivalent to using a confidence interval of

$$(p_a - p_b) \pm \sqrt{\chi^{2\,(c-1)}_{\alpha} \left( \frac{p_a q_a}{n_a} + \frac{p_b q_b}{n_b} \right)} .$$

Example: Pelosi and Sandifer give the data below for satisfaction with local phone service, classified by type of company providing the service.

1. Is the proportion of people who rate the phone service as excellent independent of the type of company?
2. If it is not, test for a difference in the proportion who rate their service as excellent against the best-rated provider. (Remember that this is a test of equality of proportions.) Use a confidence level of 95%.

Service 1 is a long distance company, Service 2 is a local phone company, Service 3 is a power company, Service 4 is CATV (cable) and Service 5 is Cellular. $p_1$ is thus the proportion of long distance company customers that rate their service as excellent.

$H_0$: $p_1 = p_2 = p_3 = p_4 = p_5$
$H_1$: Not all ps equal

$p_1 = .1592$, $p_2 = .2520$, $p_3 = .2127$, $p_4 = .3328$, $p_5 = .2571$
$n_1 = 1658$, $n_2 = 1762$, $n_3 = 616$, $n_4 = 646$, $n_5 = 770$

Solution: Set up the O table. To get the number that rate service as excellent for long distance, note that $p_1 n_1 = .1592(1658) = 263.95$. But this must be a whole number, so round it to 264. The number that do not rate it as excellent is $1658 - 264 = 1394$. This gives us our first column.

O                      Long Dist   Local Ph   Power      CATV       Cellular   Total   p_r
Excellent                 264        444       131        215        198       1252    .2296
Not                      1394       1318       485        431        572       4200    .7704
Sum                      1658       1762       616        646        770       5452   1.0000
Proportion Excellent     .1592      .2520     .2127      .3328      .2571
pq/n                  .0000807   .0001070  .0002718   .0003437   .0002481

Note that in addition to computing the overall proportion of excellent and not excellent service (.2296 and .7704), the proportion excellent has been computed for each type of service, as well as the variance $\frac{p_a q_a}{n_a}$ used in the Marascuilo confidence interval formula. If we apply the proportions in each …
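
The two calculations described in these notes, the chi-squared statistic for an r by c table and the Marascuilo pairwise comparison, can be checked numerically. What follows is a minimal Python sketch and not part of the original notes. The function names `expected_counts`, `chi_squared` and `marascuilo_pairs` are hypothetical. The observed counts and the critical value 12.5916 are taken from the notes. The critical value 9.4877 for 4 degrees of freedom at the 5% level is a standard table value supplied here as an assumption, since the notes break off before quoting it.

```python
import math

def expected_counts(observed):
    """Expected count per cell: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_squared(observed):
    """Return (sum of (O - E)^2 / E, degrees of freedom) for an r-by-c table."""
    expected = expected_counts(observed)
    stat = sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

def marascuilo_pairs(successes, sizes, chi2_crit):
    """Compare |p_a - p_b| with sqrt(chi2_crit * (p_a q_a/n_a + p_b q_b/n_b))."""
    p = [x / n for x, n in zip(successes, sizes)]
    var = [pi * (1 - pi) / n for pi, n in zip(p, sizes)]
    results = []
    for a in range(len(p)):
        for b in range(a + 1, len(p)):
            diff = abs(p[a] - p[b])
            crit = math.sqrt(chi2_crit * (var[a] + var[b]))
            results.append((a + 1, b + 1, diff, crit, diff > crit))
    return results

# Homogeneity example: rows are income groups, columns are cities.
income_by_city = [[10, 15, 15, 10],
                  [5, 10, 15, 10],
                  [15, 15, 10, 20]]
CHI2_05_DF6 = 12.5916  # table value quoted in the notes
stat, df = chi_squared(income_by_city)
print(f"chi-squared = {stat:.5f}, df = {df}, reject H0: {stat > CHI2_05_DF6}")

# Phone service example: counts rating service excellent, and sample sizes.
excellent = [264, 444, 131, 215, 198]
sizes = [1658, 1762, 616, 646, 770]
not_excellent = [n - x for x, n in zip(excellent, sizes)]
CHI2_05_DF4 = 9.4877  # assumption: standard 5% table value for 4 df
phone_stat, phone_df = chi_squared([excellent, not_excellent])
pairs = []
if phone_stat > CHI2_05_DF4:  # condition (i): equality must be rejected first
    pairs = marascuilo_pairs(excellent, sizes, CHI2_05_DF4)
for a, b, diff, crit, significant in pairs:
    print(f"services {a} and {b}: |diff| = {diff:.4f}, "
          f"critical = {crit:.4f}, significant: {significant}")
```

The sketch works from the rounded whole-number counts rather than the published proportions, so its pq/n values can differ from the table in the last digit. It compares every pair of services; the question as posed only needs the pairs involving the best-rated provider.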



WCU ECO 252 - E. Chi-Squared and Related Tests
