1. Tests of Homogeneity and Independence
   Note: Rule of thumb for E.
   Note: Marascuilo Procedure.
2. Tests of Goodness of Fit
   a. Uniform Distribution
   b. Poisson Distribution
   c. Normal Distribution
3. Kolmogorov-Smirnov Test
   a. Kolmogorov-Smirnov One-Sample Test
   b. Lilliefors Test.

©2002 Roger Even Bove
252chisq 2/29/08 (Open this document in 'Outline' view!)

E. CHI-SQUARED AND RELATED TESTS.

These tests are generalizations of the one-sample and two-sample tests of proportions. A test of Goodness of Fit is necessary when a single sample is to be divided into more than two categories. A Test of Homogeneity is needed when one wants to compare more than one sample. A test of Independence is used to see if two variables or categorizations are related, but is formally identical to a test of homogeneity.

1. Tests of Homogeneity and Independence

Two possible null hypotheses apply here. The observed data is indicated by $O$, the expected data by $E$.

$H_0$: Cities are Homogeneous by income groups

O                 City 1   City 2   City 3   City 4   Total   p_r
Upper Income        10       15       15       10       50    50/150 = 1/3
Middle Income        5       10       15       10       40    40/150 = 4/15
Lower Income        15       15       10       20       60    60/150 = 2/5
Sample Size         30       40       40       40      150    1
p_c                1/5      4/15     4/15     4/15       1

$H_0$: Sick Days are Independent of Age

O                 0 Days   1 Day    2 Days   3 Days   Total   p_r
Age 15-25           10       15       15       10       50    50/150 = 1/3
Age 26-49            5       10       15       10       40    40/150 = 4/15
Age 50 up           15       15       10       20       60    60/150 = 2/5
Total               30       40       40       40      150    1
p_c                1/5      4/15     4/15     4/15       1

The numbers are obviously identical in these two cases. In each case the expected values are done the same way. There are $r = 3$ rows, $c = 4$ columns and $rc = 12$ cells. $n = \sum O = 150$. Each cell gets $E = p_r p_c n = p_r(\text{Column total})$. For example, for the upper left corner the expected value is $\frac{1}{3}\left(\frac{1}{5}\right)150 = \frac{1}{3}(30) = 10$.

E                 Column 1   Column 2   Column 3   Column 4   Total   p_r
Row 1               10        13 1/3     13 1/3     13 1/3      50    1/3
Row 2                8        10 2/3     10 2/3     10 2/3      40    4/15
Row 3               12        16         16         16          60    2/5
Total               30        40         40         40         150    1
p_c                1/5       4/15       4/15       4/15          1

The formula for the chi-squared statistic is $\chi^2 = \sum \frac{(O-E)^2}{E}$ or $\chi^2 = \sum \frac{O^2}{E} - n$. The first of these two formulas is used in the worked table below.
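As a cross-check on the hand computation, the expected-value rule $E = p_r(\text{Column total})$ and both forms of the chi-squared statistic can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `chi_squared_table` and the variable names are made up for this example and do not come from the notes.

```python
# Observed counts from the homogeneity/independence example:
# rows are the three income (or age) groups, columns the four cities (or day counts).
observed = [
    [10, 15, 15, 10],
    [5, 10, 15, 10],
    [15, 15, 10, 20],
]

def chi_squared_table(obs):
    """Return (expected, stat, shortcut, dof) for an r-by-c table of counts.

    Hypothetical helper written for this illustration only.
    """
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    # E = p_r * (column total), with p_r = row total / n
    expected = [[rt / n * ct for ct in col_totals] for rt in row_totals]
    pairs = [
        (o, e)
        for o_row, e_row in zip(obs, expected)
        for o, e in zip(o_row, e_row)
    ]
    stat = sum((o - e) ** 2 / e for o, e in pairs)   # sum of (O - E)^2 / E
    shortcut = sum(o * o / e for o, e in pairs) - n  # sum of O^2 / E, minus n
    dof = (len(obs) - 1) * (len(obs[0]) - 1)         # (r - 1)(c - 1)
    return expected, stat, shortcut, dof

expected, stat, shortcut, dof = chi_squared_table(observed)
print(f"chi-squared = {stat:.4f} (shortcut form {shortcut:.4f}), df = {dof}")
```

Run as written, both forms of the statistic agree at about 8.28 with 6 degrees of freedom, which matches the hand-computed total in these notes up to rounding.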
For an explanation of the equivalence of these two formulas, the reason why the degrees of freedom are as given below, and to relate the chi-squared test to a z test of proportions, see 252chisqnote.

       E        O      E - O    (E - O)^2   (E - O)^2/E
  10.0000      10     0.0000      0.0000      0.00000
   8.0000       5     3.0000      9.0000      1.12500
  12.0000      15    -3.0000      9.0000      0.75000
  13.3333      15    -1.6667      2.7778      0.20833
  10.6667      10     0.6667      0.4445      0.04167
  16.0000      15     1.0000      1.0000      0.06250
  13.3333      15    -1.6667      2.7779      0.20834
  10.6667      15    -4.3333     18.7775      1.76038
  16.0000      10     6.0000     36.0000      2.25000
  13.3333      10     3.3333     11.1109      0.83332
  10.6667      10     0.6667      0.4445      0.04167
  16.0000      20    -4.0000     16.0000      1.00000
 150.0000     150     0.0000                  8.28121

The table works with $E - O$ rather than $O - E$; the sign does not matter once the difference is squared.

The degrees of freedom for this application are $(r-1)(c-1) = (3-1)(4-1) = (2)(3) = 6$.

The most common test is a one-tailed test on the grounds that the larger the discrepancy that occurs between $O$ and $E$, the larger will be $\sum \frac{(E-O)^2}{E}$. If our significance level is 5%, compare $\sum \frac{(O-E)^2}{E}$ to $\chi^{2(6)}_{.05} = 12.5916$. Since our value of this sum, 8.28121, is less than the table chi-squared, do not reject the null hypothesis.

Note: Rule of thumb for E.

All values of $E$ should be above 5 and we generally combine cells to make this so. However, a value of $E$ as low as 2 is acceptable if (i) our computed $\chi^2$ turns out to be less than the table value $\chi^2_\alpha$ or (ii) the particular value of $E$ makes a very small contribution to $\sum \frac{(O-E)^2}{E}$, relative to the value of the total.

Note: Marascuilo Procedure.

The Marascuilo procedure says that, for 2 by $c$ tests, if (i) equality is rejected and (ii) $|p_a - p_b| > \sqrt{\chi^2}\, s_{\Delta p}$, where $a$ and $b$ represent 2 groups, the chi-squared has $c-1$ degrees of freedom and the standard deviation is $s_{\Delta p} = \sqrt{\frac{p_a q_a}{n_a} + \frac{p_b q_b}{n_b}}$, you can say that you have a significant difference between $p_a$ and $p_b$. This is equivalent to using a confidence interval for the difference between the two proportions of $(p_a - p_b) \pm \sqrt{\chi^{2(c-1)}}\,\sqrt{\frac{p_a q_a}{n_a} + \frac{p_b q_b}{n_b}}$.

Example: Pelosi and Sandifer give the data below for satisfaction with local phone service classified by type of company providing the service.
1) Is the proportion of people who rate the phone service as excellent independent of the type of company? 2) If it is not, test for a difference in the proportion who rate their service as excellent against the best-rated provider. Remember that this is a test of equality of proportions. Service 1 is a long distance company, Service 2 is a local phone company, Service 3 is a power company, Service 4 is CATV (cable) and Service 5 is Cellular. $p_1$ is thus the proportion of long distance company customers that rate their service as excellent.

$H_0: p_1 = p_2 = p_3 = p_4 = p_5$
$H_1$: Not all $p$s equal.

$p_1 = .1592$, $n_1 = 1658$
$p_2 = .2520$, $n_2 = 1762$
$p_3 = .2127$, $n_3 = 616$
$p_4 = .3328$, $n_4 = 646$
$p_5 = .2571$, $n_5 = 770$

Solution: Set up the $O$ table. To get the number that rate service as excellent for long distance, note that $p_1 n_1 = .1592(1658) = 263.95$. But this must be a whole number, so round it to 264. The number that do not rate it as excellent is $1658 - 264 = 1394$. This gives us our first column. $\frac{pq}{n}$ is also computed for use later.

O                      Long Dist   Local Ph    Power      CATV      Cellular   Total    p_r
Excellent                 264        444        131        215        198      1252    .2296
Not                      1394       1318        485        431        572      4200    .7704
Sum                      1658       1762        616        646        770      5452   1.0000
Proportion Excellent     .1592      .2520      .2127      .3328      .2571
pq/n                  .0000807   .0001070   .0002718   .0003437   .0002481

Note that in addition to computing the overall proportion of excellent and not excellent service (.2296 and .7704), the 'proportion excellent' has been computed for each type of service, as well as the variance $\frac{p_a q_a}{n_a}$ used in the confidence interval formula. If we apply the proportions in each row to the column sums we get the following expected values.

E                      Long Dist   Local Ph    Power      CATV      Cellular   Total    p_r
Excellent                380.68     404.56     141.43     148.32     176.79    1252    .2296
Not                     1277.32    1357.44     474.57     497.68     593.21    4200    .7704
Sum                     1658       1762        616        646        770       5452   1.0000

The chi-squared test follows.
Row        E         O       E - O     (E - O)^2   (E - O)^2/E
  1      380.68     264     116.677     13613.5      35.7612
  2     1277.32    1394    -116.677     13613.5      10.6578
  3      404.56     444     -39.445      1555.9       3.8459
  4     1357.44    1318      39.445      1555.9       1.1462
  5      141.43
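The pairwise step of the Marascuilo procedure described in the note above can be sketched in Python with the proportions and sample sizes given for the five services. This is a rough illustration under stated assumptions, not the notes' own solution: it assumes a 5% significance level, it uses 9.4877 as the table value of chi-squared with $c - 1 = 4$ degrees of freedom, and it covers only condition (ii), so it does not check whether equality was rejected first. The function name `marascuilo` and the other identifiers are hypothetical.

```python
from math import sqrt

# Proportion rating service "excellent" and sample size for each provider,
# as given in the example.
services = ["Long Dist", "Local Ph", "Power", "CATV", "Cellular"]
p = [0.1592, 0.2520, 0.2127, 0.3328, 0.2571]
n = [1658, 1762, 616, 646, 770]

# ASSUMPTION: 5% significance level. 9.4877 is the standard table value of
# chi-squared with c - 1 = 4 degrees of freedom at that level.
CHI2_CRIT = 9.4877

def marascuilo(a, b):
    """Return (|p_a - p_b|, critical difference, significant?) for groups a and b.

    Hypothetical helper written for this illustration only.
    """
    diff = abs(p[a] - p[b])
    # s = sqrt(p_a q_a / n_a + p_b q_b / n_b), with q = 1 - p
    s = sqrt(p[a] * (1 - p[a]) / n[a] + p[b] * (1 - p[b]) / n[b])
    crit = sqrt(CHI2_CRIT) * s
    return diff, crit, diff > crit

# Compare every provider against the best-rated one (highest proportion excellent).
best = p.index(max(p))
results = {
    services[a]: marascuilo(a, best) for a in range(len(p)) if a != best
}
for name, (diff, crit, significant) in results.items():
    print(f"{name:10s} vs {services[best]}: "
          f"diff = {diff:.4f}, critical = {crit:.4f}, significant = {significant}")
```

Each printed line reports the absolute difference from the best-rated provider, the critical difference $\sqrt{\chi^2}\, s_{\Delta p}$, and whether the first exceeds the second. Swapping in a different table value for `CHI2_CRIT` changes the significance level.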