Nov. 10, 2009 LEC #13 ECON 240A-1 L. Phillips
Expected Vs. Observed Frequencies, Contingency Tables & Chi Square

I. Introduction
II. The Multinomial Distribution
III. Expected Versus Observed Frequencies: The Die
IV. Searching For an Appropriate Probability Model
V. Two By Two (2x2) Contingency Tables

I. Introduction
The Chi Square distribution can be used to compare expected and observed distributions. There are a number of applications. One example is throwing a die. In this case we have an expected theoretical distribution for each face if the die is fair. We could conduct an experiment and roll a die six hundred times, record which face comes up on each trial, and calculate the experimental frequencies. Alternatively, we could simulate such an experiment. Once we have the experimental frequencies for each face we can compare them to the expected frequency of 100. Unless this experiment is extraordinary, the experimental frequencies will differ somewhat from the expected frequencies. The issue is whether they differ significantly. The Chi Square test is based on squaring the difference between the expected frequency and the observed frequency for each face, dividing this square by the expected frequency, and summing over all six faces. Under the null hypothesis this number is distributed approximately as Chi Square with 5 degrees of freedom. The null hypothesis is that the die is fair, so that the observed frequencies differ from the expected frequencies only by chance; if they agreed exactly, the statistic would be zero.
Only if the statistic is significantly large would we reject the null hypothesis in favor of the alternative that the observed distribution differs from the expected. We test this at the 5% level.

Another example is searching for a probability model that will fit the observed frequency of the number of men on base when home runs were hit for a particular year in the National League. One possibility is to use the binomial, but a better fit is obtained from the Poisson distribution.

Another application is contingency table analysis, which can be used to test for association or interdependence between variables. An example is a simple two by two table. For example, is there a connection between consumer information and purchasing behavior? This example looks at two kinds of refrigerators purchased, frost-free and not frost-free, and how that varies by whether the consumer knew that frost-free refrigerators consume more electricity. We could look at each of the marginal distributions, for example what fraction purchased frost-free refrigerators and the remaining fraction that did not. We could examine which consumers were informed about electricity use and which were not. If the purchase were independent of the consumer information, the expected relative frequency in each cell of the two by two table would be the product of the corresponding marginal relative frequencies. We could calculate the four expected cell frequencies using the product of the marginal distributions under the null hypothesis of independence and then compare these to the observed cell frequencies. If refrigerator choice is independent of consumer information, then we should get an insignificant Chi Square statistic.

II. The Multinomial Distribution
The Bernoulli event with only two classes, such as yes or no, or heads versus tails, can be extended to accommodate more classes. The resulting distribution is called the multinomial.
For example, rolling a fair die gives six possible elementary outcomes for one toss, {1, 2, 3, 4, 5, 6}. If the die is fair, we know the probability of each outcome, P(j), is one sixth.

Consider two tosses of the die, as illustrated partially in Figure 1. We could obtain 36 elementary events {1, 1}, {1, 2}, {1, 3}, ..., {6, 5}, {6, 6}. Using n1 to count the number of ones, etc., if the elementary outcome is {1, 1}, then n1 = 2, n2 = 0, etc., where $\sum_{j=1}^{6} n_j = n$. If the elementary event were {1, 2}, we would have n1 = 1 and n2 = 1, and the rest of the nj = 0. But we could obtain one one and one two in two ways, {1, 2} and {2, 1}, so we have to count the combinations as well.

--------------------------------------------------------------------
Figure 1: Two Throws of a Die, Partially Illustrated.
--------------------------------------------------------------------

The probability of one one and one two is:

$P(n_1 = 1, n_2 = 1, n_3 = 0, n_4 = 0, n_5 = 0, n_6 = 0) = \frac{n!}{\prod_{j=1}^{6} n_j!} \prod_{j=1}^{6} p_j^{n_j}$
$= \frac{2!}{1!\,1!\,0!\,0!\,0!\,0!} (1/6)^1 (1/6)^1 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0 = 2 \cdot (1/36) = 2/36$

III. Expected Versus Observed Frequencies: The Die
Our expectations for the probabilities of each face are listed in Table 1. If we were to simulate this to obtain experimental frequencies for rolling a die 600 times, we might obtain the following empirical (simulated) distribution, from data file XR15-09, as listed in Table 2.
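A simulation of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_die_rolls` and the seed value are hypothetical choices, and the counts it produces will not match those in data file XR15-09.

```python
import random
from collections import Counter


def simulate_die_rolls(n_rolls=600, seed=0):
    """Roll a fair six-sided die n_rolls times and count how often each face appears."""
    # The seed is an arbitrary assumption, used only to make the run repeatable.
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(rolls)
    # Return counts for faces 1..6 in order, with 0 for any face never rolled.
    return [counts.get(face, 0) for face in range(1, 7)]


observed = simulate_die_rolls()
expected = 600 / 6  # 100 per face if the die is fair

print("Face  Expected  Observed")
for face, count in enumerate(observed, start=1):
    print(f"{face:>4}  {expected:>8.0f}  {count:>8}")
```

Changing the seed gives a different set of simulated frequencies, each of which will typically differ somewhat from 100 per face.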
The Chi-Square statistic is calculated from this comparison of observed and expected frequencies, squaring the difference and dividing by the expected frequency and summing these values: $\sum_{j=1}^{6} (O_j - E_j)^2 / E_j$, which is distributed as Chi-Square with 5 degrees of freedom, one degree being lost since the probabilities sum to one, and hence only five are independent.

Table 1: Expected Frequencies For Each Face of the Die in 600 Throws.
Face    Probability    Expected Frequency
1       1/6            100
2       1/6            100
3       1/6            100
4       1/6            100
5       1/6            100
6       1/6            100
---------------------------------------------------------------------------------------
Table 2: Observed Versus Expected Frequencies, for Die Faces
Face    Probability    Expected    Observed
1       1/6            100         114
2       1/6            100         92
3       1/6            100         84
4       1/6            100         101
5       1/6            100         107
6       1/6            100         107

The difference between observed and expected frequencies is reported in Table 3, along with each cell's contribution to the Chi-Square statistic.
Table 3: Simulated frequencies Compared to
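The per-face differences and contributions to the Chi-Square statistic described above can be sketched as follows. This is a minimal illustration, not the notes' own calculation: it uses the observed counts exactly as listed in Table 2 (which total 605 rather than 600, so one entry may be a transcription error), the helper name `chi_square_contributions` is hypothetical, and the 5% critical value for 5 degrees of freedom is taken from a standard Chi-Square table.

```python
# Observed counts as listed in Table 2; expected count is 100 per face.
observed = [114, 92, 84, 101, 107, 107]
expected = [100] * 6


def chi_square_contributions(observed, expected):
    """Return each cell's contribution (O - E)^2 / E to the Chi-Square statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


contributions = chi_square_contributions(observed, expected)
chi_square = sum(contributions)

# 5% critical value of Chi-Square with 5 degrees of freedom (standard table value).
CRITICAL_5PCT_DF5 = 11.07

for face, (o, e, c) in enumerate(zip(observed, expected, contributions), start=1):
    print(f"face {face}: O - E = {o - e:+d}, contribution = {c:.2f}")
print(f"chi-square = {chi_square:.2f}, reject at 5%: {chi_square > CRITICAL_5PCT_DF5}")
```

With the counts as listed, the statistic comes to 6.15, which is below 11.07, so the null hypothesis of a fair die would not be rejected at the 5% level.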