Nov. 17, 2004    Lab #8    Econ 240A    L. Phillips
Goodness of Fit, Chi-Square, and Contingency Table Analysis

I. Goodness of Fit For A Variable with the Multinomial Distribution

This is an example from the text, Chapter 16, problem 16.7, p. 550. It uses the data file XR16-07 (XR15-07, 5th Ed.). There are 100 trials, with outcomes taking integer values from 1 through 5. If each value is equally likely, the associated probability is 1/5, and the expected number of each outcome in 100 trials is 20. This is listed in Table 1. For example, on the first trial, the probability of getting the outcome with value one is given by the multinomial:

$$P(n_1 = 1, n_2 = 0, n_3 = 0, n_4 = 0, n_5 = 0) = \left[ \frac{n!}{\prod_{j=1}^{5} n_j!} \right] \prod_{j=1}^{5} p_j^{n_j} = \frac{1!}{1!\,0!\,0!\,0!\,0!} (1/5)^1 (1/5)^0 (1/5)^0 (1/5)^0 (1/5)^0 = 1/5$$

--------------------------------------------------------------------------------------------
Table 1: Expected Frequencies For Each Outcome (Values 1-5) in 100 Trials

Outcome    Probability    Expected Frequency
1          1/5            20
2          1/5            20
3          1/5            20
4          1/5            20
5          1/5            20
--------------------------------------------------------------------------------------------

Open the file XR15-07 in Excel. Go to cell D1 and type outcome, and in cell E1, type count. In cells D2-D6, type 1 through 5 in sequence, one for each value of the outcome. Select cell E2, click the equal (=) sign, or type it in the formula bar, select statistical for function class, and select countif for function. Click on the ? in the countif box and use the office assistant to read about this function. In the dialog box, type in 1 for criteria, and type in A2:A101 for range. The box will indicate a count of 28. Go to cell E3 and repeat for outcome 2, and so on through outcome value 5. Go to cell D7 and type in sum. Go to cell E7, click on =, and select the sum function.
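The single-trial probability, the expected frequencies in Table 1, and the COUNTIF tally can also be sketched outside Excel. The Python below is only an illustration: `multinomial_pmf` is a name made up for this sketch, and `sample` is a short hypothetical stand-in for column A, not the contents of the XR15-07 file.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def multinomial_pmf(counts, probs):
    """P(n_1, ..., n_k) = n! / (n_1! ... n_k!) * product of p_j ** n_j."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)  # each step divides exactly
    prob = Fraction(coef)
    for c, p in zip(counts, probs):
        prob *= Fraction(p) ** c
    return prob


# Five equally likely outcomes, as in the lab.
probs = [Fraction(1, 5)] * 5

# Probability that the first trial yields the outcome with value one.
first_trial = multinomial_pmf([1, 0, 0, 0, 0], probs)

# Expected frequency of each outcome in 100 trials: n * p_j.
expected = [100 * p for p in probs]

# Hypothetical data standing in for cells A2:A101 (assumption, not the real file).
sample = [1, 3, 1, 5, 2, 4, 1, 2]
tally = Counter(sample)
counts = [tally[value] for value in range(1, 6)]  # one COUNTIF per outcome

print(first_trial, expected, counts, sum(counts))
```

Summing `counts` plays the same role as the sum check in cell E7: it should equal the number of trials in the sample.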
For the sum function's number argument, select E2:E6. You should get 100, the number of trials, as a check.

Table 2 lists the outcome values, the expected frequencies, and the observed frequencies recovered from this data file.

--------------------------------------------------------------------------------------------
Table 2: Expected and Observed Frequencies For Each Outcome (Values 1-5)

Outcome    Observed Frequency    Expected Frequency
1          28                    20
2          17                    20
3          19                    20
4          17                    20
5          19                    20
--------------------------------------------------------------------------------------------

To check how close the observed (simulated) frequencies come to the expected cell counts, take the difference between the observed and expected cell count for each outcome, square this difference, and divide by the expected cell count. This is the contribution of each outcome to the Chi-Square statistic, which is the sum over all outcomes: $\sum_{j=1}^{5} (O_j - E_j)^2 / E_j$. This process is displayed in Table 3.

Table 3: Simulated Frequencies Compared to Theoretical

Outcome    Observed, O_j    Expected, E_j    O_j - E_j    (O_j - E_j)^2 / E_j
1          28               20                8           64/20 = 3.20
2          17               20               -3            9/20 = 0.45
3          19               20               -1            1/20 = 0.05
4          17               20               -3            9/20 = 0.45
5          19               20               -1            1/20 = 0.05

$\chi^2 = 3.20 + 0.45 + 0.05 + 0.45 + 0.05 = 4.20$

There are four degrees of freedom: there are five outcomes, each with probability 0.20, and the five probabilities sum to one, so only four are independent, since the fifth can be found by subtracting the other four from one. The statistic $\sum_{j=1}^{5} (O_j - E_j)^2 / E_j$ is distributed as Chi-Square with four degrees of freedom. In general, if you take independently distributed normal variables, subtract their means, divide by their standard deviations (i.e., put them in z or standardized form), and then square and sum them, the sum is distributed Chi-Square, with degrees of freedom equal to the number of variables summed. The critical value of Chi-Square at the 5% level of significance (the problem uses 10%), for 4 degrees of freedom, is 9.49, from Table 5 in the text, p. B-10. Since 4.20 is less than 9.49, there is no significant difference between the expected distribution and the observed (simulated) distribution. The Chi-Square distribution for 4 degrees of freedom is illustrated in Figure 1.

--------------------------------------------------------------------------------------------

II. The Chi-Square Distribution in EViews

To create such a figure, open EViews, go to the File menu, and open a new workfile. In the box, select undated for the data frequency and a range of 1 to 100 observations, or more if you want a denser plot. In the workfile window, select the GENR command, and in the window type CHI = @rchisq(4), to generate a random variable distributed Chi-Square with 4 degrees of freedom. To get background information, go to the EViews Help menu, select EViews Help Topics: Contents Tab, and select EViews Basics. Double click on Using Expressions and read. Scroll down until you get to Mathematical Operators and Functions, then click and scroll way down (about ¾ of the way) until you get to Statistical Distribution Functions, and read. Here you will find the Rosetta Stone for deciphering what we are doing with the Chi-Square. The guide to doing similar exercises with other distributions is here.

[Figure 1: Chi-Square Density for 4 Degrees of Freedom. Horizontal axis: Chi-Square Variable; vertical axis: Density; the 5% critical value is marked at 9.48.]

To calculate the Chi-Square density, in the workfile window, select the GENR command, and in the window type density = @dchisq(chi, 4), to generate the Chi-Square density with 4 degrees of freedom for our random variable CHI. Go to the Quick menu and select graph. In the window, type in chi density, and select scatterplot to obtain Figure 1.
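As a cross-check on the Table 3 arithmetic and the two GENR steps, here is a minimal Python sketch using only the standard library. It is not the lab's method, which uses Excel and EViews. The names `chi2_pdf` and `chi2_sf_4df` are illustrative; the tail formula is the closed form that holds for 4 degrees of freedom only; and `random.gammavariate` is an assumed stand-in for @rchisq, relying on the fact that a Chi-Square variable with 4 degrees of freedom is a Gamma variable with shape 2 and scale 2.

```python
import math
import random

# Observed and expected frequencies from Tables 2 and 3.
observed = [28, 17, 19, 17, 19]
expected = [20] * 5

# Chi-Square statistic: sum over outcomes of (O_j - E_j)^2 / E_j.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_pdf(x, k):
    """Chi-Square density with k degrees of freedom, evaluated at x."""
    if x <= 0:
        return 0.0
    half = k / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


def chi2_sf_4df(x):
    """Upper-tail probability P(X > x); valid for 4 degrees of freedom only."""
    return math.exp(-x / 2) * (1 + x / 2)


critical = 9.49  # 5% critical value for 4 degrees of freedom, from the text
reject = chi_sq > critical
p_value = chi2_sf_4df(chi_sq)

# Analogue of CHI = @rchisq(4): 100 draws via Gamma(shape=2, scale=2).
rng = random.Random(0)
draws = [rng.gammavariate(2.0, 2.0) for _ in range(100)]

# Analogue of density = @dchisq(chi, 4): density at each draw.
density = [chi2_pdf(x, 4) for x in draws]

print(round(chi_sq, 2), reject, round(p_value, 2))
```

The statistic comes out at 4.20 with a tail probability of about 0.38, so the hypothesis of equally likely outcomes is not rejected at the 5% level. Plotting `density` against `draws` as a scatter gives a curve like Figure 1.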
I added the critical value for α = 0.05 to Figure 1 in Word.

III. Two-Way Contingency Tables in Excel

This next example is also from the text, Chapter 16, Excel data file XR16-25 (XR15-24, 5th Ed.), problem 16.25, p. 545. It is relevant for commercial advertising on TV. Students were surveyed. One group was watching a happy program, “Real People”, and the other group a sad program, “Sixty Minutes”. They were asked what they were thinking during the final commercial, with the responses categorized into three: (a) thinking primarily about the commercial, (b) thinking