Dec. 9, 2010    ECON 240A-1    L. Phillips
Final - Key

Answer all 5 questions. No talking or communicating.

1. (30) The Association of Community Organizations for Reform Now presented data to a Joint Congressional Hearing on discrimination in lending of the 102nd Congress. The data were mortgage refusal rates from 20 banks (see DASL). There were 20 observations for moderate income whites (group 2) and 20 observations for moderate income minorities (group 1). There were also 20 observations for high income whites (group 4) and 20 observations for high income minorities (group 3). Figure 1-1 contains box plots for these four groups.

Figure 1-1: Box Plots of the Mortgage Refusal Rates for the Four Groups

Two-way ANOVA was conducted on these 80 observations of refusal rates using two indicator variables. The first indicator variable, white, was coded one for white applicants and zero for minority applicants. The second indicator variable, high income, was coded one for high income applicants and zero for moderate income applicants. The results from regressing refusal rates on these two indicator variables and their interaction are shown in Table 1-1.

Table 1-1: Two-Way ANOVA of Mortgage Refusal Rates on Race and Income Indicators

Dependent Variable: REFUSALRATE
Method: Least Squares
Sample: 1 80
Included observations: 80

Variable              Coefficient   Std. Error   t-Statistic   Prob.
WHITE                   -21.25500     3.134077     -6.781900   0.0000
HIGHINCOME              -9.365000     3.134077     -2.988120   0.0038
WHITE*HIGHINCOME         5.040000     4.432255      1.137119   0.2591
C                        36.88000     2.216127      16.64164   0.0000

R-squared               0.519906    Mean dependent var       22.83000
Adjusted R-squared      0.500955    S.D. dependent var       14.02942
S.E. of regression      9.910823    Akaike info criterion    7.473838
Sum squared resid       7465.055    Schwarz criterion        7.592940
Log likelihood         -294.9535    F-statistic              27.43409
Durbin-Watson stat      1.309229    Prob(F-statistic)        0.000000

a. Is the interaction effect significant? No, t = 1.14, Prob.
0.26 > 0.05

The regression was re-estimated without the interaction effect and the results are shown in Table 1-2.

Table 1-2: Two-Way ANOVA of Mortgage Refusal Rates on Race and Income Indicators

Dependent Variable: REFUSALRATE
Method: Least Squares
Sample: 1 80
Included observations: 80

Variable              Coefficient   Std. Error   t-Statistic   Prob.
WHITE                   -18.73500     2.220340     -8.437896   0.0000
HIGHINCOME              -6.845000     2.220340     -3.082861   0.0028
C                        35.62000     1.922871      18.52438   0.0000

R-squared               0.511738    Mean dependent var       22.83000
Adjusted R-squared      0.499056    S.D. dependent var       14.02942
S.E. of regression      9.929664    Akaike info criterion    7.465709
Sum squared resid       7592.063    Schwarz criterion        7.555035
Log likelihood         -295.6284    F-statistic              40.35106
Durbin-Watson stat      1.283397    Prob(F-statistic)        0.000000

b. What is the expected refusal rate for moderate income minority applicants? 35.6%

c. What is the expected refusal rate for moderate income white applicants? 35.6 – 18.7 = 16.9%

d. What is the expected refusal rate for high income white applicants? 35.6 – 18.7 – 6.8 = 10.1%

e. Are your answers to parts b, c, and d qualitatively consistent with the box plots? Yes

2. (30) The web has a number of articles about the false positive paradox, with implications for screening for terrorists. The following is a quote from Cory Doctorow that appeared May 20, 2008 in The Guardian (http://www.guardian.co.uk/technology/2008/may/20/rare.events). He starts with a medical example.

“Our innumeracy means that our fight against these super-rarities is likewise ineffective. Statisticians speak of something called the Paradox of the False Positive. Here's how that works: imagine that you've got a disease that strikes one in a million people, and a test for the disease that's 99% accurate.
You administer the test to a million people, and it will be positive for around 10,000 of them – because for every hundred people, it will be wrong once (that's what 99% accurate means). Yet, statistically, we know that there's only one infected person in the entire sample. That means that your "99% accurate" test is wrong 9,999 times out of 10,000!”

Table 2-1: I filled in the following tableau with the numbers above, which I think captures his numerical example:

                       Sick (S)    Healthy (H)        Total
Test positive (+)             1          9,999       10,000
Test negative (-)             0        990,000      990,000
Total                         1        999,999    1,000,000

a. In the following tableau, fill in the marginal and joint probabilities, i.e. the shaded cells.

Table 2-2

                       Sick (S)          Healthy (H)         Total
Test positive (+)      1 in a million    about 1 in 100      1 in 100
Test negative (-)      0                 99 in 100           99 in 100
Total                  1 in a million    about 1             1

Without confusing the public with statistical terms such as joint probabilities and conditional probabilities, these numerical examples make a valid point. For rare events such as a disease that strikes one in a million, most of the positive test results indicating sickness will be for healthy people.

The medical profession uses two descriptors for tests, sensitivity and specificity. Tests with high sensitivity are used for screening and cast a wide net. Tests with high specificity zero in on a particular disease.

Sensitivity = # of true positives / (# of true positives + # of false negatives)
Specificity = # of true negatives / (# of true negatives + # of false positives)

For the medical example above, the sensitivity is 1 or 100% and the specificity is about 0.99 or 99%, i.e. the accuracy measure mentioned in the internet quote. These test measures are good from a medical test perspective, but because this disease is a rare event, these test measures ignore the large number of false positives.

b.
Using the example above, show from a decision theory perspective that the medical profession appears not to care about mis-diagnosing healthy people. There are two types of mistakes: false negatives, people who are sick but test negative and are overlooked, of which there are zero; and false positives, healthy people who test positive and have to be further tested or examined, of which there are about 10,000 (9,999 in Table 2-1). The expected cost of these mistakes is EC = C(-/S) × 0 + C(+/H) × 10,000. This example implies that the medical profession assigns a low cost, C(+/H), to false positives and a very high cost, C(-/S), to false negatives, since mis-diagnosing a person who is sick might mean, in the worst case, that they die.

c. The false positive paradox: calculate the conditional probability of being sick given
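The sensitivity and specificity definitions and the expected-cost expression from part b can be checked numerically against the counts in Table 2-1. The following is a minimal Python sketch, not part of the original key. The function names are illustrative, and the two cost figures are hypothetical values chosen only to show how a low C(+/H) and a high C(-/S) enter the calculation.

```python
# Counts taken from Table 2-1 (one million people tested).
true_pos = 1         # sick, test positive
false_pos = 9_999    # healthy, test positive
false_neg = 0        # sick, test negative
true_neg = 990_000   # healthy, test negative


def sensitivity(tp, fn):
    """True positives as a share of all sick people."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives as a share of all healthy people."""
    return tn / (tn + fp)


def false_share_of_positives(tp, fp):
    """Share of positive test results that belong to healthy people."""
    return fp / (tp + fp)


def expected_cost(fn, fp, cost_fn, cost_fp):
    """Cost of mistakes as written in part b: C(-/S) * FN + C(+/H) * FP."""
    return cost_fn * fn + cost_fp * fp


sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
wrong = false_share_of_positives(true_pos, false_pos)

# Hypothetical cost figures, for illustration only (not from the exam).
COST_FALSE_NEG = 1_000_000.0  # assumed high cost of missing a sick person
COST_FALSE_POS = 100.0        # assumed low cost of re-testing a healthy person
cost = expected_cost(false_neg, false_pos, COST_FALSE_NEG, COST_FALSE_POS)

print(f"sensitivity = {sens:.4f}")
print(f"specificity = {spec:.4f}")
print(f"share of positives that are false = {wrong:.4f}")
print(f"expected cost of mistakes = {cost:,.0f}")
```

With zero false negatives, the C(-/S) term drops out no matter how large that cost is assumed to be, so the total depends only on the cost assigned to the false positives.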