Dec. 9, 2010    ECON 240A    Final Key    L. Phillips

Answer all 5 questions. No talking or communicating.

1. (30) The Association of Community Organizations for Reform Now presented data to a Joint Congressional Hearing on discrimination in lending of the 102nd Congress. The data were mortgage refusal rates from 20 banks (see DASL). There were 20 observations for moderate-income whites (group 2) and 20 observations for moderate-income minorities (group 1). There were also 20 observations for high-income whites (group 4) and 20 observations for high-income minorities (group 3). Figure 1.1 contains box plots for these four groups.

Figure 1.1: Box Plots of the Mortgage Refusal Rates for the Four Groups

Two-way ANOVA was conducted on these 80 observations of refusal rates using two indicator variables. The first indicator variable, white, was coded one for white applicants and zero for minority applicants. The second indicator variable, high income, was coded one for high-income applicants and zero for moderate-income applicants. The results from regressing refusal rates on these two indicator variables and their interaction are shown in Table 1.1.

Table 1.1: Two-Way ANOVA of Mortgage Refusal Rates on Race and Income Indicators

Dependent Variable: REFUSALRATE
Method: Least Squares
Sample: 1 80
Included observations: 80

Variable            Coefficient   Std. Error   t-Statistic   Prob.
WHITE                 -21.25500     3.134077     -6.781900   0.0000
HIGHINCOME            -9.365000     3.134077     -2.988120   0.0038
WHITE*HIGHINCOME       5.040000     4.432255      1.137119   0.2591
C                      36.88000     2.216127      16.64164   0.0000

R-squared             0.519906    Mean dependent var      22.83000
Adjusted R-squared    0.500955    S.D. dependent var      14.02942
S.E. of regression    9.910823    Akaike info criterion   7.473838
Sum squared resid     7465.055    Schwarz criterion       7.592940
Log likelihood       -294.9535    F-statistic             27.43409
Durbin-Watson stat    1.309229    Prob(F-statistic)       0.000000

a. Is the interaction effect significant?
No: t = 1.14, Prob = 0.26 > 0.05.

The regression was re-estimated without the interaction effect
and the results are shown in Table 1.2.

Table 1.2: Two-Way ANOVA of Mortgage Refusal Rates on Race and Income Indicators

Dependent Variable: REFUSALRATE
Method: Least Squares
Sample: 1 80
Included observations: 80

Variable      Coefficient   Std. Error   t-Statistic   Prob.
WHITE           -18.73500     2.220340     -8.437896   0.0000
HIGHINCOME      -6.845000     2.220340     -3.082861   0.0028
C                35.62000     1.922871      18.52438   0.0000

R-squared             0.511738    Mean dependent var      22.83000
Adjusted R-squared    0.499056    S.D. dependent var      14.02942
S.E. of regression    9.929664    Akaike info criterion   7.465709
Sum squared resid     7592.063    Schwarz criterion       7.555035
Log likelihood       -295.6284    F-statistic             40.35106
Durbin-Watson stat    1.283397    Prob(F-statistic)       0.000000

b. What is the expected refusal rate for moderate-income minority applicants?
35.6

c. What is the expected refusal rate for moderate-income white applicants?
35.6 - 18.7 = 16.9

d. What is the expected refusal rate for high-income white applicants?
35.6 - 18.7 - 6.8 = 10.1

e. Are your answers to parts b, c, and d qualitatively consistent with the box plots?
Yes.

2. (30) The web has a number of articles about the false positive paradox, with implications for its application to screening for terrorists. The following is a quote from Cory Doctorow that appeared May 20, 2008 in The Guardian (http://www.guardian.co.uk/technology/2008/may/20/rare.events). He starts with a medical example:

Our innumeracy means that our fight against these superrarities is likewise ineffective. Statisticians speak of something called the Paradox of the False Positive. Here's how that works: imagine that you've got a disease that strikes one in a million people, and a test for the disease that's 99% accurate. You administer the test to a million people, and it will be positive for around 10,000 of them, because for every hundred people, it will be wrong once (that's what 99% accurate means). Yet, statistically, we know that there's only one infected person in the entire sample. That means that your "99% accurate" test is
wrong 9,999 times out of 10,000.

I filled in the following tableau with the numbers above, which I think captures his numerical example.

Table 2.1

                     Sick (S)   Healthy (H)   Total
Test positive (+)           1         9,999      10,000
Test negative (-)           0       990,000     990,000
Total                       1       999,999   1,000,000

a. In the following tableau, fill in the marginal and joint probabilities, i.e. the shaded cells.

Table 2.2

                     Sick (S)         Healthy (H)        Total
Test positive (+)    1 in a million   About 1 in a 100   1 in a 100
Test negative (-)    0                99 in a 100        99 in a 100
Total                1 in a million   About 1            1

Without confusing the public with statistical terms such as joint probabilities and conditional probabilities, these numerical examples make a valid point. For rare events, such as a disease that strikes one in a million, most of the positive test results indicating sickness will be for healthy people.

The medical profession uses two descriptors for tests: sensitivity and specificity. Tests with high sensitivity are used for screening and cast a wide net. Tests with high specificity zero in on a particular disease.

$\text{Sensitivity} = \dfrac{\#\text{ of true positives}}{\#\text{ of true positives} + \#\text{ of false negatives}}$

$\text{Specificity} = \dfrac{\#\text{ of true negatives}}{\#\text{ of true negatives} + \#\text{ of false positives}}$

For the medical example above, the sensitivity is 1, or 100%, and the specificity is about 0.99, or 99%, i.e. the accuracy measure mentioned in the internet quote. These test measures are good from a medical test perspective, but because this disease is a rare event, these test measures ignore the large number of false positives.

b. Using the example above, show from a decision theory perspective that the medical profession appears not to care about misdiagnosing healthy people.

There are two types of mistakes: false negatives, people who are sick but test negative and are overlooked, and there are zero of these; and false positives, healthy folks who test positive and have to be further tested or examined, and there are about 10,000 of those. The expected cost of these mistakes is

$E[C] = C(S) \cdot 0 + C(H) \cdot 10{,}000$

This example implies that the medical
profession assigns a low cost, $C(H)$, to false positives and a very high cost, $C(S)$, to false negatives, since misdiagnosing a person who is sick might mean they die (worst case).

c. The false positive paradox: calculate the conditional probability of being sick given that you tested positive.

$P(S \mid +) = \dfrac{P(S \cap +)}{P(+)} = \dfrac{1 \text{ in a million}}{1 \text{ in a } 100} = 1 \text{ in } 10{,}000$

As mentioned in class, you might want a second opinion when the test comes back positive.

For the application to terrorism for the US, from another piece by English author Doctorow who
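
The sensitivity, the specificity, and the part (c) answer for the medical example can be checked numerically from the counts in Table 2.1. The following is a minimal Python sketch under that assumption; the function name `screening_summary` and the variable names are illustrative and do not come from the exam key.

```python
from fractions import Fraction

# Counts taken from Table 2.1 (one million people tested).
true_pos = 1         # sick and test positive
false_pos = 9_999    # healthy but test positive
false_neg = 0        # sick but test negative
true_neg = 990_000   # healthy and test negative


def screening_summary(tp, fp, fn, tn):
    """Return (sensitivity, specificity, P(sick | positive)) as exact fractions.

    Hypothetical helper for illustration only.
    """
    sensitivity = Fraction(tp, tp + fn)       # true positives / all sick
    specificity = Fraction(tn, tn + fp)       # true negatives / all healthy
    p_sick_given_pos = Fraction(tp, tp + fp)  # true positives / all positives
    return sensitivity, specificity, p_sick_given_pos


sens, spec, p_sick_pos = screening_summary(true_pos, false_pos, false_neg, true_neg)
print(f"sensitivity = {float(sens):.4f}")
print(f"specificity = {float(spec):.4f}")
print(f"P(S | +)    = {p_sick_pos}")
```

Exact fractions are used so that the 1 in 10,000 result is not blurred by floating-point rounding. Replacing the four counts with those of a more common disease shows how quickly P(S | +) rises as the event becomes less rare.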