Nov. 20, 2003   LEC 16   ECON 240A   Nonparametric Statistics   L. PHILLIPS

I. Introduction

A principal use of nonparametric methods is for samples whose frequency distribution is not normal. This can be ascertained visually by looking at the histogram of the data series and seeing whether the histogram is bell shaped or not. A test for normality could be conducted using the chi-square or another approach.

For example, suppose you would like to conduct a test for the difference between two means, but the two independent samples are not normal. In this case you could use the Wilcoxon rank sum test for independent samples.

Another example arises from experimental design where you have matched pairs. If the data are not normally distributed, then you could use the sign test if the data arise from a rating or ranking scheme. If the matched pairs result from quantitative data that are not normally distributed, then you can resort to the Wilcoxon signed rank sum test for matched pairs.

II. Wilcoxon Rank Sum Test for Independent Samples

This test is applied to problems of testing the difference between the means of two populations when they are non-normal. The text uses an example of rating a new painkilling drug compared to aspirin as a control. The rating scheme is displayed in Table 1. The data file is xm17-02. Thirty people were randomly selected; fifteen were given the new drug to rate and fifteen were given aspirin. The fifteen ratings for the new drug and the fifteen for aspirin are displayed in Table 2.

Table 1: Rating Scheme for New Painkilling Drug

Score   Legend
5       Extremely effective
4       Quite effective
3       Somewhat effective
2       Slightly effective
1       Not at all effective

Table 2: Ratings For the New Painkiller and For Aspirin

New Drug   Aspirin
3          4
5          1
4          3
3          2
2          4
5          1
1          3
4          4
5          2
3          2
3          2
5          4
5          3
5          4
4          5

The procedure is to take the thirty ratings and to rank them, starting with the smallest number. There are three ones, ranked 1, 2, and 3,
and since they are tied, they receive the average rank of 2. There are five twos, ranked 4, 5, 6, 7, and 8, and since they are tied, they receive the average rank of 6. The process of ranking proceeds in this manner. The ratings sorted in ascending order and the raw ranks, not accounting for ties, are displayed in Table 3, along with the ranks where ties have been taken into account.

Table 3: Ratings of the Painkiller and Aspirin, Sorted in Ascending Order and Ranked

Rating   Raw Rank   Rank (Ties)
1        1          2
1        2          2
1        3          2
2        4          6
2        5          6
2        6          6
2        7          6
2        8          6
3        9          12
3        10         12
3        11         12
3        12         12
3        13         12
3        14         12
3        15         12
4        16         19.5
4        17         19.5
4        18         19.5
4        19         19.5
4        20         19.5
4        21         19.5
4        22         19.5
4        23         19.5
5        24         27
5        25         27
5        26         27
5        27         27
5        28         27
5        29         27
5        30         27

The next step is to use Table 3 to modify Table 2, incorporating the rankings where ties are accounted for. This is displayed in Table 4. The test statistic is the rank sum for the new drug, 276.5, denoted T. Note that this rank sum is higher than the rank sum for aspirin, 188.5, indicating higher ratings for the new painkiller. The question is whether these ratings are significantly higher.

Table 4: Ratings and Corresponding Rankings For the New Painkiller and Aspirin

New Drug              Aspirin
Rating   Ranking      Rating   Ranking
3        12           4        19.5
5        27           1        2
4        19.5         3        12
3        12           2        6
2        6            4        19.5
5        27           1        2
1        2            3        12
4        19.5         4        19.5
5        27           2        6
3        12           2        6
3        12           2        6
5        27           4        19.5
5        27           3        12
5        27           4        19.5
4        19.5         5        27
Rank Sum 276.5                 188.5

For sample sizes greater than ten, T is approximately normally distributed. The expected value of T is

$$E(T) = \frac{n_1 (n_1 + n_2 + 1)}{2} = \frac{15 \cdot 31}{2} = 232.5 \qquad (1)$$

where the subscript 1 refers to sample one, the new drug, and 2 refers to sample two, aspirin. The standard deviation of T is

$$\sigma_T = \left[ \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} \right]^{1/2} = \left[ \frac{15 \cdot 15 \cdot 31}{12} \right]^{1/2} = 24.1 \qquad (2)$$

The z statistic is

$$z = \frac{T - E(T)}{\sigma_T} = \frac{276.5 - 232.5}{24.1} = 1.83 \qquad (3)$$

A one-tailed test is used because the alternative hypothesis is directional. The null hypothesis is that the central tendency, or location, for the new drug is the same as the central tendency for aspirin, i.e. there is no difference in locations between these two populations. The alternative hypothesis is that the central tendency, or location, for the new drug is greater than that for aspirin. The critical value for the normal distribution at a significance level of 5% is 1.645, as illustrated in Figure 1. Since z = 1.83 exceeds 1.645, the null hypothesis is rejected at the 5% level.

Figure 1: One-Tailed Test, 5% Level, Normal Distribution
[Figure: normal density, FREQUENCY plotted against Z from -4 to 4, with the upper 5% tail beyond 1.645 marked]

The authors use their macro STATS to calculate this test. The data are in adjacent columns. The Wilcoxon Rank Sum Test is found under the Tools menu, in Data Analysis Plus. As another example in the text, the authors use the data file workers in Ch. 17 to examine whether there is any difference in the duration of employment for 25 business graduates versus 20 non-business graduates.

III. Sign Test For Matched Pairs

We have used matched pairs as an experimental design to diminish unexplained variance. Once again, we resort to nonparametric methods if the data are not normally distributed. The text uses the data file xm17-03, which includes comfort ratings by 25 respondents who compare a European car to an American car. The rating scheme is listed in Table 5.

Table 5: Rating Scheme For Car Comfort

Score   Legend
1       Ride is very uncomfortable
2       Ride is quite uncomfortable
3       Ride is neither uncomfortable nor comfortable
4       Ride is quite comfortable
5       Ride is very comfortable

The respondents and their ratings for the two cars are listed in Table 6.

Table 6: Ratings For Comfort, European Car Vs. American Car

Respondent   European Car Rating   American Car Rating
1            4                     5
2            2                     1
3            5                     4
4            3                     …
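
Returning to the rank sum test of Section II, the arithmetic there can be checked with a short script. The following is only a sketch under stated assumptions: it is written in Python rather than the authors' STATS macro, the helper names `average_ranks` and `rank_sum_test` are hypothetical, and the two lists are transcribed by hand from Table 2. It assigns average ranks to tied ratings, sums the ranks for the new drug, and applies the formulas for E(T), the standard deviation of T, and z given in Section II.

```python
from math import sqrt

# Ratings transcribed from Table 2 (new drug first, aspirin second).
new_drug = [3, 5, 4, 3, 2, 5, 1, 4, 5, 3, 3, 5, 5, 5, 4]
aspirin = [4, 1, 3, 2, 4, 1, 3, 4, 2, 2, 2, 4, 3, 4, 5]


def average_ranks(values):
    """Map each distinct value to the average of the raw ranks it occupies."""
    positions = {}
    for raw_rank, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(raw_rank)
    return {value: sum(r) / len(r) for value, r in positions.items()}


def rank_sum_test(sample1, sample2):
    """Return (T, E(T), sigma_T, z), where T is the rank sum of sample1."""
    ranks = average_ranks(sample1 + sample2)  # rank the pooled ratings
    n1, n2 = len(sample1), len(sample2)
    t = sum(ranks[v] for v in sample1)
    expected = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return t, expected, sigma, (t - expected) / sigma


t, expected, sigma, z = rank_sum_test(new_drug, aspirin)
print(f"T = {t}, E(T) = {expected}, sigma_T = {sigma:.1f}, z = {z:.2f}")
```

Run on the Table 2 ratings, this reproduces the values in the notes: T = 276.5, E(T) = 232.5, a standard deviation of about 24.1, and z of about 1.83, which can then be compared with the 5% critical value of 1.645.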