252meanx2 11/04/07 (Open this document in Outline view)

D. COMPARISON OF TWO SAMPLES (CTD)

5. Rank Tests

Especially in the case where samples are small and the underlying distributions are not normal, it is not appropriate to compare means.

a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples

If samples are independent, this test is appropriate to test whether the two samples come from the same distribution. If the distributions are similar, it is often called a test of equality of medians.

Example: Let us assume that we have two very small samples from New York ($n_2 = 6$) and Pennsylvania ($n_1 = 4$), and we wish to compare their medians. Let us call the smaller sample (Pennsylvania) sample 1 and the larger sample sample 2, so that $n_1 \le n_2$. If we use $\eta$ for the median, our hypotheses are $H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 \ne \eta_2$, with $\alpha = .05$. Assume that our data is as below.

  Pennsylvania:  11000  16000  80000  85000
  New York:      17000  30000  50000  70000  80000  90000

Our first step is to rank the numbers from 1 to $n = n_1 + n_2 = 4 + 6 = 10$ (note that the 7th and 8th numbers are tied, so that both are numbered 7.5). These can be ordered from the largest to the smallest or from the smallest to the largest. To decide which to do, look at the smaller sample. If the smallest number is in the smaller sample, order from smallest to largest; if the largest number is in the smaller sample, order from the largest to the smallest. Since 11000 is the smallest number, let that be 1.

  x1 (Pennsylvania)   r1      x2 (New York)   r2
  11000               1       17000           3
  16000               2       30000           4
  80000               7.5     60000           5
  85000               9       70000           6
                              80000           7.5
                              90000           10
  Sum                 19.5    Sum             35.5

Now compute the sums of the ranks, $SR_1 = 19.5$ and $SR_2 = 35.5$. As a check, note that these two rank sums must add to the sum of the first $n$ numbers, that this is $\frac{n(n+1)}{2} = \frac{10(11)}{2} = 55$, and that $SR_1 + SR_2 = 19.5 + 35.5 = 55$.

The smaller of $SR_1$ and $SR_2$ is called $W$ and is compared with Table 5 or 6 (Wpval, WCV). To use Table 5, first find the part for $n_2 = 6$ and then the column for $n_1 = 4$. Then try to locate $W = 19.5$ in that column. In this
case, since for $W = 19$ the p-value is .3048 and for $W = 20$ the p-value is .3810, we can say that $.3048 < \text{p-value} < .3810$. Since both are above the significance level, we cannot reject the null hypothesis. This can also be compared against the critical values for $T_L$ and $T_U$ in table 6b; these are 13 and 31. Since $W = 19.5$ is between these values, we cannot reject the null hypothesis.

For values of $n_1$ and $n_2$ that are too large for the tables, $W$ has approximately the normal distribution with mean $\mu_W = \frac{1}{2} n_1 (n_1 + n_2 + 1)$ and variance $\sigma_W^2 = \frac{1}{6} n_2 \mu_W$. Though the example above is too small for this treatment, for continuity its data will be used here. If the significance level is 5% and the test is one-sided, we reject our null hypothesis if $z = \frac{W - \mu_W}{\sigma_W}$ lies below $-z_{.05} = -1.645$. In this case, then, $\mu_W = \frac{1}{2} n_1 (n_1 + n_2 + 1) = \frac{1}{2}(4)(4 + 6 + 1) = 22$ and $\sigma_W^2 = \frac{1}{6} n_2 \mu_W = \frac{1}{6}(6)(22) = 22$, so that $z = \frac{W - \mu_W}{\sigma_W} = \frac{19.5 - 22}{\sqrt{22}} = -0.53$. Since this is not below $-1.645$, we cannot reject $H_0$.

b. Wilcoxon Signed Rank Test for Paired Samples

This is a test for equality of medians when the data is paired. It can also be used for the median of a single sample. The Sign Test for paired data is a simpler test to use in this situation, but it is less powerful. As in many tests for measures of central tendency with paired data, the original numbers are discarded and the differences between the pairs are used. If there are $n$ pairs, these are ranked according to absolute value from 1 to $n$, either top to bottom or bottom to top. After replacing tied absolute values with their average rank, each rank is marked with a $+$ or $-$ sign and two rank sums are taken, $T^+$ and $T^-$. The smaller of these is compared with Table 7.

Example: We wish to compare sales of a product before and after an advertisement appeared in a nationally televised football game. Sales in a sample of eight stores before the game are $x_1$ and sales after are $x_2$. Define $d = x_2 - x_1$ as the improvement in sales. Though the appropriate test here would be one-sided, a two-sided test is demonstrated here instead: $H_0: \eta_1 = \eta_2$, $H_1: \eta_1 \ne \eta_2$, $n = 8$ and $\alpha = .05$. The data
are below. The column $|d|$ is the absolute value of $d$, the column $r$ ranks absolute values, and the column $r^*$ is the ranks corrected for ties and marked with the signs on the differences.

  x1      x2      d = x2 - x1    |d|     r    r*
  7600    8600    +1000          1000    8    +8
  8700    8900    +200           200     2    +2.5
  9600    9400    -200           200     3    -2.5
  8400    8700    +300           300     4    +4
  7600    8100    +500           500     6    +6
  6900    7500    +600           600     7    +7
  7300    7700    +400           400     5    +5
  8200    8100    -100           100     1    -1

If we add together the numbers in $r^*$ with a $+$ sign, we get $T^+ = 32.5$. If we do the same for numbers with a $-$ sign, we get $T^- = 3.5$. To check this, note that these two numbers must sum to the sum of the first $n$ numbers, that this is $\frac{n(n+1)}{2} = \frac{8(9)}{2} = 36$, and that $T^+ + T^- = 32.5 + 3.5 = 36$.

We check 3.5, the smaller of the two rank sums, against the numbers in table 7 (wsignedr). For a two-sided 5% test we use the $\alpha = .025$ column. For $n = 8$ the critical value is 4, and we reject the null hypothesis only if our test statistic is below this critical value. Since our test statistic is 3.5, we reject the null hypothesis.

For values of $n$ that are too large for the table, $T_L$, the smaller of $T^+$ and $T^-$, has approximately the normal distribution with mean $\mu_T = \frac{1}{4} n (n + 1)$ and variance $\sigma_T^2 = \frac{1}{6}(2n + 1)\mu_T$. Though the example above is too small for this treatment, for continuity its data will be used here. If the significance level is 5% and the test is two-sided, we reject our null hypothesis if $z = \frac{T_L - \mu_T}{\sigma_T}$ does not lie between $\pm z_{\alpha/2} = \pm z_{.025} = \pm 1.960$. In this case, then, $\mu_T = \frac{1}{4} n (n + 1) = \frac{1}{4}(8)(8 + 1) = 18$ and $\sigma_T^2$ …
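The rank sums in the two worked examples above can be checked with a short script. What follows is only a sketch in Python, not part of the original notes: the function names (`average_ranks`, `rank_sum_test`, `signed_rank_test`) are hypothetical, ranking is always from smallest to largest (which matches both examples), and the signed rank function assumes no pair has a zero difference. Table lookups are not reproduced; only the statistics are computed.

```python
from math import sqrt


def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(sample1, sample2):
    """Return (SR1, SR2, W, z); sample1 is assumed to be the smaller sample."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    sr1, sr2 = sum(ranks[:n1]), sum(ranks[n1:])
    w = min(sr1, sr2)
    mu_w = 0.5 * n1 * (n1 + n2 + 1)  # mean of the smaller sample's rank sum
    var_w = n2 * mu_w / 6
    # As in the notes, z is computed from W; here W equals SR1.
    z = (w - mu_w) / sqrt(var_w)
    return sr1, sr2, w, z


def signed_rank_test(x1, x2):
    """Return (T+, T-, smaller of the two) for paired data, d = x2 - x1."""
    d = [b - a for a, b in zip(x1, x2)]
    ranks = average_ranks([abs(v) for v in d])
    t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    t_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return t_plus, t_minus, min(t_plus, t_minus)


# Example a. The third New York value is listed as 50000 in the data and
# 60000 in the ranking table; its rank is 5 either way.
pennsylvania = [11000, 16000, 80000, 85000]
new_york = [17000, 30000, 60000, 70000, 80000, 90000]
sr1, sr2, w, z = rank_sum_test(pennsylvania, new_york)
print(sr1, sr2, w, round(z, 2))  # 19.5 35.5 19.5 -0.53

# Example b: sales before (x1) and after (x2) in eight stores.
before = [7600, 8700, 9600, 8400, 7600, 6900, 7300, 8200]
after = [8600, 8900, 9400, 8700, 8100, 7500, 7700, 8100]
print(signed_rank_test(before, after))  # (32.5, 3.5, 3.5)
```

The printed rank sums agree with the hand calculations: 19.5 and 35.5 add to 55, and 32.5 and 3.5 add to 36.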