3/06/95 conf

CONFIDENCE INTERVALS AND HYPOTHESIS TESTING FOR VARIANCES

1. A Confidence Interval for the Population Variance

The chi-squared ($\chi^2$) distribution refers to the distribution of a sum of $z^2$'s, that is, sums like

$$\chi^2 = z_1^2 + z_2^2 + z_3^2 + \cdots + z_n^2,$$

where $z$ is a $N(0,1)$ variable, that is, it is normal with a mean of zero and a standard deviation of one. The chi-squared distribution is tabulated according to degrees of freedom (DF) and has mean DF and variance 2DF. For example, if a chi-squared statistic has seven degrees of freedom, its mean is 7 and its variance is 14.

To find a confidence interval for a population variance, we must first compute the sample variance

$$s^2 = \frac{\sum (x-\bar{x})^2}{n-1} = \frac{\sum x^2 - n\bar{x}^2}{n-1},$$

where $n$ is the size of the sample. If $x$ is normally distributed, $z = \frac{x-\mu}{\sigma}$ will have the standardized normal distribution with a mean of zero and a standard deviation of one, $z \sim N(0,1)$. The sum of these $z$'s squared will be a chi-squared. When the sample mean $\bar{x}$ is used in place of $\mu$, one degree of freedom is lost, so that

$$\sum \frac{(x-\bar{x})^2}{\sigma^2} = \frac{(n-1)s^2}{\sigma^2}$$

has a $\chi^2$ distribution with $n-1$ degrees of freedom.

If we are looking for a 95% confidence interval for a variance, we can observe that, since the above ratio has a $\chi^2$ distribution, there must be two numbers, $\chi^2_{.975}$ and $\chi^2_{.025}$, that together cut off an interval about the mean that contains 95% of the probability. We can indicate this by saying

$$P\left[\chi^2_{.975} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{.025}\right] = .95.$$

But if the expression in brackets is true 95% of the time, so is its inverse,

$$\frac{1}{\chi^2_{.025}} \le \frac{\sigma^2}{(n-1)s^2} \le \frac{1}{\chi^2_{.975}}.$$

And if this is true, it is also true after being multiplied through by $(n-1)s^2$. So we get

$$\frac{(n-1)s^2}{\chi^2_{.025}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{.975}}$$

as an interval for the variance or, more generally, for confidence level $1-\alpha$,

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}.$$

For example, if $n = 31$ and $s = 1000$, our degrees of freedom are $n-1 = 30$, so that if the confidence level is 95%, we look up $\chi^2_{.025}(30) = 46.979$ and $\chi^2_{.975}(30) = 16.791$. Substituting these into the formula, we get

$$\frac{30(1000)^2}{46.979} \le \sigma^2 \le \frac{30(1000)^2}{16.791},$$

which can be reduced to $0.6386(1000)^2 \le \sigma^2 \le 1.7867(1000)^2$. If we want a confidence interval for the standard deviation instead, we can take the square root of both sides of the equation and say $0.799(1000) \le \sigma \le 1.337(1000)$, or $799 \le \sigma \le 1337$.

2. Finding values for $\chi^2$

On most tables of the chi-squared distribution, the right-hand critical values of $\chi^2$ appear alone. For example, for $\alpha = .025$ and 5 degrees of freedom, the appropriate critical value of chi-squared is 12.8. This means that any value of $\chi^2$ above 12.8 is in the right-hand tail of the distribution, where only 2.5% of the probability lies. A value of $\chi^2$ above 12.8 would cause us to reject a null hypothesis for a 2.5% one-sided test or a 5% two-sided test. Unfortunately, many chi-squared tables give only these upper critical values and leave out lower critical values like $\chi^2_{.975}$, so a table with both upper and lower critical values is included with this document. However, this table does not show all values of $\chi^2$ above 30 degrees of freedom.

For more than 30 degrees of freedom, the normal distribution is used to approximate the chi-squared distribution, most commonly by setting

$$z = \sqrt{2\chi^2} - \sqrt{2\,DF - 1},$$

where $DF = n-1$. For example, if $s^2 = 1100$, $\sigma^2 = 1000$ and $n = 101$, then

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{100(1100)}{1000} = 110.$$

If this value of $\chi^2$ is substituted into the above formula for $z$, we get

$$z = \sqrt{2(110)} - \sqrt{2(100)-1} = \sqrt{220} - \sqrt{199} = 14.8324 - 14.1067 = 0.7257.$$

An example of the use of this appears under Hypothesis Testing for Variances - One Sample.

This formula is also substituted into the confidence interval formula to give the following approximate formula for a confidence interval for a standard deviation:

$$\frac{s\sqrt{2\,DF}}{z_{\alpha/2} + \sqrt{2\,DF}} \le \sigma \le \frac{s\sqrt{2\,DF}}{-z_{\alpha/2} + \sqrt{2\,DF}},$$

which could also be written as

$$\sigma = \frac{s\sqrt{2\,DF}}{\pm z_{\alpha/2} + \sqrt{2\,DF}}.$$

For example, if $n = 400$, $s = 1000$ and $\alpha = .05$, then $\sqrt{2\,DF} = \sqrt{2(400-1)} = \sqrt{798} = 28.25$, so

$$\sigma = \frac{1000(28.25)}{28.25 \pm 1.96},$$

or $935 \le \sigma \le 1075$. If a confidence interval for the variance is required, the numbers on both sides of the interval can be squared. This interval should only be used when the degrees of freedom are above those available on the chi-squared table.

3. Hypothesis Testing for Variances - One Sample

Suppose that we want to test the statement that the variance of a given population is equal to some given variance, which we can call $\sigma_0^2$. That is, our null hypothesis is $H_0: \sigma^2 = \sigma_0^2$ and our alternative hypothesis is $H_1: \sigma^2 \ne \sigma_0^2$. If the underlying data is normally distributed, we use the test ratio

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2},$$

which has the chi-squared distribution with $n-1$ degrees of freedom, often written $\chi^2(n-1)$. For a two-tailed test we would pick two values of chi-squared, $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$, and accept the null hypothesis if $\chi^2$ lies between them.

For example, assume that we believe that the distribution of the ages of a group of workers is normal, and we wish to test our belief that the variance is 64. Our data is a sample of 17 workers, and our computations give us a sample variance of 100. Let us set our significance level at 2% and state our problem as follows:

$$H_0: \sigma^2 = 64, \quad H_1: \sigma^2 \ne 64, \quad n = 17, \quad DF = n-1 = 16, \quad s^2 = 100, \quad \sigma_0^2 = 64, \quad \alpha = .02.$$

We can compute

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{16(100)}{64} = 25.$$

Since $\alpha/2 = .01$ and $DF = 16$, we go to the chi-squared table to find $\chi^2_{.99}(16) = 5.812$ and $\chi^2_{.01}(16) = 32.000$. The "accept" region is between these two values, and 25 lies inside it, so we cannot reject the null hypothesis.

As a second example, assume that we are testing the same null hypothesis but the sample size is 73, so that chi-squared has 72 degrees of freedom. Thus we have the following:

$$n = 73, \quad DF = n-1 = 72, \quad s^2 = 100, \quad \sigma_0^2 = 64, \quad \alpha = .02.$$

The formula for the chi-squared statistic gives us

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{72(100)}{64} = 112.5.$$

If we cannot find an appropriate $\chi^2$ on our table because of the high number of degrees of freedom, we use the $z$ formula suggested in section 2. This is

$$z = \sqrt{2\chi^2} - \sqrt{2\,DF - 1} = \sqrt{2(112.5)} - \sqrt{2(72)-1} = \sqrt{225} - \sqrt{143} = 15.00 - 11.96 = 3.04.$$

Since this statistic is N …
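The arithmetic in the worked examples of sections 1 to 3 can be checked with a short script. What follows is only a minimal sketch, not part of the original handout: the function names are hypothetical, and the chi-squared critical values are passed in by hand from a table (as the text does) rather than computed, so only the Python standard library is needed.

```python
import math


def variance_ci(n, s2, chi2_upper, chi2_lower):
    """Interval (n-1)s^2/chi2_upper <= sigma^2 <= (n-1)s^2/chi2_lower.

    chi2_upper is the table value chi2_{alpha/2} and chi2_lower is
    chi2_{1-alpha/2}; both are assumed to be looked up by the caller.
    """
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower


def chi2_statistic(n, s2, sigma0_sq):
    """Test ratio (n-1)s^2 / sigma_0^2 from section 3."""
    return (n - 1) * s2 / sigma0_sq


def z_from_chi2(chi2, df):
    """Normal approximation z = sqrt(2*chi2) - sqrt(2*DF - 1) from section 2."""
    return math.sqrt(2 * chi2) - math.sqrt(2 * df - 1)


def sd_ci_large_df(s, df, z):
    """Approximate interval for sigma: s*sqrt(2DF) / (+-z + sqrt(2DF))."""
    root = math.sqrt(2 * df)
    return s * root / (root + z), s * root / (root - z)


if __name__ == "__main__":
    # Section 1 example: n = 31, s = 1000, table values for 30 DF.
    low, high = variance_ci(31, 1000 ** 2, 46.979, 16.791)
    print("variance interval:", low, high)
    print("std. dev. interval:", math.sqrt(low), math.sqrt(high))

    # Section 2 examples: z approximation and large-DF interval.
    print("z for chi2 = 110, DF = 100:", z_from_chi2(110, 100))
    print("sigma interval, n = 400:", sd_ci_large_df(1000, 399, 1.96))

    # Section 3 examples: test statistics for n = 17 and n = 73.
    print("chi2, n = 17:", chi2_statistic(17, 100, 64))
    chi2_73 = chi2_statistic(73, 100, 64)
    print("chi2, n = 73:", chi2_73, "z:", z_from_chi2(chi2_73, 72))
```

Taking the critical values as arguments keeps the sketch faithful to the table-lookup procedure the text describes. A statistics library could supply the quantiles instead, but that is a separate choice and is not assumed here.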