
Goodness of Fit Tests
Bios 662
Michael G. Hudgens, [email protected]
www.bios.unc.edu/~mhudgens
2008-10-01 11:05

BIOS 662 GOF


Assessing Fit

• Graphical displays such as qqplot
• Tests
  – $\chi^2$
  – Kolmogorov-Smirnov one-sample (page 279 text)
  – Others


KS GOF Test

• Kolmogorov-Smirnov GOF test (one-sample test)
• We want to test that our data come from a known and completely specified distribution: $F_0(y)$
• The empirical distribution function (EDF) for a given data set is

$$
F_n(y) =
\begin{cases}
0 & \text{if } y < y_{(1)} \\
k/n & \text{if } y_{(k)} \le y < y_{(k+1)} \\
1 & \text{if } y \ge y_{(n)}
\end{cases}
$$

  Note: the text calls this the empirical cumulative distribution (ECD); see page 32
• $H_0$: $Y_1, \ldots, Y_n \sim F_0(y)$
• The KS statistic for GOF is

$$D = \max_y |F_0(y) - F_n(y)|$$

• The exact and asymptotic distributions of $D$ have been derived and tabulated
• The critical values below are appropriate for continuous $F_0(y)$
• Critical values for the KS one-sample test

    n       0.05       0.01
    10      .409       .489
    15      .338       .404
    16      .327       .392
    17      .318       .381
    18      .309       .371
    19      .301       .363
    20      .294       .352
    25      .264       .317
    30      .242       .290
    35      .224       .269
    > 35    1.36/ζ     1.63/ζ

  where $\zeta = (n + \sqrt{n/10})^{1/2}$. Source: Conover, Practical Nonparametric Statistics, 1980, page 462.
• Equivalently, $D = \max\{D_1, \ldots, D_n\}$, where

$$D_i \equiv \max\{\, i/n - z_{(i)},\; z_{(i)} - (i-1)/n \,\} \quad\text{and}\quad z_{(i)} = F_0(y_{(i)})$$


KS GOF: Example

• A random sample of size 10

    y1   0.621      y6   0.581
    y2   0.503      y7   0.329
    y3   0.203      y8   0.480
    y4   0.477      y9   0.554
    y5   0.710      y10  0.382

• It is hypothesized that the distribution of these samples is $U(0, 1)$

$$
F_0(y) =
\begin{cases}
0 & \text{if } y < 0 \\
y & \text{if } 0 \le y \le 1 \\
1 & \text{if } 1 < y
\end{cases}
$$

• $n = 10$
• $C_{.05} = \{D > 0.409\}$
• The table below shows $D = 0.290$; thus we do not reject $H_0$

    Order stat.   F0(y(i))   i/n    (i-1)/n   Di
    y(1)          0.203      .1     0         .203
    y(2)          0.329      .2     .1        .229
    y(3)          0.382      .3     .2        .182
    y(4)          0.477      .4     .3        .177
    y(5)          0.480      .5     .4        .080
    y(6)          0.503      .6     .5        .097
    y(7)          0.554      .7     .6        .146
    y(8)          0.581      .8     .7        .219
    y(9)          0.621      .9     .8        .279
    y(10)         0.710      1      .9        .290

[Figure: EDF of the sample, Fn(x), plotted against the Uniform(0, 1) CDF; x-axis: x, y-axis: Fn(x)]


KS GOF

• The KS test requires that the parameters of $F_0(y)$ are known
• If they are estimated from the data, the distribution of $D$ is not as in the table above
• Critical values of the KS statistic for testing normality when $\mu$ and $\sigma^2$ are estimated are given by Lilliefors (JASA 1967, p399)


Lilliefors KS GOF Test

• Critical values for the KS test of normality

    n       0.05        0.01
    10      .258        .294
    15      .220        .257
    16      .213        .250
    17      .206        .245
    18      .200        .239
    19      .195        .235
    20      .190        .231
    25      .173        .200
    30      .161        .187
    > 30    .886/√n     1.031/√n

• Source: Conover, Practical Nonparametric Statistics, 1980, page 463.


KS GOF: Example

• A random sample of size 10

    y1   0.621      y6   0.581
    y2   0.503      y7   0.329
    y3   0.203      y8   0.480
    y4   0.477      y9   0.554
    y5   1.160      y10  0.382

• It is hypothesized that the distribution of these samples is normal
• $\hat\mu = \bar{y} = 0.529$ and $\hat\sigma = s = 0.2546501$
• $C_{.05} = \{D > 0.258\}$
• For these data $D = 0.259$; $p \approx 0.05$

    Order stat.   F0(y(i))   i/n    (i-1)/n   Di
    y(1)          0.100      .1     0         .100
    y(2)          0.216      .2     .1        .116
    y(3)          0.282      .3     .2        .082
    y(4)          0.419      .4     .3        .119
    y(5)          0.424      .5     .4        .076
    y(6)          0.459      .6     .5        .141
    y(7)          0.539      .7     .6        .161
    y(8)          0.581      .8     .7        .219
    y(9)          0.641      .9     .8        .259
    y(10)         0.993      1      .9        .093

[Figure: EDF of the sample, Fn(x), plotted against the fitted normal CDF; x-axis: x, y-axis: Fn(x)]


KS GOF: SAS

• SAS: use Proc Univariate with the NORMAL option or the HISTOGRAM statement

    proc univariate normal; var x; run;

    Tests for Normality

    Test                  --Statistic---     -----p Value------
    Shapiro-Wilk          W      0.835123    Pr < W      0.0386
    Kolmogorov-Smirnov    D      0.258945    Pr > D      0.0560
    Cramer-von Mises      W-Sq   0.116363    Pr > W-Sq   0.0587
    Anderson-Darling      A-Sq   0.710057    Pr > A-Sq   0.0444


KS GOF: R

• R: ks.test(); however, beware of ties:

    > ks.test(rnorm(100000,0,1),"pnorm",0,1)

            One-sample Kolmogorov-Smirnov test

    data:  rnorm(1e+05, 0, 1)
    D = 0.0021, p-value = 0.783
    alternative hypothesis: two.sided

    > ks.test(rpois(100000,3),"ppois",3)

            One-sample Kolmogorov-Smirnov test

    data:  rpois(1e+05, 3)
    D = 0.2237, p-value < 2.2e-16
    alternative hypothesis: two.sided

    Warning message:
    cannot compute correct p-values with ties in: ks.test(rpois(1e+05, 3), "ppois", 3)

• Lilliefors
  – SAS: automatic
  – R: use the "nortest" package

    > ks.test(x,"pnorm",mean(x),sd(x))

            One-sample Kolmogorov-Smirnov test

    data:  x
    D = 0.2589, p-value = 0.4402
    alternative hypothesis: two-sided

    > install.packages("nortest")
    > library(nortest)
    > lillie.test(x)

            Lilliefors (Kolmogorov-Smirnov) normality test

    data:  x
    D = 0.2589, p-value = 0.05602


KS vs $\chi^2$ GOF Tests

• If the data are continuous, KS is preferred. Why?
  – If the sample size is small, KS is exact, while $\chi^2$ relies on a large-sample approximation
  – The KS test is more powerful than $\chi^2$ in most situations (Conover, Practical Nonparametric Statistics, 1980, p 346)
  – No need to bin the data
• If the data are discrete/categorical, $\chi^2$ is preferred


Other GoF Tests

• Shapiro-Wilk: see Conover page 363, Tables A.17, A.18

$$\frac{\left[\sum_{i=1}^{k} a_i \left(X_{(n-i+1)} - X_{(i)}\right)\right]^2}{s^2}$$

  where $s^2$ is the sample variance and the $a_i$ are given
• Under the null (i.e., normality), the numerator and denominator both estimate (up to a constant) $\sigma^2$
• R: shapiro.test()
• Class of GoF test statistics

$$n \int \{F_n(y) - F_0(y)\}^2 \, \psi(y) \, dy$$

• Anderson-Darling: $\psi(y) = \{F_0(y)(1 - F_0(y))\}^{-1}$
• Cramer-von Mises: $\psi(y) = 1$
• R nortest package: ad.test(), cvm.test()
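
The computation D = max{D1, ..., Dn} used in the two worked examples can be checked by hand or in a few lines of code. The following is a minimal sketch in Python (standard library only) rather than the R or SAS shown in the slides; the helper names `ks_statistic`, `uniform_cdf`, and `normal_cdf` are hypothetical and introduced here only for illustration. It assumes a continuous hypothesized CDF and returns only the statistic, not a p-value.

```python
import math
import statistics


def ks_statistic(sample, cdf):
    """One-sample KS statistic D = max_i max{i/n - z_(i), z_(i) - (i-1)/n},
    where z_(i) = cdf(y_(i)) and y_(1) <= ... <= y_(n) are the order statistics."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, y in enumerate(ordered, start=1):
        z = cdf(y)
        d = max(d, i / n - z, z - (i - 1) / n)
    return d


def uniform_cdf(y):
    """CDF of U(0, 1)."""
    return min(max(y, 0.0), 1.0)


def normal_cdf(y, mu, sigma):
    """CDF of N(mu, sigma^2), written with the error function."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


# First example from the slides: H0 is U(0, 1), fully specified.
uniform_sample = [0.621, 0.503, 0.203, 0.477, 0.710,
                  0.581, 0.329, 0.480, 0.554, 0.382]
d_uniform = ks_statistic(uniform_sample, uniform_cdf)

# Second example: H0 is normal with mean and SD estimated from the data,
# so D must be compared with the Lilliefors critical values, not the KS table.
normal_sample = [0.621, 0.503, 0.203, 0.477, 1.160,
                 0.581, 0.329, 0.480, 0.554, 0.382]
mu_hat = statistics.mean(normal_sample)
sigma_hat = statistics.stdev(normal_sample)  # sample SD (n - 1 denominator)
d_normal = ks_statistic(normal_sample,
                        lambda y: normal_cdf(y, mu_hat, sigma_hat))

print(f"Uniform example: D = {d_uniform:.3f}")  # slides report 0.290
print(f"Normal example:  D = {d_normal:.3f}")   # slides report 0.259
```

Comparing `d_uniform` with the tabulated 0.409 and `d_normal` with the Lilliefors value 0.258 reproduces the two decisions stated in the examples.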



UNC-Chapel Hill BIOS 662 - Goodness of Fit Tests
