UI STAT 5400 - Bootstrap

22S:166
More on the Bootstrap
Lecture 9
September 24, 2008
Kate Cowles
374 SH, [email protected]

The number of bootstrap datasets

• approximately 1000 to 2000 is the minimum for reasonable performance in most cases
• choosing R = 999 or 1999 facilitates calculation of percentile confidence intervals (see below)

Another version of the function for calculating the statistic for the city data

    > meanratio
    function(df, indices)
    {
    # df must be a data frame with two columns, "x" and "u"
    mean( df[ indices, "x" ]) / mean( df[ indices, "u" ] )
    }

Running the bootstrap with different settings of R

    > library(boot)
    > data(city)
    >
    > boot.out <- boot( city, meanratio, R=999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 999)

    Bootstrap Statistics :
        original      bias    std. error
    t1* 1.520312  0.04122696   0.2168435

    > boot.out <- boot( city, meanratio, R=999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 999)

    Bootstrap Statistics :
        original      bias    std. error
    t1* 1.520312  0.04515005   0.2256023

    > boot.out <- boot( city, meanratio, R=999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 999)

    Bootstrap Statistics :
        original      bias    std. error
    t1* 1.520312  0.03724419   0.2098392

    > boot.out <- boot( city, meanratio, R=1999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 1999)

    Bootstrap Statistics :
        original      bias    std. error
    t1* 1.520312  0.04460204   0.2267071

    > boot.out <- boot( city, meanratio, R=1999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 1999)

    Bootstrap Statistics :
        original      bias    std. error
    t1* 1.520312  0.02751536   0.2116137
    >

Interpreting the boot object

    > names(boot.out)
     [1] "t0"        "t"         "R"         "data"      "seed"      "statistic"
     [7] "sim"       "call"      "stype"     "strata"    "weights"
    > boot.out$t0                       # thetahat from original data
    [1] 1.520312
    > hist(boot.out$t)                  # histogram of thetastars from bootstrap samples
    > abline(v=boot.out$t0,lty=3)       # add vertical line at thetahat
    > mean(boot.out$t)
    [1] 1.547828
    > mean(boot.out$t) - boot.out$t0    # bootstrap estimate of bias
    [1] 0.02751536
    > bbias <- mean(boot.out$t) - boot.out$t0
    > boot.out$t0 - bbias               # bootstrap "unbiased" estimate
    [1] 1.492797
    > sd(boot.out$t)                    # bootstrap standard error
    [1] 0.2116137

More on bootstrap confidence intervals

    > boot.ci.out <- boot.ci(boot.out)
    Warning message:
    In boot.ci(boot.out) : bootstrap variances needed for studentized intervals
    > boot.ci.out
    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1999 bootstrap replicates

    CALL :
    boot.ci(boot.out = boot.out)

    Intervals :
    Level      Normal              Basic
    95%   ( 1.078,  1.908 )   ( 0.973,  1.796 )

    Level     Percentile            BCa
    95%   ( 1.245,  2.068 )   ( 1.258,  2.121 )
    Calculations and Intervals on Original Scale

Bootstrap confidence intervals continued

• Basic interval
  – if there were a function of the population quantity we're interested in, $\theta$, and the estimator $\hat\Theta$ whose distribution was known, we could use the quantiles of this distribution to construct a c.i. for $\theta$
  – since we don't have this, arbitrarily consider $W = (\hat\Theta - \theta)$
  – if we knew the distribution of $W$, then a two-sided level $100 \times (1 - \alpha)$ c.i. would be $(\hat\theta - w_{1-\alpha/2},\ \hat\theta - w_{\alpha/2})$
  – bootstrap idea: use the distribution of $W^* = (\hat\Theta^* - \hat\theta)$ as an approximation to the distribution of $W$
  – in R:

        > w <- sort(boot.out$t) - boot.out$t0
        > hist(w)
        > boot.out$t0 - w[c(1950,50)]
        [1] 0.9727399 1.7959120
        > boot.out$t0 - quantile(w, c(.975,.025))

  – pros: may work well for medians
  – cons: bootstrap error (distribution of $W^*$ being a poor approximation to the distribution of $W$) often is large

• Percentile interval of level $(1 - \alpha)$
  – lower endpoint is the $\frac{\alpha}{2}(R+1)$ entry in the ordered bootstrap statistics
  – upper endpoint is the $\left(1 - \frac{\alpha}{2}\right)(R+1)$ entry
  – in R:

        > sort(boot.out$t)[c(50,1950)]
        [1] 1.244713 2.067885

  – pros: simplicity
  – cons: may be very inaccurate if the distribution of $\hat\theta$ is not close to symmetric

Bias-correcting bootstrap percentile confidence intervals

• recall: $\widehat{CDF}(q) = Pr^*(\hat\theta^* \le q) = \frac{\#\{\hat\theta^*_b \le q\}}{B}$
• if $\widehat{CDF}(\hat\theta) \ne .5$, then a bias correction to the percentile method c.i. may be in order
• let $z_0 = \Phi^{-1}(\widehat{CDF}(\hat\theta))$
  – what Splus/R function evaluates $\Phi^{-1}$?
• then the bias-corrected $1 - \alpha$ c.i. is $\left(\widehat{CDF}^{-1}(\Phi(2 z_0 - z_{\alpha/2})),\ \widehat{CDF}^{-1}(\Phi(2 z_0 + z_{\alpha/2}))\right)$
  – here $z_{\alpha/2}$ is the upper $\alpha/2$ point of the standard normal: $\Phi(z_{\alpha/2}) = 1 - \alpha/2$

Bootstrap confidence intervals continued

• bias-corrected
• bias-corrected accelerated (BCa)
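
As a rough illustration of the bias-corrected percentile interval described above, the following sketch computes it by hand for the city data. It assumes the boot package is installed. The seed and the variable names (thetahat, thetastar, z0, p, bc.ci) are choices made only for this example. quantile() with its default interpolation stands in for the inverse of the empirical CDF, so the endpoints may differ slightly from an ordered-entry rule. qnorm() and pnorm() are the base R functions for $\Phi^{-1}$ and $\Phi$.

```r
library(boot)   # provides boot() and the city data set used in the lecture

# Statistic from the lecture: ratio of the resampled means of x and u
meanratio <- function(df, indices) {
  mean(df[indices, "x"]) / mean(df[indices, "u"])
}

set.seed(1)     # arbitrary seed (an assumption), only for reproducibility
boot.out <- boot(city, meanratio, R = 1999)

thetahat  <- boot.out$t0              # estimate from the original data
thetastar <- as.vector(boot.out$t)    # bootstrap replicates
alpha     <- 0.05

# z0 = Phi^{-1}( CDF-hat(thetahat) ): fraction of replicates <= thetahat
z0 <- qnorm(mean(thetastar <= thetahat))

# z_{alpha/2} is the upper alpha/2 point: Phi(z) = 1 - alpha/2
z <- qnorm(1 - alpha / 2)

# Adjusted percentile levels Phi(2*z0 - z) and Phi(2*z0 + z)
p <- pnorm(2 * z0 + c(-z, z))

# CDF-hat^{-1}: empirical quantiles of the bootstrap statistics
bc.ci <- quantile(thetastar, probs = p, names = FALSE)
bc.ci
```

Because the resampling is random, the endpoints change from run to run unless the seed is fixed. They can be compared with the BCa interval reported by boot.ci(), which additionally applies an acceleration adjustment.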

