UI STAT 5400 - Bootstrap - D1101127

School name University of Iowa

Course Stat 5400- Computing in Statistics

22S:166
More on the Bootstrap
Lecture 9
September 24, 2008
Kate Cowles
374 SH, [email protected]

The number of bootstrap datasets

• approximately 1000 to 2000 is the minimum for reasonable performance in most cases
• choosing R = 999 or 1999 facilitates calculation of percentile confidence intervals (see below)

Another version of the function for calculating the statistic for the city data

    > meanratio
    function(df, indices)
    {
    # df must be a data frame with two columns, "x" and "u"
    mean( df[ indices, "x" ]) / mean( df[ indices, "u" ] )
    }

Running the bootstrap with different settings of R

    > library(boot)
    > data(city)
    >
    > boot.out <- boot( city, meanratio, R=999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 999)

    Bootstrap Statistics :
        original     bias    std. error
    t1* 1.520312 0.04122696   0.2168435

    > boot.out <- boot( city, meanratio, R=999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 999)

    Bootstrap Statistics :
        original     bias    std. error
    t1* 1.520312 0.04515005   0.2256023

    > boot.out <- boot( city, meanratio, R=999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 999)

    Bootstrap Statistics :
        original     bias    std. error
    t1* 1.520312 0.03724419   0.2098392

    > boot.out <- boot( city, meanratio, R=1999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 1999)

    Bootstrap Statistics :
        original     bias    std. error
    t1* 1.520312 0.04460204   0.2267071

    > boot.out <- boot( city, meanratio, R=1999)
    > boot.out

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = city, statistic = meanratio, R = 1999)

    Bootstrap Statistics :
        original     bias    std. error
    t1* 1.520312 0.02751536   0.2116137
    >

Interpreting the boot object

    > names(boot.out)
     [1] "t0"        "t"         "R"         "data"      "seed"      "statistic"
     [7] "sim"       "call"      "stype"     "strata"    "weights"
    > boot.out$t0 # thetahat from original data
    [1] 1.520312
    > hist(boot.out$t) # histogram of thetastars from bootstrap samples
    > abline(v=boot.out$t0,lty=3) # add vertical line at thetahat
    > mean(boot.out$t)
    [1] 1.547828
    > mean(boot.out$t) - boot.out$t0 # bootstrap estimate of bias
    [1] 0.02751536
    > bbias <- mean(boot.out$t) - boot.out$t0
    > boot.out$t0 - bbias # bootstrap "unbiased" estimate
    [1] 1.492797
    > sd(boot.out$t) # bootstrap standard error
    [1] 0.2116137

More on bootstrap confidence intervals

    > boot.ci.out <- boot.ci(boot.out)
    Warning message:
    In boot.ci(boot.out) : bootstrap variances needed for studentized intervals
    > boot.ci.out
    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1999 bootstrap replicates

    CALL :
    boot.ci(boot.out = boot.out)

    Intervals :
    Level      Normal              Basic
    95%   ( 1.078,  1.908 )   ( 0.973,  1.796 )

    Level     Percentile            BCa
    95%   ( 1.245,  2.068 )   ( 1.258,  2.121 )
    Calculations and Intervals on Original Scale

Bootstrap confidence intervals continued

• Basic interval
  – if there were a function of the population quantity we're interested in, $\theta$, and the estimator $\hat{\Theta}$ whose distribution was known, we could use the quantiles of this distribution to construct a c.i. for $\theta$
  – since we don't have this, arbitrarily consider $W = \hat{\Theta} - \theta$
  – if we knew the distribution of $W$, then a two-sided level $100 \times (1 - \alpha)$% c.i. would be
    $(\hat{\theta} - w_{1-\alpha/2},\ \hat{\theta} - w_{\alpha/2})$
  – bootstrap idea: use the distribution of $W^* = \hat{\Theta}^* - \hat{\theta}$ as an approximation to the distribution of $W$
  –
        > w <- sort(boot.out$t) - boot.out$t0
        > hist(w)
        > boot.out$t0 - w[c(1950,50)]
        [1] 0.9727399 1.7959120
        > boot.out$t0 - quantile(w, c(.975,.025))
  – pros: may work well for medians
  – cons: bootstrap error (the distribution of $W^*$ being a poor approximation to the distribution of $W$) is often large

• Percentile interval of level $(1 - \alpha)$
  – lower endpoint is the $\frac{\alpha}{2}(R+1)$th entry in the ordered bootstrap statistics
  – upper endpoint is the $\left(1 - \frac{\alpha}{2}\right)(R+1)$th entry
  –
        > sort(boot.out$t)[c(50,1950)]
        [1] 1.244713 2.067885
  – pros: simplicity
  – cons: may be very inaccurate if the distribution of $\hat{\theta}$ is not close to symmetric

Bias-correcting bootstrap percentile confidence intervals

• recall:
  $\widehat{CDF}(q) = \Pr^*(\hat{\theta}^* \le q) = \frac{\#\{\hat{\theta}_b \le q\}}{B}$
• if $\widehat{CDF}(\hat{\theta}) \ne .5$, then a bias correction to the percentile method c.i. may be in order
• let
  $z_0 = \Phi^{-1}(\widehat{CDF}(\hat{\theta}))$
  – what Splus/R function evaluates $\Phi^{-1}$?
• then the bias-corrected $1 - \alpha$ c.i. is
  $\left( \widehat{CDF}^{-1}(\Phi(2 z_0 - z_{\alpha/2})),\ \widehat{CDF}^{-1}(\Phi(2 z_0 + z_{\alpha/2})) \right)$
  – here $z_{\alpha/2}$ is the upper $\alpha/2$ point of the standard normal:
    $\Phi(z_{\alpha/2}) = 1 - \alpha/2$

Bootstrap confidence intervals continued

• bias-corrected
• bias-corrected accelerated (BCa)
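
The percentile, basic, and bias-corrected percentile intervals described above can be computed by hand from a vector of bootstrap replicates. The following is a minimal sketch in base R under stated assumptions: the data are simulated (not the city data), every object name other than meanratio is hypothetical, and quantile() with type = 1 is used here as one reasonable stand-in for the inverse of the estimated CDF. It relies on qnorm() and pnorm(), base R's standard normal quantile and distribution functions.

```r
# Sketch only: simulated data and all object names below are illustrative
# assumptions, not the lecture's city data or the boot package's internals.
set.seed(5400)

# Hypothetical data frame with the same column names ("x", "u") as the lecture.
df <- data.frame(u = rexp(10, rate = 1 / 100), x = rexp(10, rate = 1 / 120))

# Ratio-of-means statistic, written as in the lecture.
meanratio <- function(df, indices) {
  mean(df[indices, "x"]) / mean(df[indices, "u"])
}

R <- 1999
alpha <- 0.05
n <- nrow(df)

theta_hat <- meanratio(df, seq_len(n))
# Ordinary nonparametric bootstrap: resample row indices with replacement.
theta_star <- replicate(R, meanratio(df, sample.int(n, n, replace = TRUE)))
sorted <- sort(theta_star)

# Positions of the order statistics: 50 and 1950 when R = 1999, alpha = 0.05.
lo <- round((alpha / 2) * (R + 1))
hi <- round((1 - alpha / 2) * (R + 1))

# Percentile interval: read the endpoints straight off the ordered replicates.
perc_ci <- sorted[c(lo, hi)]

# Basic interval: theta_hat minus the upper and lower quantiles of W*.
w <- sorted - theta_hat
basic_ci <- theta_hat - w[c(hi, lo)]

# Bias-corrected percentile interval.
cdf_at_theta_hat <- mean(theta_star <= theta_hat)  # estimated CDF at theta_hat
z0 <- qnorm(cdf_at_theta_hat)                      # standard normal quantile
z_half <- qnorm(1 - alpha / 2)                     # upper alpha/2 point
probs <- pnorm(2 * z0 + c(-1, 1) * z_half)
# type = 1 makes quantile() the inverse of the empirical CDF.
bc_ci <- quantile(theta_star, probs, type = 1, names = FALSE)

print(rbind(percentile = perc_ci, basic = basic_ci, bias_corrected = bc_ci))
```

When the estimated CDF at the original estimate is exactly .5, z0 is zero and the bias-corrected probabilities reduce to alpha/2 and 1 - alpha/2, so the interval essentially coincides with the percentile interval. The BCa interval adds an acceleration term that this sketch does not attempt.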
