Purdue STAT 51100 - Lecture 11 - D2325708

School name Purdue University

Course Stat 51100- Statistical Methods

Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011

Lecture 13: Additional Confidence Intervals' Related Topics
Devore: Section 7.3-7.4
March, 2011

t-confidence intervals

• Large-sample confidence intervals are based on the fact that, for n large enough,
    $$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
  is approximately normally distributed
• But what if n < 40?
• For small n, this test statistic is denoted
    $$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
  to stress the fact that it is no longer normally distributed

t Distribution

• A t distribution is governed by one parameter $\nu$, which is called the number of degrees of freedom (df)
• Properties:
  1. The $t_\nu$ curve is bell-shaped and centered at 0
  2. It has heavier tails than the normal distribution (it is more spread out)
  3. As $\nu \to \infty$, the $t_\nu$ density curve approaches the standard normal curve

• Let $t_{\alpha,\nu}$ be the number on the horizontal axis such that the area to the right of it under the $t_\nu$ curve is $\alpha$; $t_{\alpha,\nu}$ is a t critical value.
• For fixed $\nu$, $t_{\alpha,\nu}$ increases as $\alpha$ decreases
• For fixed $\alpha$, as $\nu$ increases, the value $t_{\alpha,\nu}$ decreases. The decrease slows down as $\nu$ increases; that is why the table values are shown in increments of 2 between 30 df and 40 df, but then jump to $\nu = 50$, $\nu = 60$, etc.
• $z_\alpha$ is the last row of the table since $t_\infty$ is the standard normal distribution

[Figure 1]

One-sample t confidence interval

• The number of df for T is $n - 1$ since S is based on the deviations $X_1 - \bar{X}, \ldots, X_n - \bar{X}$, which add up to zero
• By definition of the t critical value, we have
    $$P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1 - \alpha$$
• It is easy to show that a $100(1 - \alpha)\%$ confidence interval for $\mu$ is
    $$\left(\bar{x} - t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}}\right)$$
• The alternative, more compact notation is
    $$\bar{x} \pm t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}}$$

Example

• Sweetgum lumber is quite valuable but there's a general shortage of high-quality sweetgum today. Because of this, composite beams that are designed to add value to low-grade sweetgum lumber are commonly used.
• The sample consists of 30 observations on the modulus of rupture in psi
• Checking the normal probability plot first: the data look normal!
• R Code:
    # flatten the swgum data table into a numeric vector
    sweetgum <- as.vector(as.matrix(swgum))
    # lower and upper limits of the 95% t confidence interval (29 df)
    mean(sweetgum) + qt(c(.025, .975), 29) * sd(sweetgum) / sqrt(length(sweetgum))

Prediction interval

• Consider a random sample $X_1, \ldots, X_n$ from a normal population distribution. Suppose you want to predict $X_{n+1}$.
• A point predictor is $\bar{X}$; clearly, $E(\bar{X} - X_{n+1}) = \mu - \mu = 0$ and, since $\bar{X}$ and $X_{n+1}$ are independent,
    $$V(\bar{X} - X_{n+1}) = V(\bar{X}) + V(X_{n+1}) = \frac{\sigma^2}{n} + \sigma^2 = \sigma^2\left(1 + \frac{1}{n}\right)$$
• The prediction error is normally distributed and, therefore,
    $$Z = \frac{\bar{X} - X_{n+1}}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}}$$
  has a standard normal distribution
• It is possible to show that
    $$T = \frac{\bar{X} - X_{n+1}}{S\sqrt{1 + \frac{1}{n}}}$$
  has a t distribution with $n - 1$ df
• Consequently, the prediction interval for $X_{n+1}$ is
    $$\bar{x} \pm t_{\alpha/2,n-1} \cdot s\sqrt{1 + \frac{1}{n}}$$
• Note the obvious difference from the t confidence interval for the mean $\mu$... Why is the prediction interval wider?
• Note that the estimation error $\bar{X} - \mu$ is the deviation from a fixed value, while the prediction error $\bar{X} - X_{n+1}$ is a difference between two random variables. The second has much more variability in it than the first...
• Even as $n \to \infty$, the PI does not shrink to a point; it approaches $\mu \pm z_{\alpha/2} \cdot \sigma$. This means that there is uncertainty about the value of $X_{n+1}$ even when an infinite amount of information is available.

Example

• A meat inspector has randomly measured 30 packs of 95% lean beef. The sample resulted in a mean of 96.2% with a sample standard deviation of 0.8%. Find a 99% prediction interval for a new pack. Assume normality.
• For $\nu = 29$ df, we have the critical value $t_{0.005} = 2.756$. Hence a 99% prediction interval for a new observation $x_0$ is
    $$96.2 - (2.756)(0.8)\sqrt{1 + \frac{1}{30}} < x_0 < 96.2 + (2.756)(0.8)\sqrt{1 + \frac{1}{30}}$$
  which reduces to (93.96, 98.44).

Bootstrap: The Introduction

• Suppose you have some distribution with the density $f(x; \theta)$ where $\theta$ is an unknown parameter
• Given a sample $x_1, \ldots, x_n$ from this distribution, you can obtain a point estimate $\hat{\theta}$; as an example, if you have a normal distribution with mean $\mu$, you can always estimate it by $\bar{x}$.
• If $\theta$ is the only unknown parameter, you can say that the (unknown) pdf $f(x; \theta)$ can be estimated by $f(x; \hat{\theta})$. Now you can generate multiple samples from the $f(x; \hat{\theta})$ distribution, each of the form
    $$x_1^*, x_2^*, \ldots, x_n^* \qquad (1)$$
• With B bootstrap samples at our disposal, we can compute a bootstrap estimate $\hat{\theta}^*$ of $\theta$ from each of them. For example, if the parameter in question is the mean $\mu$, each sample gives $\hat{\mu}^* = n^{-1}\sum_{i=1}^{n} x_i^*$.
• Why do we need bootstrap? An important issue is estimating the precision of the estimator $\hat{\theta}$; if $\theta = \sigma^2$, it is difficult to estimate the standard deviation $\sigma_{\hat{\theta}}$ of the estimator.
• Using the bootstrap samples, we can estimate it as
    $$S_{\hat{\theta}} = \sqrt{\frac{1}{B - 1}\sum_{i=1}^{B}\left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2}$$
  where $\bar{\theta}^*$ is the average of the B bootstrap estimates $\hat{\theta}_1^*, \ldots, \hat{\theta}_B^*$

Example

• Let X be the time to breakdown of an insulating fluid between electrodes at some voltage and assume it is exponentially distributed, $f(x) = \lambda e^{-\lambda x}$
• A random sample of n = 10 breakdown times (min) is 41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78
• A reasonable estimate of the distribution parameter is $\hat{\lambda} = 1/\bar{x} = 1/55.087 = 0.018153$
• Generate B = 100 samples, each of size 10, from $f(x; 0.018153)$
• Determine the value of $\hat{\lambda}_i^*$ for each $i = 1, \ldots, B$ and find $\bar{\lambda}^* = 0.02153$ and $s_{\hat{\lambda}} = 0.0091$; that last value can be used to construct a confidence
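The parametric bootstrap steps of the insulating-fluid example can be sketched in R roughly as follows. This is an illustration rather than the lecture's own code: the seed and the variable names (`lambda_star`, `lambda_bar_star`, `s_lambda_hat`) are assumptions, and because the bootstrap samples are random, the resulting mean and standard deviation will be close to, but not identical with, the values quoted in the example.

```r
# Parametric bootstrap for the exponential rate (sketch; names and seed are assumptions)
set.seed(511)  # arbitrary seed, used only to make the run reproducible

# breakdown times (min) from the example
x <- c(41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78)
n <- length(x)
lambda_hat <- 1 / mean(x)  # point estimate of the rate, 1 / 55.087

B <- 100  # number of bootstrap samples
# for each bootstrap sample: draw n values from Exp(lambda_hat), then re-estimate lambda
lambda_star <- replicate(B, 1 / mean(rexp(n, rate = lambda_hat)))

lambda_bar_star <- mean(lambda_star)  # average of the B bootstrap estimates
s_lambda_hat <- sd(lambda_star)       # bootstrap standard deviation (divisor B - 1)
```

`sd()` divides by B - 1, so `s_lambda_hat` matches the bootstrap formula for the standard deviation of the estimator given earlier.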

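To see how the t confidence interval and the prediction interval differ numerically, the meat-inspection example can be reworked from its summary statistics. This is a minimal sketch: the variable names (`xbar`, `t_crit`, `ci`, `pi_new`) are assumptions, and the critical value is computed with `qt` instead of being read from a table.

```r
# Summary statistics from the meat-inspection example
n <- 30
xbar <- 96.2
s <- 0.8
alpha <- 0.01  # 99% level

# upper alpha/2 critical value of the t distribution with n - 1 df (about 2.756)
t_crit <- qt(1 - alpha / 2, df = n - 1)

# confidence interval for the mean: half-width t * s / sqrt(n)
ci <- xbar + c(-1, 1) * t_crit * s / sqrt(n)

# prediction interval for one new observation: half-width t * s * sqrt(1 + 1/n)
pi_new <- xbar + c(-1, 1) * t_crit * s * sqrt(1 + 1 / n)

round(pi_new, 2)  # 93.96 98.44
```

The two half-widths differ by the factor sqrt(n + 1), about 5.6 for n = 30, which is why the prediction interval is so much wider than the confidence interval for the mean.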