UW-Madison STAT 371 - Chapter 7 Rules for Means and Variances - Prediction

School: University of Wisconsin, Madison

Course: STAT 371 - Intro to Statistics

Chapter 7
Rules for Means and Variances; Prediction

7.1 Rules for Means and Variances

The material in this section is very technical and algebraic. And dry. But it is useful for understanding many of the methods we will learn later in this course.

We have random variables $X_1, X_2, \dots, X_n$. Throughout this section, we will assume that these rv's are independent. Sometimes they will also be identically distributed, but we don't need i.d. for our main result. (There is a similar result without independence too, but we won't need it.)

Let $\mu_i$ denote the mean of $X_i$. Let $\sigma_i^2$ denote the variance of $X_i$. Let $b_1, b_2, \dots, b_n$ denote $n$ numbers. Define

$$W = b_1 X_1 + b_2 X_2 + \dots + b_n X_n.$$

$W$ is a linear combination of the $X_i$'s. The main result is

• The mean of $W$ is $\mu_W = \sum_{i=1}^{n} b_i \mu_i$.
• The variance of $W$ is $\sigma_W^2 = \sum_{i=1}^{n} b_i^2 \sigma_i^2$.

Special Cases

1. i.i.d. case. If the sequence is i.i.d. then we can write $\mu = \mu_i$ and $\sigma^2 = \sigma_i^2$. In this case the mean of $W$ is $\mu_W = \left(\sum_{i=1}^{n} b_i\right)\mu$ and the variance of $W$ is $\sigma_W^2 = \left(\sum_{i=1}^{n} b_i^2\right)\sigma^2$.

2. Two independent rv's. If $n = 2$, then we usually call them $X$ and $Y$ instead of $X_1$ and $X_2$. We get $W = b_1 X + b_2 Y$ which has mean $\mu_W = b_1 \mu_X + b_2 \mu_Y$ and variance $\sigma_W^2 = b_1^2 \sigma_X^2 + b_2^2 \sigma_Y^2$.

3. Two i.i.d. rv's. Combining the notation of the previous two items, $W = b_1 X + b_2 Y$ has mean $\mu_W = (b_1 + b_2)\mu$ and variance $\sigma_W^2 = (b_1^2 + b_2^2)\sigma^2$.

Especially important is the case $W = X + Y$ (that is, $b_1 = b_2 = 1$), which has mean $\mu_W = 2\mu$ and variance $\sigma_W^2 = 2\sigma^2$.

Another important case is $W = X - Y$ (that is, $b_1 = 1$ and $b_2 = -1$), which has mean $\mu_W = (1 - 1)\mu = 0$ and variance $\sigma_W^2 = (1^2 + (-1)^2)\sigma^2 = 2\sigma^2$.

7.2 Predicting for Bernoulli Trials

"Predictions are tough, especially about the future." (Yogi Berra)

We plan to observe $m$ Bernoulli trials (BT) and want to predict the total number of successes that we will get. Let $Y$ denote the r.v. and $y$ the observed value of the total number of successes in the future $m$ trials. Similar to estimation, we will learn about point and interval predictions.

7.2.1 When p is Known

We begin with point prediction of $Y$. We adopt the criterion that we want the probability of being correct to be as large as possible.
Below is the result.

Calculate the mean of $Y$, which is $mp$. If $mp$ is an integer then it is the most probable value of $Y$ and our prediction is $\hat{y} = mp$. Here are some examples.

• Suppose that $m = 20$ and $p = 0.50$. Then, $mp = 20(0.5) = 10$ is an integer, so 10 is our point prediction of $Y$. With the help of our website calculator (details not given), we find that $P(Y = 10) = 0.1762$.
• Suppose that $m = 200$ and $p = 0.50$. Then, $mp = 200(0.5) = 100$ is an integer, so 100 is our point prediction of $Y$. With the help of our website calculator, we find that $P(Y = 100) = 0.0563$.
• Suppose that $m = 300$ and $p = 0.30$. Then, $mp = 300(0.3) = 90$ is an integer, so 90 is our point prediction of $Y$. With the help of our website calculator, we find that $P(Y = 90) = 0.0502$.

If $mp$ is not an integer, then it can be shown that the most probable value of $Y$ is one of the integers immediately on either side of $mp$. Just check them both. Here are some examples.

• Suppose that $m = 20$ and $p = 0.42$. Then, $mp = 20(0.42) = 8.4$ is not an integer. The most likely value of $Y$ is either 8 or 9. With the help of our website calculator, we find that $P(Y = 8) = 0.1768$ and $P(Y = 9) = 0.1707$. Thus, $\hat{y} = 8$.
• Suppose that $m = 100$ and $p = 0.615$. Then, $mp = 100(0.615) = 61.5$ is not an integer. The most likely value of $Y$ is either 61 or 62. With the help of our website calculator, we find that $P(Y = 61) = 0.0811$ and $P(Y = 62) = 0.0815$. Thus, $\hat{y} = 62$.

In each of the above examples we saw that the probability that a point prediction is correct is very small. As a result, scientists usually prefer a prediction interval. It is possible to create a one-sided prediction interval, but we will consider only two-sided prediction intervals.

We have two choices: using a standard normal curve (snc) approximation or finding an exact interval. Even if you choose the exact interval, it is useful to begin with the snc approximation.

The snc approximation says to use the interval

$$mp \pm z\sqrt{mpq},$$

where $q = 1 - p$ and $z$ is the same number as we used for the two-sided confidence interval for $p$.
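The point-prediction rule and the snc interval above can be sketched in a few lines of Python. This is only an illustrative stand-in for the website calculator, whose details are not given here; the function names (`binom_pmf`, `point_prediction`, `snc_interval`, `exact_coverage`) are assumptions made for this sketch, and it uses only the standard library.

```python
import math


def binom_pmf(k, m, p):
    """P(Y = k) when Y counts successes in m Bernoulli trials with success probability p."""
    return math.comb(m, k) * p**k * (1 - p) ** (m - k)


def point_prediction(m, p):
    """Most probable value of Y, found by checking the integers on either side of m*p."""
    below, above = math.floor(m * p), math.ceil(m * p)
    # When m*p is an integer, below == above and the prediction is m*p itself.
    candidates = sorted({k for k in (below, above) if 0 <= k <= m})
    return max(candidates, key=lambda k: binom_pmf(k, m, p))


def snc_interval(m, p, z=1.96):
    """Approximate two-sided prediction interval m*p +/- z*sqrt(m*p*q), with q = 1 - p."""
    half_width = z * math.sqrt(m * p * (1 - p))
    return m * p - half_width, m * p + half_width


def exact_coverage(lower, upper, m, p):
    """Exact P(lower <= Y <= upper), obtained by summing the binomial probabilities."""
    return sum(binom_pmf(k, m, p) for k in range(max(lower, 0), min(upper, m) + 1))


# One of the point-prediction examples above: m = 20, p = 0.42.
y_hat = point_prediction(20, 0.42)
print("point prediction:", y_hat, "with probability", round(binom_pmf(y_hat, 20, 0.42), 4))

# snc interval for the same m and p, rounded to whole numbers of successes,
# followed by the exact probability that Y lands in the rounded interval.
low, high = snc_interval(20, 0.42)
low_int, high_int = round(low), round(high)
print("rounded snc interval:", (low_int, high_int))
print("exact probability:", round(exact_coverage(low_int, high_int, 20, 0.42), 4))
```

The first printed line should agree with the $m = 20$, $p = 0.42$ example above ($\hat{y} = 8$, probability 0.1768). Comparing both neighbors of $mp$ directly, rather than trusting a rounded value of $mp$, keeps the sketch safe against small floating-point errors in the product.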
Here is an example.

• Suppose that $m = 100$ and $p = 0.615$. The snc approximate 95% prediction interval is

$$61.5 \pm 1.96\sqrt{61.5(0.385)} = 61.5 \pm 9.54 = [51.96, 71.04].$$

Now, it makes no sense to predict a fractional number of successes, so we will round off the endpoints to get $[52, 71]$. We can use our binomial calculator to see whether the snc approximation is any good. Doing this, we find that the exact probability that $Y$ will be in the interval $[52, 71]$ is about 0.96. If this answer had been importantly too small or too large, we could modify either or both endpoints to get a more desirable answer; for example, the exact probability that $Y$ will be in $[53, 71]$ is 0.9483. The point is that the snc approximation will either give us a good answer (as in this case) or help us find a good answer.

7.2.2 When p is Unknown

We now consider the situation in which $p$ is unknown. We will begin with point prediction. The first problem is that we cannot achieve our criterion's goal: we cannot find the most probable value of $Y$. The most probable value, as we saw above, is at or near $mp$, but we don't know what $p$ is. Thus, we adopt an ad-hoc approach. We simply decide to use $m\hat{p}$ as our point prediction, where $\hat{p}$ is our point estimate of $p$. Of course, if $m\hat{p}$ is not an integer, we need to round off to a nearby integer. Because this is ad-hoc, I say just round to the nearest integer. If two integers are equally close, round to the one that is an even number.

But where did $\hat{p}$ come from? Well, we need to add another ingredient to our procedure. We assume that we have past data from the chance mechanism (CM) that will generate the $m$ future trials. We denote the past data as consisting of $n$ trials which yielded $x$ successes, giving $\hat{p} = x/n$ as our point estimate of the unknown $p$.

As I said above, our point prediction is $m\hat{p} = mx/n$. This answer is ad-hoc; we use it because it is sensible. This approach suggests taking our prediction interval for $p$ known, $mp \pm z\sqrt{mpq}$, and simply changing it to $m\hat{p} \pm z\sqrt{m\hat{p}\hat{q}}$, where $\hat{q} = 1 - \hat{p}$, and saying, "It's ad-hoc."

We cannot do this! Why?
Well, because we want an interval so that the probability is, for example, 95% that the interval will be correct; i.e., contain the eventual value of $Y$. It turns out, and you can show this with a simulation study if you want, that the interval $m\hat{p} \pm z\sqrt{m\hat{p}\hat{q}}$ makes too many mistakes; i.e., the probability it will be correct is smaller than the target associated with the choice of $z$.

Instead of an ad-hoc procedure that does not perform as desired, we directly derive an answer using the result of Section 1 of this
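The simulation study mentioned above can be sketched as follows. This is a minimal illustration, not the author's own study: the settings $n = 100$, $m = 100$, $p = 0.5$, the number of repetitions, the seed, and the function names are all assumptions chosen here for convenience. Each repetition generates past data, builds the ad-hoc interval from $\hat{p}$, generates the future count $Y$, and records whether the interval contains it.

```python
import math
import random


def binomial_draw(trials, p, rng):
    """Number of successes in `trials` independent Bernoulli trials."""
    return sum(rng.random() < p for _ in range(trials))


def adhoc_coverage(n, m, p, z=1.96, reps=5000, seed=371):
    """Estimate how often m*p_hat +/- z*sqrt(m*p_hat*q_hat) contains the future count Y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = binomial_draw(n, p, rng)       # past data: x successes in n trials
        p_hat = x / n                      # point estimate of the unknown p
        half_width = z * math.sqrt(m * p_hat * (1 - p_hat))
        y = binomial_draw(m, p, rng)       # future data: y successes in m trials
        if m * p_hat - half_width <= y <= m * p_hat + half_width:
            hits += 1
    return hits / reps


# Illustrative settings only; they are not taken from the text.
coverage = adhoc_coverage(n=100, m=100, p=0.5)
print(f"estimated coverage of the nominal 95% ad-hoc interval: {coverage:.3f}")
```

With settings like these the estimated coverage tends to come out well below 95%, in line with the claim that the ad-hoc interval makes too many mistakes. Other values of `n`, `m`, and `p` can be tried by changing the arguments.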
