UW-Madison STAT 371 - Chapter 3 - Estimation of p - D1687127

School name University of Wisconsin, Madison

Course Stat 371- Intro to Statistics

Pages 10

Chapter 3

Estimation of p

3.1 Point and Interval Estimates of p

Suppose that we have BT. So far, in every example I have told you the (numerical) value of p. In science, usually the value of p is unknown to the researcher. In such cases, scientists and statisticians use data from BT to estimate the value of p. Note that the word estimate is a technical term that has a precise definition in this course. I don’t particularly like the choice of the word estimate for what we do, but I am not the tsar of the Statistics world!

It will be very convenient for your learning if we distinguish between two creatures. First is Nature, who knows everything and in particular knows the value of p. Second is the researcher, who is ignorant of the value of p.

Here is the idea. A researcher plans to observe n BT, but does not know the value of p. After the BT have been observed the researcher will use the information obtained to make a statement about what p might be.

After observing the BT, the researcher counts the number of successes, x, in the n BT. We define p̂ = x/n, the proportion of successes in the sample, to be the point estimate of p.

For example, if I observe n = 20 BT and count x = 13 successes, then my point estimate of p is p̂ = 13/20 = 0.65.

It is trivially easy to calculate p̂ = x/n; thus, based on your experiences in previous math courses, you might expect that we will move along to the next topic. But we won’t.

What we do in a Statistics course is evaluate the behavior of our procedure. What does that mean? Statisticians evaluate procedures by seeing how they perform in the long run.

We say that the point estimate p̂ is correct if, and only if, p̂ = p. Obviously, any honest researcher wants the point estimate to be correct. Let’s go back to the example of a researcher who observes 13 successes in 20 BT and calculates p̂ = 13/20 = 0.65.

The researcher schedules a press conference and the following exchange is recorded.

• Researcher: I know that all Americans are curious about the value of p. I am here today to announce that based on my incredible effort, wisdom and brilliance, I estimate p to be 0.65.
• Reporter: Great, but what is the actual value of p? Are you saying that p = 0.65?
• Researcher: Well, I don’t actually know what p is, but I certainly hope that it equals 0.65. As I have stated many times, nobody is better than I at obtaining correct point estimates.
• Reporter: Granted, but is anybody worse than you at obtaining correct point estimates?
• Researcher: (Mumbling) Well, no. You see, the problem is that only Nature knows the actual value of p. No mere researcher will ever know it.
• Reporter: Then why are we here?

Before we follow the reporter’s suggestion and give up, let’s see what we can learn.

Let’s bring Nature into the analysis. Suppose that Nature knows that p = 0.75. Well, Nature knows that the researcher in the above press conference has an incorrect point estimate. But let’s proceed beyond that one example.

Consider a researcher who decides to observe n = 20 BT and use them to estimate p. What will happen?

Well, we don’t know what will happen. The researcher might observe x = 15 successes, giving p̂ = 15/20 = 0.75, which would be a correct point estimate. Sadly, of course, the researcher would not know it is correct; only Nature would.

Given what we were doing in Chapters 1 and 2, it occurs to us to calculate a probability. After all, we use probabilities to quantify uncertainty.

So, before the researcher observes the 20 BT, Nature decides to calculate the probability that the point estimate will be correct. This probability is, of course,

$$P(X = 15) = \frac{20!}{15!\,5!}\,(0.75)^{15}(0.25)^{5},$$

which I find, with the help of the binomial website, to be 0.2023. There are two rather obvious undesirable features to this answer.

1. Only Nature knows whether the point estimate is correct; indeed, before the data are collected, only Nature can calculate the probability the point estimate will be correct.
2. The probability that the point estimate will be correct is disappointingly small.

(And note that for most values of p, it is impossible for the point estimate to be correct. For one of countless possible examples, suppose that n = 20 as in the current discussion and p = 0.43. It is impossible to obtain p̂ = 0.43.)

As we shall see repeatedly in this course, what often happens is that by collecting more data our procedure becomes ‘better’ in some way. Thus, suppose that the researcher plans to observe n = 100 BT, with p still equal to 0.75. The probability that the point estimate will be correct is

$$P(X = 75) = \frac{100!}{75!\,25!}\,(0.75)^{75}(0.25)^{25},$$

which I find, with the help of the website, to be 0.0918. This is very upsetting! More data makes the probability of a correct point estimate smaller, not larger.

The difficulty lies in our desire to have p̂ be exactly correct. Close is good too. In fact, statisticians like to say,

Close counts in horseshoes, hand grenades and estimation.

But what do I mean by close? Well, for an example to move us along, suppose we decide that if p̂ is within 0.05 of p then it is close enough for us to be happy. Revisiting the two computations above, we see that for n = 20 and p = 0.75, close enough means (14 ≤ X ≤ 16). The probability of this happening, again with the help of the website, is 0.5606. For n = 100 close enough means (70 ≤ X ≤ 80). The probability of this happening is 0.7967. As a final example, for n = 1000, close enough means (700 ≤ X ≤ 800). The probability of this happening is 0.9998, a virtual certainty to a statistician.

Here is another way to view my ‘close enough’ argument above. Instead of estimating p by the single number (point) p̂ we use an interval estimate; in this example it is the closed interval p̂ ± 0.05. As you may have learned in a math class, a closed interval is an interval that includes its endpoints. In this class, all interval estimates are closed intervals.
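
The probabilities quoted above come from a binomial calculator website. For readers who would rather check them directly, the following is a minimal Python sketch that recomputes them with exact rational arithmetic. It is an illustration under stated assumptions, not the tool used in these notes: the helper names binom_pmf and prob_close are hypothetical, and p = 3/4 and the half-width 1/20 are simply the values 0.75 and 0.05 from the example written as fractions.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ Binomial(n, p), with p given as a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_close(n, p, h):
    """P(|X/n - p| <= h): the chance that the point estimate X/n lands
    within h of p, i.e. that the closed interval X/n ± h contains p."""
    return sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if abs(Fraction(k, n) - p) <= h
    )


p = Fraction(3, 4)    # the value of p that Nature knows in the example
h = Fraction(1, 20)   # the "close enough" half-width, 0.05

for n in (20, 100, 1000):
    k_exact = int(n * p)                        # the only x with x/n == p
    exact = float(binom_pmf(k_exact, n, p))     # P(point estimate is correct)
    close = float(prob_close(n, p, h))          # P(point estimate is close enough)
    print(f"n = {n}: exactly correct {exact:.4f}, close enough {close:.4f}")
```

Working with fractions avoids any floating-point doubt about whether, say, 14/20 counts as within 0.05 of 0.75. The printed values should agree with the ones quoted above: 0.2023 and 0.0918 for an exactly correct point estimate with n = 20 and n = 100, and 0.5606, 0.7967 and 0.9998 for close enough with n = 20, 100 and 1000.
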
Analogous to our earlier definition, we say that the interval estimate is correct if, and only if, the interval contains p. Thus, saying that p̂ is within 0.05 of p (my working definition of close enough in the example above) is equivalent to saying that p is in the interval estimate; i.e., the interval estimate is correct.

Henceforth, I will not talk about p̂ being close enough to p; I will talk about whether an interval estimate is correct. Let’s look at the example above again with this new perspective.

For the value p = 0.75 I studied the performance of the interval estimate p̂
