UW-Madison STAT 371 - Chapter 11 Inference for One Numerical Population

Chapter 11: Inference for One Numerical Population

11.1 Counting

Suppose that we can observe i.i.d. random variables X1, X2, X3, ..., Xn that are count variables; i.e., the possible values of each random variable are the integers ..., -3, -2, -1, 0, 1, 2, 3, ... or some subset of the integers. We have studied this problem for BTs (giving the binomial) and for the Poisson. In this section we consider the general problem.

For example, consider the population of students at UW-Madison this semester, with response equal to the total number of credits that will be completed. Personally, I would not be willing to study this as either the binomial or the Poisson.

The general problem is as follows. The probability distribution of X1 is given by a collection of equations:

P(X1 = j) = pj, for j = ..., -3, -2, -1, 0, 1, 2, 3, ....

The ideal situation would be to know all of the pj's, for then we could compute the probability of any event. But the ideal is not realistic in science.

The next best thing would be to have a parametric family such as the Poisson or binomial. In these cases all we need to do is estimate one parameter (or more; sometimes a family has more than one parameter) and then we would have estimates of all the pj's. This is a fruitful area that we cannot pursue in this course because of time limitations. In addition to the binomial and Poisson, parametric families include the geometric, the hypergeometric, and the negative binomial distributions.

Instead, we opt for a much more modest goal. We will use our data to draw inferences about the mean µ of the probability distribution. Just as for the binomial and Poisson, you can visualize µ as follows: given the pj's we can draw a probability histogram; the center of gravity of the probability histogram is the mean of the population.

We will compute the mean and standard deviation of our data, denoted by x̄ and s, as in Chapter 10.
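The "center of gravity" description of µ can be made concrete with a few lines of code. This is a minimal sketch; the pj's below are made up for illustration and are not from any population in the lecture:

```python
# Compute the mean (center of gravity) and standard deviation of a
# count distribution from its probability function p_j = P(X = j).
# The p_j values here are hypothetical, chosen only for illustration.
import math

p = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}  # probabilities must sum to 1

mu = sum(j * pj for j, pj in p.items())                      # center of gravity
sigma = math.sqrt(sum((j - mu) ** 2 * pj for j, pj in p.items()))

print(round(mu, 3))     # 1.7
print(round(sigma, 3))  # 0.9
```

The same two lines work for any collection of pj's, which is exactly why knowing all the pj's would be the ideal situation.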
We will begin with estimation.

11.1.1 Estimation of µ

Our point estimate of µ, the mean of the population, is simply x̄, the mean of our data. But, of course, we want a confidence interval estimate too.

It turns out that without a parametric family to guide us, an exact answer is impossible; we have no choice but to use an approximate method. In order to obtain an approximate CI, we need to be able to compute approximate probabilities for X̄. Fortunately, there is a wonderful result in probability theory that can help us. It is called the Central Limit Theorem (CLT).

Let's examine this three-word name. Theorem, of course, means it is an important mathematical fact. Limit means that the truth of the theorem is achieved only as n grows without bound; in other words, for any finite value of n the result of the theorem is only an approximation. (Here is an example that you might have seen in calculus. As n tends to infinity, the value 1/n converges to 0 in the limit. For any finite n the limit, 0, can be viewed as an approximation to 1/n.) The quality of the approximation we obtain from the CLT is an important and vexing issue. Finally, it is called Central because it is viewed as very important, i.e., central, to all of probability theory.

First, we need a bit more notation. We are denoting the mean of the probability histogram by µ. The probability histogram also has a standard deviation, which we denote by σ. Let me state a rather obvious scientific reality: we don't know the value of µ; indeed, we are collecting data in order to estimate µ. Because we don't know the value of µ, we also do not know the value of σ.

There are two parts to the CLT. The first tells us how to standardize X̄. If we let

Z = (X̄ - µ) / (σ/√n),   (11.1)

then Z is the standardized version of X̄. The second part of the CLT addresses the following issue. I want to calculate probabilities for Z, but I don't know how to do this. I decide to use the snc (standard normal curve) to obtain approximate probabilities for Z.
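A small simulation, in the spirit of the simulation studies mentioned in lecture, can show the standardization at work. The count population below is made up for illustration:

```python
# Simulate the CLT: draw many samples of size n from a made-up,
# right-skewed count distribution, standardize each sample mean with
# the true mu and sigma, and check that Z behaves like the snc.
import math
import random
import statistics

random.seed(371)

values = [0, 1, 2, 3]
probs = [0.5, 0.3, 0.15, 0.05]  # a right-skewed population, not the snc
mu = sum(v * p for v, p in zip(values, probs))
sigma = math.sqrt(sum((v - mu) ** 2 * p for v, p in zip(values, probs)))

n = 50
zs = []
for _ in range(10_000):
    sample = random.choices(values, weights=probs, k=n)
    xbar = statistics.mean(sample)
    zs.append((xbar - mu) / (sigma / math.sqrt(n)))

# If the snc approximation is good, Z should have mean near 0,
# standard deviation near 1, and about 95% of values in (-1.96, 1.96).
inside = sum(-1.96 < z < 1.96 for z in zs) / len(zs)
print(round(statistics.mean(zs), 2))
print(round(statistics.stdev(zs), 2))
print(round(inside, 3))
```

Even though the population is quite skewed, with n = 50 the standardized sample mean already tracks the snc closely, which is the content of the second part of the CLT.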
The CLT shows that in the limit, as n grows without bound, any such approximate probability will converge to the true (unknown) probability. This is a fairly amazing result! It does not matter what the original population looks like; in the limit the snc gives correct probabilities for Z.

Of course, in practice it is never the case that 'n grows without bound'; we have a fixed n and we use it. But if n is 'large' we can hope that the snc gives good approximations. Ideally, this 'hope' will be verified with some computer simulations, and we will do that on occasion.

If we expand Z in Equation 11.1 we get the following result, which, while scientifically useless, will lead us to better things.

Approximate Confidence Interval for µ when σ is known. The CI is

x̄ ± z(σ/√n).   (11.2)

In the lecture examples, I will tell you about the Cat population. Nature does not tell us the mean of the population, but for some reason Nature tells us that σ = 0.8. Thus, the 95% CI for µ with σ known is:

x̄ ± z(0.8/√n).

Also in lecture I will give you results of a simulation study to see how well this formula performs.

In science, σ is always unknown. We need a way to deal with this. The obvious idea works: replace the unknown σ with the s computed from the data. More precisely, define Z′ as follows:

Z′ = (X̄ - µ) / (s/√n).   (11.3)

According to the work of Slutsky, the CLT conclusion for Z is also true for Z′; i.e., we can use the snc for Z′ too. This leads to our second CI formula, which we will call Slutsky's CI for µ:

Slutsky's Approximate Confidence Interval for µ when σ is unknown. The CI is

x̄ ± z(s/√n).   (11.4)

Playing the role of Nature, I put the Cat population into my computer. Then, switching roles to the Researcher, I selected a random sample of size n = 10. I obtained the following data:

2, 1, 2, 2, 1, 2, 1, 1, 3, 2.

These data yield x̄ = 1.70 and s = 0.675. Thus, for these data, Slutsky's 95% CI for µ is

1.70 ± 1.96(0.675/√10) = 1.70 ± 0.42 = [1.28, 2.12].

Reverting to Nature, I note that, in fact, µ = 1.40.
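The arithmetic in the Cat-population example can be reproduced directly from Formula 11.4. A short sketch using the ten observations above:

```python
# Slutsky's approximate 95% CI for mu: xbar +/- z * (s / sqrt(n)),
# computed for the n = 10 observations from the Cat-population example.
import math
import statistics

data = [2, 1, 2, 2, 1, 2, 1, 1, 3, 2]

n = len(data)
xbar = statistics.mean(data)           # 1.70
s = statistics.stdev(data)             # 0.675 (rounded)
half_width = 1.96 * s / math.sqrt(n)   # about 0.42

lo, hi = xbar - half_width, xbar + half_width
print(round(xbar, 2), round(s, 3))     # 1.7 0.675
print(round(lo, 2), round(hi, 2))      # 1.28 2.12
```

Note that `statistics.stdev` uses the n - 1 divisor, matching the s of Chapter 10, and that z = 1.96 is the snc value for 95% confidence.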
Thus, the point estimate is incorrect, but the 95% CI is correct.

As you will see in a lecture example, Slutsky's CI does not perform well if n is small. This was understood in the early 1900s and led to the work of William Sealy Gosset, who published his results under the name Student. In particular, Slutsky's CI does not perform well in the sense that it makes too many errors: if you specify that you want 95% confidence, then, for small n, Slutsky will, in fact, give you less than 95% correct intervals, in the ...
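The claim that Slutsky's CI makes too many errors for small n can be checked by simulation. This is a minimal sketch using a made-up count population (not the lecture's Cat population) with n = 5:

```python
# Estimate the true coverage of Slutsky's nominal 95% CI for a small
# sample size.  The count population below is hypothetical; with n = 5
# the interval covers mu noticeably less often than 95% of the time.
import math
import random
import statistics

random.seed(371)

values = [0, 1, 2, 3]
probs = [0.5, 0.3, 0.15, 0.05]
mu = sum(v * p for v, p in zip(values, probs))  # the true mean

n = 5
reps = 20_000
covered = 0
for _ in range(reps):
    sample = random.choices(values, weights=probs, k=n)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)
    half = 1.96 * s / math.sqrt(n)  # Slutsky: snc value despite small n
    if xbar - half <= mu <= xbar + half:
        covered += 1

print(round(covered / reps, 3))  # noticeably below the nominal 0.95
```

The shortfall is exactly the phenomenon that motivated Gosset's work: for small n, replacing σ with s adds variability that the snc value z = 1.96 does not account for.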

