Chapter 11: Inference for One Numerical Population

11.1 Counting

Suppose that we can observe i.i.d. random variables X1, X2, X3, ..., Xn that are count variables; i.e., the possible values of each random variable are the integers ..., −3, −2, −1, 0, 1, 2, 3, ..., or some subset of the integers. We have studied this problem for BTs (Bernoulli trials), giving the binomial, and for the Poisson. In this section we consider the general problem. For example, consider the population of students at UW-Madison this semester, with a response equal to the total number of credits that will be completed. Personally, I would not be willing to study this as either the binomial or the Poisson.

The general problem is as follows. The probability distribution of X1 is given by a collection of equations:

P(X1 = j) = pj, for j = ..., −3, −2, −1, 0, 1, 2, 3, ....

The ideal situation would be when we know all of the pj's, for then we could compute the probability of any event. But the ideal is not realistic in science. The next best would be to have a parametric family, such as the Poisson or binomial. In these cases, all we need to do is estimate one parameter (or more; sometimes a family has more than one parameter) and then we would have estimates of all the pj's. This is a fruitful area that we cannot pursue in this course because of time limitations. In addition to the binomial and Poisson, parametric families include the geometric, the hypergeometric, and the negative binomial distributions.

Instead, we opt for a much more modest goal. We will use our data to draw inferences about the mean μ of the probability distribution. Just as for the binomial and Poisson, you can visualize μ as follows: given the pj's, we can draw a probability histogram; the center of gravity of the probability histogram is the mean μ of the population.

We will compute the mean and standard deviation of our data, denoted by x̄ and s, as in Chapter 10. We will begin with estimation.

11.1.1 Estimation of μ

Our point estimate of the mean μ of the population is simply x̄, the mean of our data. But, of course, we want a confidence interval estimate too. It turns out that, without a parametric family to guide us, an exact answer is impossible; we have no choice but to use an approximate method.

In order to obtain an approximate CI, we need to be able to compute approximate probabilities for X̄. Fortunately, there is a wonderful result in probability theory that can help us. It is called the Central Limit Theorem (CLT). Let's examine this three-word name. "Theorem," of course, means it is an important mathematical fact. "Limit" means that the truth of the theorem is achieved only as n grows without bound; in other words, for any finite value of n, the result of the theorem is only an approximation. Here is an example that you might have seen in calculus: as n tends to infinity, the value 1/n converges to 0 in the limit, and for any finite n the limit 0 can be viewed as an approximation to 1/n. The quality of the approximation we obtain from the CLT is an important and vexing issue. Finally, it is called "Central" because it is viewed as very important, i.e., central to all of probability theory.

First, we need a bit more notation. We are denoting the mean of the probability histogram by μ. The probability histogram will also have a standard deviation, and we denote it by σ. Let me state a rather obvious scientific reality: we don't know the value of μ; indeed, we are collecting data in order to estimate μ. Because we don't know the value of μ, we also do not know the value of σ.

There are two parts to the CLT. The first tells us how to standardize X̄. If we let

Z = (X̄ − μ)/(σ/√n),    (11.1)

then Z is the standardized version of X̄. The second part of the CLT addresses the following issue: I want to calculate probabilities for Z, but I don't know how to do this. I decide to use the snc (standard normal curve) to obtain approximate probabilities for Z. The CLT shows that, in the limit as n grows without bound, any such approximate probability will converge to the true unknown probability. This is a fairly amazing result: it does not matter what the original population looks like; in the limit, the snc gives correct
probabilities for Z. Of course, in practice it is never the case that n grows without bound; we have a fixed n, and we use it. But if n is large, we can hope that the snc gives good approximations. Ideally, this hope will be verified with some computer simulations, and we will do that on occasion.

If we expand Z in Equation 11.1, we get the following result which, while scientifically useless, will lead us to better things.

Approximate Confidence Interval for μ when σ is known: The CI is

x̄ ± z(σ/√n).    (11.2)

In the lecture examples, I will tell you about the Cat population. Nature does not tell us the mean μ of the population but, for some reason, Nature tells us that σ = 0.8. Thus, the 95% CI for μ with σ known is x̄ ± 1.96(0.8/√n). Also in lecture, I will give you the results of a simulation study to see how well this formula performs.

In science, σ is always unknown. We need a way to deal with this. The obvious idea works: replace the unknown σ with the s computed from the data. More precisely, define Z′ as follows:

Z′ = (X̄ − μ)/(s/√n).    (11.3)

According to the work of Slutsky, the CLT conclusion for Z is also true for Z′; i.e., we can use the snc for Z′ too. This leads to our second CI formula, which we will call Slutsky's CI for μ.

Slutsky's Approximate Confidence Interval for μ when σ is unknown: The CI is

x̄ ± z(s/√n).    (11.4)

Playing the role of Nature, I put the Cat population into my computer. Then, switching roles to the Researcher, I selected a random sample of size n = 10. I obtained the following data:

2, 1, 2, 2, 1, 2, 1, 1, 3, 2.

These data yield x̄ = 1.70 and s = 0.675. Thus, for these data, Slutsky's 95% CI for μ is

1.70 ± 1.96(0.675/√10) = 1.70 ± 0.42 = [1.28, 2.12].

Reverting to Nature, I note that, in fact, μ = 1.40. Thus, the point estimate is incorrect, but the 95% CI is correct.

As you will see in a lecture example, Slutsky's CI does not perform well if n is small. This was understood in the early 1900s and led to the work of William Sealy Gosset, who published his results under the name Student. In particular, Slutsky's CI does not perform well in the sense that it makes too many errors. If you specify that you want 95% confidence, then for small n Slutsky will in fact give …
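As a rough numerical check, Slutsky's CI in Equation 11.4 is easy to compute. The Python sketch below reproduces the Cat-population interval from the summary statistics reported above (x̄ = 1.70, s = 0.675, n = 10), and then runs a small coverage simulation in the spirit of the lecture's simulation studies. The function names and the choice of a Poisson(1.4) population for the simulation are my own illustrative assumptions, not part of the chapter; the true Cat population is not specified here.

```python
import math
import random

def slutsky_ci(xbar, s, n, z=1.96):
    # Equation 11.4: xbar +/- z*(s/sqrt(n)), the approximate 95% CI for mu.
    margin = z * s / math.sqrt(n)
    return (xbar - margin, xbar + margin)

# Summary statistics reported in the text for the Cat-population sample.
lo, hi = slutsky_ci(xbar=1.70, s=0.675, n=10)
print(round(lo, 2), round(hi, 2))  # -> 1.28 2.12

def poisson(mu):
    # Knuth's method: count uniform draws until their product falls below e^-mu.
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def coverage(mu, n, reps=10_000):
    # Fraction of simulated samples whose Slutsky 95% CI captures mu.
    hits = 0
    for _ in range(reps):
        sample = [poisson(mu) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        low, high = slutsky_ci(xbar, s, n)
        hits += low <= mu <= high
    return hits / reps

random.seed(0)
# For a Poisson(1.4) population (an assumption for illustration only),
# the observed coverage at n = 10 tends to fall short of the nominal 95%.
print(coverage(mu=1.4, n=10))
```

The simulation illustrates the chapter's closing point: with small n, the interval "makes too many errors," i.e., its actual coverage is below the 95% you asked for.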