18.466, Dudley, March 11, 2003

CHAPTER 1. DECISION THEORY AND TESTING SIMPLE HYPOTHESES

1.1 Deciding between two simple hypotheses: the Neyman-Pearson Lemma

Probability theory is reviewed in Appendix D. Suppose an experiment has a set X of possible outcomes. The outcome has some probability distribution θ defined on X. In statistics, we typically don't know what θ is, but we have hypotheses about what it may be. After making observations, we'll try to make a decision between or among the hypotheses.

In general there could be infinitely many possibilities for θ, but to begin with we're going to look at the case where there are just two possibilities, θ = P or θ = Q, and we need to decide which it is. For example, a point x in X could give the outcome of a test for a certain disease, where P is the distribution of x for those who don't have the disease and Q is the distribution for those who do.

Often we have n observations, independent with distribution θ. Then X can be replaced by the set X^n of all ordered n-tuples (x_1, ..., x_n) of points of X, and θ by the Cartesian product measure θ^n of n copies of θ. In this way the case of n observations x_1, ..., x_n reduces to that of one observation (x_1, ..., x_n).

The probability measures P and Q are each defined on some σ-algebra B of subsets of X, such as the Borel sets in case X is the real line R or a Euclidean space. A test of the hypothesis that θ = P will be given by a measurable set A, in other words a set A in B. If we observe x in A, then we will reject the hypothesis that θ = P in favor of the alternative hypothesis that θ = Q.

Then P(A) is called the size of the test A at P. The size is the probability that we'll make the error of rejecting P when it's true, i.e. when θ = P, sometimes called a Type I error. On the other hand, Q(A) is called the power of the test A against the alternative Q. The power is the probability that, when Q is true, the test correctly rejects P and prefers Q. The complementary probability 1 − Q(A) is sometimes called the probability of a Type II error. Given P and Q, for the test A to be as effective as possible, we'd like the size to be small and the power to be large. In the rest of this section it will be shown how the choice of A can be made optimally.

Example 1.1.1. Let X = R and let P and Q be normal measures, both with variance 0.04: P = N(0, 0.04) and Q = N(1, 0.04). Larger values of x tend to favor Q, so it seems reasonable to take A as a half-line [c, ∞) for some c. At x = 1/2 the densities of P and Q are equal; for x < 1/2, P has the larger density, and for x > 1/2, Q does. So, if we have no reason in advance to prefer one of P and Q, we might take c = 1/2. Then the probabilities of the two types of errors are each about 0.0062, from tables of the normal distribution. In other words, the size is 0.0062 and the power is 0.9938. If the variances had been larger, so would the error probabilities.

It's not always best to prefer the distribution P or Q with larger density at the observation or vector of observations. In testing for a disease, an error indicating a disease when the subject is actually healthy can lead to further, possibly expensive, tests or inappropriate treatments. On the other hand, the error of overlooking a disease when the patient has it could be much more serious, depending on the severity of the disease.

Numerical values, called losses, will be assigned to the consequences of mistaken decisions. Let L_{θd} be the loss incurred when θ is true and we decide in favor of d. A correct decision will be assumed to cause zero loss, so L_{PP} = L_{QQ} = 0. The losses L_{PQ} and L_{QP} will be positive and in general will be different.
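Before moving on, the error probabilities asserted in Example 1.1.1 are easy to verify numerically. The following is a minimal Python sketch, not part of the original notes; it assumes scipy is available, and the variable names are mine.

    from scipy.stats import norm

    # Example 1.1.1: P = N(0, 0.04), Q = N(1, 0.04); both have standard deviation 0.2.
    sigma = 0.2
    c = 0.5  # reject theta = P when the observation x is in A = [c, infinity)

    size = norm.sf(c, loc=0, scale=sigma)   # P(A): probability of a Type I error
    power = norm.sf(c, loc=1, scale=sigma)  # Q(A): probability of correctly rejecting P
    print(size, 1 - power)                  # both about 0.0062

Here norm.sf is the survival function 1 − F. Raising c lowers the size but also lowers the power; making this tradeoff optimally is the subject of the Neyman-Pearson Lemma of this section's title.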
Also, the statistician may have assigned some probabilities to θ = P or θ = Q in advance, called prior probabilities, say π(P) = 1 − π(Q), with 0 < π(P) < 1. For example, it could be known from other data approximately what fraction of people in a population being tested have a disease. The part of statistics in which prior probabilities are assumed to exist is known as Bayesian statistics, as contrasted with frequentist statistics, where priors are not assumed. In this book both are treated. Later on, some pros and cons of the Bayesian and frequentist approaches will be mentioned.

It will turn out that the best tests between P and Q will be based on the ratio of densities of P and Q, called the likelihood ratio, defined as follows. In general, P or Q could have continuous or discrete parts, but P and Q are always absolutely continuous with respect to P + Q, so that there is a Radon-Nikodym derivative (RAP, 5.5.4)

    h(x) = (dP/d(P + Q))(x).

Then dQ/d(P + Q) = 1 − h. The likelihood ratio R_{Q/P}(x) of Q to P at x is defined as (1 − h(x))/h(x), or +∞ if h(x) = 0. The likelihood ratio, like h, is defined up to equality (P + Q)-almost everywhere. If P and Q have densities f and g respectively with respect to some measure ν, for example Lebesgue measure on R, then we can take R_{Q/P}(x) = g(x)/f(x) if f(x) > 0, +∞ when g(x) > 0 = f(x), or 0 when g(x) = 0 = f(x). For a proof see Appendix A.

In Example 1.1.1, R_{Q/P}(x) = e^{25(x − 0.5)}. Or, let P and Q both be Poisson distributions on the set N of nonnegative integers, with P(k) = P_λ(k) := e^{−λ} λ^k / k! and Q = P_μ for some μ ≠ λ. Then R_{Q/P}(k) = e^{λ−μ} (μ/λ)^k for all k ∈ N.

The sizes 0.05, 0.01, and 0.001 were chosen rather arbitrarily in the first half of the 20th century and used in selecting tests. So if a test A has size 0.05 or less at P and the observation x is in A, the outcome is called statistically significant and the hypothesis θ = P is rejected. If the size is 0.001 or less, the outcome is called highly significant. The levels 0.05, etc., are still in wide use in some applied fields such as medicine and psychology, although they are no longer very popular among statisticians themselves.

For discrete distributions, not many sizes of tests may be available, as in the following example.

Example 1.1.2. Let X = {0, 1, 2}, with P(0) = 0.8, P(1) = 0.05, P(2) = 0.15, Q(0) = 0.008, Q(1) = 0.002, and Q(2) = 0.99. Then R_{Q/P} …
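As a numerical check on the likelihood-ratio formulas above, here is a short Python sketch, again mine rather than the notes'; the parameter values (lam = 2, mu = 3, and the test points x and k) are arbitrary choices for illustration.

    import math
    from scipy.stats import norm, poisson

    # Normal case (Example 1.1.1): P = N(0, 0.04), Q = N(1, 0.04).
    def r_normal(x):
        return norm.pdf(x, loc=1, scale=0.2) / norm.pdf(x, loc=0, scale=0.2)

    x = 0.7
    assert abs(r_normal(x) - math.exp(25 * (x - 0.5))) < 1e-8

    # Poisson case: P = P_lam, Q = P_mu with mu != lam.
    lam, mu = 2.0, 3.0
    def r_poisson(k):
        return poisson.pmf(k, mu) / poisson.pmf(k, lam)

    k = 4
    assert abs(r_poisson(k) - math.exp(lam - mu) * (mu / lam) ** k) < 1e-8

    # Example 1.1.2: on a finite space the likelihood ratio is just Q(k)/P(k).
    P = {0: 0.8, 1: 0.05, 2: 0.15}
    Q = {0: 0.008, 1: 0.002, 2: 0.99}
    print({k: Q[k] / P[k] for k in P})  # 0.01, 0.04, and 6.6, up to rounding

In Example 1.1.2 the sample space has only three points, so only the 2^3 = 8 subsets of X are available as tests, illustrating how few sizes can be achievable in the discrete case.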