Penn STAT 550 - Hypothesis Testing Basic Setup


Statistics 550 Notes 20

Reading: Section 4.1-4.2

I. Hypothesis Testing Basic Setup (Chapter 4.1)

Review

Motivating Example: A graphologist claims to be able to distinguish the writing of a schizophrenic person from that of a nonschizophrenic person. The graphologist is given a set of 10 folders, each containing handwriting samples of two persons, one nonschizophrenic and the other schizophrenic. In an experiment, the graphologist made 6 correct identifications. Is there strong evidence that the graphologist identifies the writing of schizophrenics better than a person who was randomly guessing would?

Probability model: Let $p$ be the probability that the graphologist successfully identifies the writing of a randomly chosen schizophrenic vs. nonschizophrenic person. A reasonable model is that the 10 trials are iid Bernoulli with probability of success $p$.

Hypotheses:
$H_0: p = 0.5$
$H_1: p > 0.5$
The alternative/research hypothesis $H_1$ should be the hypothesis we’re trying to establish.

Test Statistic: A statistic $W(X_1, \ldots, X_n)$ that is used to decide whether to accept or reject $H_0$. In the motivating example, a natural test statistic is $W(X_1, \ldots, X_{10}) = \sum_{i=1}^{10} X_i$, the number of successful identifications of schizophrenics the graphologist makes. The observed value of this test statistic is 6.

Critical region: Region of values $C$ of the test statistic for which we reject the null hypothesis, e.g., $C = \{6,7,8,9,10\}$.

Errors in Hypothesis Testing:

                          True State of Nature
Decision                  $H_0$ is true        $H_1$ is true
Reject $H_0$              Type I error         Correct decision
Accept (retain) $H_0$     Correct decision     Type II error

The best critical region would make the probability of a Type I error small when $H_0$ is true and the probability of a Type II error small when $H_1$ is true. But in general there is a tradeoff between these two types of errors.
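The tradeoff between the two error types can be made concrete for the graphologist example. The sketch below is illustrative only: it considers one-sided critical regions of the form $\{k, \ldots, 10\}$ and evaluates the Type II error at the alternative $p = 0.7$, which is an assumed value chosen for illustration; the helper names are hypothetical.

```python
from math import comb

def binom_tail(n, p, k):
    """P(Y >= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(k, n + 1))

def error_probabilities(n, k, p_null, p_alt):
    """Type I and Type II error probabilities for the region C = {k, ..., n}."""
    type_1 = binom_tail(n, p_null, k)        # reject H0 although p = p_null
    type_2 = 1.0 - binom_tail(n, p_alt, k)   # retain H0 although p = p_alt
    return type_1, type_2

# p_alt = 0.7 is an assumed alternative, used only to illustrate the tradeoff.
tradeoff = {k: error_probabilities(10, k, 0.5, 0.7) for k in range(5, 11)}

for k, (t1, t2) in tradeoff.items():
    print(f"C = {{{k},...,10}}: P(Type I) = {t1:.3f}, P(Type II) = {t2:.3f}")
```

As the cutoff $k$ increases, the Type I error probability falls while the Type II error probability rises, so both cannot be made small at once with only 10 trials.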
Size of a test: We say that a test with test statistic $W(X_1, \ldots, X_n)$ and critical region $C$ is of size $\alpha$ if
$\alpha = \max_{\theta \in \omega_0} P_\theta[W(X_1, \ldots, X_n) \in C]$,
where $\omega_0$ denotes the set of parameter values in the null hypothesis. The size of the test is the maximum probability of making a Type I error when the null hypothesis is true, where the maximum is taken over all $\theta$ that are part of the null hypothesis (for the motivating example, the null hypothesis contains only one value of $\theta$). A test with size $\le \alpha$ is said to be a test of (significance) level $\alpha$.

Suppose we use the critical region $C = \{6,7,8,9,10\}$ with the test statistic $W(X_1, \ldots, X_{10}) = \sum_{i=1}^{10} X_i$ for the motivating example. The size of the test is $P_{p=0.5}(Y \ge 6) = 0.377$, where $Y$ has a binomial distribution with $n = 10$ and probability $p = 0.5$.

Power: The power of a test at an alternative $\theta \in \omega_1$ (the set of parameter values in the alternative hypothesis) is the probability of making a correct decision when $\theta$ is the true parameter (i.e., the probability of not making a Type II error when $\theta$ is the true parameter). The power of the test with test statistic $W(X_1, \ldots, X_{10}) = \sum_{i=1}^{10} X_i$ and critical region $C = \{6,7,8,9,10\}$ at $p = 0.6$ is $P_{p=0.6}(Y \ge 6) = 0.633$ and at $p = 0.7$ is $P_{p=0.7}(Y \ge 6) = 0.850$, where $Y$ has a binomial distribution with $n = 10$ and probability $p$. The power depends on the specific parameter in the alternative hypothesis that is being considered.

Power function: $\gamma_C(\theta) = P_\theta[W(X_1, \ldots, X_n) \in C]; \ \theta \in \omega_1$.

Neyman-Pearson paradigm: Set the size of the test to be at most some small level, typically 0.10, 0.05 or 0.01 (most commonly 0.05), in order to protect against Type I errors. Then among tests that have this size, choose the one that has the “best” power function. In Chapter 4.2, we will define more precisely what we mean by “best” power function and derive optimal tests for certain situations.

For the test statistic $W(X_1, \ldots, X_{10}) = \sum_{i=1}^{10} X_i$, the critical region $C = \{6,7,8,9,10\}$ has a size of 0.377; this gives too high a probability of Type I error. The critical region $C = \{8,9,10\}$ has a size of 0.0547, which makes the probability of a Type I error reasonably small.
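The size and power values quoted above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `rejection_probability` and the variable names are hypothetical, and the power of $\{8,9,10\}$ at $p = 0.7$ is an extra value computed here for comparison, not one given in the notes.

```python
from math import comb

def rejection_probability(region, n, p):
    """P(W in region) when W ~ Binomial(n, p)."""
    return sum(comb(n, w) * p**w * (1 - p)**(n - w) for w in region)

n = 10
region_6 = range(6, 11)   # C = {6, 7, 8, 9, 10}
region_8 = range(8, 11)   # C = {8, 9, 10}

# The null hypothesis contains the single value p = 0.5, so the maximum
# in the definition of size is taken over one point.
size_6 = rejection_probability(region_6, n, 0.5)
size_8 = rejection_probability(region_8, n, 0.5)

# Power is the same rejection probability evaluated at an alternative.
power_6_at_06 = rejection_probability(region_6, n, 0.6)
power_6_at_07 = rejection_probability(region_6, n, 0.7)
power_8_at_07 = rejection_probability(region_8, n, 0.7)

print(f"size of {{6,...,10}}: {size_6:.3f}")
print(f"size of {{8,9,10}}:  {size_8:.4f}")
print(f"power of {{6,...,10}} at p=0.6: {power_6_at_06:.3f}")
print(f"power of {{6,...,10}} at p=0.7: {power_6_at_07:.3f}")
print(f"power of {{8,9,10}} at p=0.7:  {power_8_at_07:.3f}")
```

Shrinking the critical region to $\{8,9,10\}$ lowers the size to 0.0547 but also lowers the power at every alternative, which is the cost of controlling the Type I error.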
Using $C = \{8,9,10\}$, we retain the null hypothesis for the actual experiment, in which $W$ was equal to 6.

P-value: For a test statistic $W(X_1, \ldots, X_n)$, consider a family of critical regions $\{C_\epsilon : \epsilon \in E\}$, each with a different size. For the observed value of the test statistic $W_{obs}$ from the sample, consider the subset of critical regions for which we would reject the null hypothesis, $\{C_\epsilon : W_{obs} \in C_\epsilon\}$. The p-value is the minimum size of the tests in this subset:
p-value $= \min_{\{C_\epsilon : W_{obs} \in C_\epsilon\}} \text{Size(test with critical region } C_\epsilon)$.
The p-value is a measure of how much evidence there is against the null hypothesis; it is the minimum significance level for which we would still reject the null hypothesis.

Consider the family of critical regions $C_i = \{\sum_{j=1}^{10} X_j \ge i\}$ for the motivating example. Since the graphologist made 6 correct identifications, we reject the null hypothesis for the critical regions $C_i$, $i \le 6$. The minimum size of the critical regions $C_i$, $i \le 6$, is attained at $i = 6$ and equals 0.377. The p-value is thus 0.377.

Scale of evidence

p-value       Evidence
< 0.01        very strong evidence against the null hypothesis
0.01-0.05     strong evidence against the null hypothesis
0.05-0.10     weak evidence against the null hypothesis
> 0.1         little or no evidence against the null hypothesis

Warnings:
(1) A large p-value is not strong evidence in favor of $H_0$. A large p-value can occur for two reasons: (i) $H_0$ is true or (ii) $H_0$ is false but the test has low power at the true alternative.
(2) Do not confuse the p-value with $P(H_0 \mid \text{Data})$. The p-value is not the probability that the null hypothesis is true.

II. Testing simple versus simple hypotheses: Bayes procedures

Consider testing a simple null hypothesis $H_0: \theta = \theta_0$ versus a simple alternative hypothesis $H_1: \theta = \theta_1$, i.e., under the null hypothesis $\mathbf{X} \sim P_0 \equiv P(\mathbf{X} \mid \theta_0)$ and under the alternative hypothesis $\mathbf{X} \sim P_1 \equiv P(\mathbf{X} \mid \theta_1)$.

Example 1: $X_1, \ldots, X_n$ iid $N(\mu, 1)$. $H_0: \mu = 0$, $H_1: \mu = 1$.

Example 2: $X$ has one of the following two distributions:

P(X=x)    0      1      2      3      4
$P_0$     0.1    0.1    0.1    0.2    0.5
$P_1$     0.3    0.3    0.2    0.1    0.1

Bayes procedures: Consider 0-1 loss, i.e., the loss is 1 if we choose the incorrect hypothesis and 0 if we choose the correct hypothesis. Let the prior probabilities be $\pi$ on $\theta_0$ and $1 - \pi$ on $\theta_1$. The posterior probability for $\theta_0$ is
$P(\theta_0 \mid \mathbf{X}) = \frac{P(\mathbf{X} \mid \theta_0)\,\pi(\theta_0)}{P(\mathbf{X} \mid \theta_0)\,\pi(\theta_0) + P(\mathbf{X} \mid \theta_1)\,\pi(\theta_1)}$.
The posterior risk for 0-1 loss is minimized by choosing the hypothesis with higher posterior probability. Thus, the Bayes rule is to choose $H_0$ (equivalently $\theta_0$) if
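The posterior formula above can be evaluated directly for Example 2. The sketch below is illustrative only: the prior $\pi = 0.5$ is an assumption (the notes do not fix a prior), the function names are hypothetical, and the tie-breaking choice in favor of $H_0$ is arbitrary.

```python
# Distributions from Example 2: P(X = x) for x = 0, ..., 4 under each hypothesis.
P0 = [0.1, 0.1, 0.1, 0.2, 0.5]
P1 = [0.3, 0.3, 0.2, 0.1, 0.1]

def posterior_theta0(x, prior0):
    """Posterior probability of theta_0 given X = x, with prior prior0 on theta_0."""
    numerator = P0[x] * prior0
    return numerator / (numerator + P1[x] * (1.0 - prior0))

def bayes_choice(x, prior0):
    """Index (0 or 1) of the hypothesis with the higher posterior probability.

    Ties are resolved in favor of H0; this is an arbitrary assumption.
    """
    return 0 if posterior_theta0(x, prior0) >= 0.5 else 1

prior0 = 0.5  # assumed prior probability on theta_0, for illustration only
posteriors = [posterior_theta0(x, prior0) for x in range(5)]
choices = [bayes_choice(x, prior0) for x in range(5)]

for x in range(5):
    print(f"x = {x}: P(theta_0 | x) = {posteriors[x]:.3f}, choose H{choices[x]}")
```

Under this assumed prior, the rule selects $H_0$ only for $x = 3$ and $x = 4$, the values where $P_0$ assigns more probability than $P_1$.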

