Hypothesis Testing
LIR 832
Lecture #3
Sept, 2008

Topics of the Day
A. Our Fundamental Problem Again: Learning About Populations from Samples
B. Basic Hypothesis Testing: One-Tailed Tests Using a Z Statistic (statements such as μ ≥ 25 or μ ≤ −6)
C. Probability and Critical Cutoff Approaches: Really the Same Thing
D. How do we do hypothesis tests on small samples (n = 30 or less)?
E. How do we do hypothesis testing when we have information on the population standard deviation? On sample standard deviations?
F. How do we test a statement such as μ = X (two-tailed test)?
G. How do we test for differences in means of two populations?

Hypothesis Testing
Fundamental Problem: We want to know about a population that is not observed, and we want to use a sample to learn about that population.
Our problem is that sampling variability makes a sample an inexact estimator of the population. We need to come up with a method to learn about the population from samples which allows for sampling variability.

Hypothesis Testing: Example
We are considering implementing a training program which purports to improve quality and reduce the number of defects.
Currently, 10 out of every 1,000 parts produced are not within spec. The standard deviation of defects is 8 parts (variance is 64).
The typical employee produces 1,000 parts per day.
The program costs $1,000 per employee and we have 10,000 production employees.
We are unwilling to spend $10,000,000 for a pig in a poke. Instead, we decide to run a pilot on 100 employees to determine whether the program is effective for our employees. The firm that does the training will do the program for free, so our only cost is lost production for the time during which employees are trained.
Note that the 100 employees are a pilot or, in our terminology, a sample.

Hypothesis Testing: Example
Employees are sent to the program and then given several days under instruction to apply what they have learned to their work.
We run a one-day test on the employees and find that they average 8 defective parts per thousand.
Defects are down, but is this really an improvement, or is it simply the result of sampling variation? Could we be reasonably certain, if we trained a second group of employees or ran this same test next week, that defects would also be down?

Hypothesis Testing: Example
Abstractly, we are faced with the problem of distinguishing whether the program is effective or whether the improvement is reasonably explained by sampling variation (aka luck).

Hypothesis Testing: Example
Let's approach this as a statistical problem. We know that historically there have been 10 defects per 1,000 with a standard deviation of 8. So our question is, "How likely is it that we have pulled a sample of 100 employees with a mean defect rate of 8 if, in fact, the training program did not work (in other words, that the population rate of defects remains 10 per 1,000)?"

Hypothesis Testing: Example
You can set this problem up as you have been already:

$$P(\bar{x} < 8) \quad \text{or} \quad P(\bar{x} < 8 \mid \mu \ge 10) = P\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \frac{8 - 10}{8/\sqrt{100}}\right) = P\left(z < \frac{-2}{0.8}\right) = P(z < -2.5)$$

As the sample is larger than 30, we can use our z-table. P(z < −2.5) = 0.62%, which is very small.
There is so small a likelihood that the sample we have observed was drawn from a population with a mean of 10 that we reject the possibility that the training program was ineffective.

Hypothesis Testing: Example
What do I mean when I say that the probability is 0.62%? Two possible interpretations:
1. Suppose we set up a population with a mean of 10 and a standard deviation of 8 and draw samples of 100 from that population. Now imagine repeating this experiment 1,000 times. We would expect that slightly over 6 of those samples would have a mean of 8 or less.
2. Alternatively, if the population had a mean of 10 and a standard deviation of 8, we would expect that 0.62% of the time we would draw a sample with a mean of 8 or less by chance.
I find the first approach makes it easier to understand what I mean when I say there is a 0.62% probability that an event occurred by chance, but you may prefer the second.

Hypothesis Testing: Example
Thinking about this graphically, there are two extreme possibilities:
1. Training did little or nothing.
2. The training was highly effective (really useful, in Thomas the Tank Engine terminology).

Hypothesis Testing: Example
If the training did nothing, we would expect the distribution of defects post training to look a lot like the distribution of defects prior to training.

Hypothesis Testing: Example
So we can use the old distribution as a standard against which to judge our sample results. If the sample looks a lot like the old distribution, it would be reasonable to believe that the training did not work.
For convenience, we add cut points at ±2 standard deviations of the sample mean (1.6 = 2 × 0.8), that is, at 8.4 and 11.6.

Hypothesis Testing: Example
When we pull our sample, we calculate a mean and compare it to the old distribution. So, for example, our sample returns a value of 8. This lies to the left of the low cut point of 8.4, and we conclude it is very unlikely that the training did nothing.

Hypothesis Testing: Example
Let's go behind the graphs to the underlying, if unobserved, population.

Hypothesis Testing: Formalizing the Steps
Step 1: State our beliefs about the world clearly (hypotheses).
Example: The consulting firm's contention is that their training program reduces defect rates.
Another possibility is that the training program is ineffective and that any changes in defect rates are simply the result of sampling variability (randomness).

Hypothesis Testing: Formalizing the Steps
Step 2: Formalize this into the alternative and null hypotheses:
Alternative: $H_A: \mu_{\text{post training}} < 10$: the training program is effective.
Null: $H_0: \mu_{\text{post training}} \ge 10$: the training program has no effect on defect rates.

Hypothesis Testing: Formalizing the Steps
More about Step 2:
The two hypotheses are about the unobserved population. We will use samples to test these two hypotheses.
Together, the null and alternative cover all possible outcomes.
The null is about change being the result of sampling variability, not the result of
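
To check the arithmetic in the training example by machine, here is a minimal Python sketch. It assumes the numbers given in the example (historical mean of 10 defects per 1,000, standard deviation of 8, pilot sample of 100 employees, observed sample mean of 8). The function name `one_tailed_z_test` and its parameter names are made up for illustration and are not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_test(sample_mean, null_mean, pop_sd, n):
    """Lower-tail z test of the null 'mu >= null_mean' (hypothetical helper).

    Returns the z statistic and the probability of drawing a sample mean
    this low or lower if the population mean really were null_mean.
    """
    standard_error = pop_sd / sqrt(n)          # sd of the sample mean
    z = (sample_mean - null_mean) / standard_error
    p_value = NormalDist().cdf(z)              # area to the left of z
    return z, p_value


# Numbers from the training example: mean 10, sd 8, n = 100, observed 8.
z, p = one_tailed_z_test(sample_mean=8, null_mean=10, pop_sd=8, n=100)

# Low cut point at 2 standard deviations of the sample mean below 10.
low_cut = 10 - 2 * (8 / sqrt(100))

print(f"z = {z:.2f}, p = {p:.4f}, low cut point = {low_cut:.1f}")
# prints: z = -2.50, p = 0.0062, low cut point = 8.4
```

The probability approach (p of about 0.62%) and the cutoff approach (8 lies below 8.4) lead to the same decision here: reject the null that the training had no effect.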