PSYCHOLOGICAL SCIENCE
Special Section

NEEDED: A BAN ON THE SIGNIFICANCE TEST

John E. Hunter
Michigan State University

Abstract: The significance test as currently used is a disaster. Whereas most researchers falsely believe that the significance test has an error rate of 5%, empirical studies show the average error rate across psychology is 60%, 12 times higher than researchers think it to be. The error rate for inference using the significance test is greater than the error rate using a coin toss to replace the empirical study. The significance test has devastated the research review process. Comprehensive reviews cite conflicting results on almost every issue. Yet quantitatively accurate review of the same results shows that the apparent conflicts stem almost entirely from the high error rate for the significance test. If 60% of studies falsely interpret their primary results, then reviewers who base their reviews on the interpreted study "findings" will have a 100% error rate in concluding that there is conflict between study results.

Consider a parable: A new young science has addressed significant problems and has attracted a large number of bright, tireless, empirical scientists. There is only one problem. This new science uses a defective decision-making technique that has been shown to have a 60% error rate. Worse yet, people in that science falsely believe that the error rate is 5%. Thus, people take the results of the decision-making device to be almost always right even though it is actually wrong more often than it is right.

The question is this: Do you expect a high or a low rate of progress in this new science? Before you answer, consider this ugly fact. Suppose two researchers in this new science are testing a new hypothesis. One does an empirical study and the other flips a coin. The person who does the study has a 60% chance of error and thus has a 40% probability of being right. The person who flips a coin has a 50% probability of being right.
That is, the person who flips a coin will be right more often than the person who does a study.

My personal prediction is that progress in this new science will come at a glacial rate.

THE FACT

The new science in this parable is psychology, and the defective decision-making technique is the conventional significance test as it is currently used. That is, the decision-making technique that dominates today's journals has a 60% error rate.

The significance test as it is currently used in the social sciences just does not work. The significance test is a disaster and has been from the beginning. Whereas the typical researcher falsely believes that the significance test has an error rate of 5%, empirical studies show the average error rate across the field of psychology to be 60% (Sedlmeier & Gigerenzer, 1989). That is, the error rate for the significance test is 12 times higher than researchers think it to be. The error rate for inference using the significance test is greater than the error rate using a coin toss to replace the empirical study.

Address correspondence to John E. Hunter, Department of Psychology, 133 Snyder Hall, Michigan State University, East Lansing, MI 48824; e-mail: 06991jeh@msu.edu.

The current way of using the significance test has been a disaster for the research review process in psychology. Almost every comprehensive review in psychology cites conflicting results. Yet quantitatively accurate review of research results shows that the apparent "conflict" in results stems almost entirely from the high error rate for the significance test.
If key results in 60% of studies are interpreted falsely, then reviewers who base their reviews on the study "findings" as interpreted using the significance test will have a 100% error rate in forming opinions as to the level of agreement between study results.

Frank Schmidt, Robert Rodgers, and I have documented case after case in which the use of the significance test has caused false reviews of the literature in industrial psychology, in organizational psychology, and in management. Similar documentation can be found in almost every meta-analysis now published, though few authors make this point in reporting their meta-analyses.

An otherwise well done review that falsely cites "conflicting findings" can delay progress in an area of research for decades. In personnel selection, the use of the significance test has caused a 50-year delay in progress in some research areas!

REACTIONS

Most psychologists initially reject my argument as "too radical." They cannot believe that there can be a fatal flaw in any technique that is so widely used now and has been so widely used for 50 years. That is, most scientists know that any technique with a 60% error rate must be abandoned, and so most current psychologists think that there must be some flaw in my argument.

The reactions to the statement that the significance test has a 60% error rate fall into four main categories:

• "This is no surprise. Everyone knows that far more than 60% of studies done in psychology are garbage studies with major methodological errors. However, my studies are methodologically top notch and so my studies do not make errors."
• "A 60% error rate is impossible; the error rate is only 5%."
• "Low power can't be that bad! It's true that when the treatment has an effect, a study may have less than 95% power to detect the effect. But the significance test does have only a 5% error if the null hypothesis is true, and in fact the null hypothesis is almost always true. So the error rate cannot possibly be as high as 60%."
• "You're right, but there is nothing we can do about it. I know that the significance test has a 60% error rate, but I'm not going to tell an editor that. I include significance tests in all my papers because they'll get rejected if I don't."

Let me respond to each reaction in turn.

THE MYTH OF THE GARBAGE STUDY

Many laboratory researchers respond to my argument by saying that the significance test will have a high error rate only if the study is methodologically flawed. If the study is methodologically sound, it cannot have an error in results, and therefore the significance test will always come out right.

The question is, why are there conflicting results in the literature? Their answer is that many studies are "garbage studies," studies with fundamental design flaws. The key to knowledge is to determine which studies are garbage studies and which

VOL. 8, NO. 1, JANUARY 1997. Copyright © 1997 American Psychological Society.
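
The coin-toss comparison and the "low power" reaction above both turn on one quantity: how often a study fails to reach significance when the effect is real. The Python sketch below illustrates that arithmetic under assumptions that are not in the article: a two-sided z-test with known unit variance, two equal groups, and hypothetical effect sizes d and group sizes n chosen only for illustration. The function names are likewise hypothetical. It is a rough illustration of how low power produces a high error rate, not a reconstruction of the studies the article cites.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal distribution

def nonsignificant_rate(d, n_per_group, alpha=0.05):
    """Analytic probability that a two-sided z-test is NOT significant.

    Assumes two equal groups with known unit variance (illustrative only).
    When d != 0 this is the Type II error rate (1 - power); when d == 0
    it is simply 1 - alpha.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d / (2 / n_per_group) ** 0.5  # expected value of the z statistic
    power = (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)
    return 1 - power

def simulated_nonsignificant_rate(d, n_per_group, alpha=0.05,
                                  n_studies=4000, seed=1):
    """Monte Carlo check: run many simulated studies and count the misses."""
    rng = random.Random(seed)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = (2 / n_per_group) ** 0.5  # standard error of the mean difference
    misses = 0
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(d, 1.0) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) < z_crit:
            misses += 1
    return misses / n_studies

if __name__ == "__main__":
    # Hypothetical (d, n) pairs; a coin toss is wrong half the time.
    for d, n in [(0.2, 20), (0.5, 20), (0.5, 64), (0.8, 20)]:
        rate = nonsignificant_rate(d, n)
        print(f"d={d}, n={n}: misses a real effect {rate:.0%} of the time "
              f"(coin toss: 50%)")
```

Under these assumptions the miss rate depends only on effect size and sample size, not on the 5% significance level that researchers take to be the error rate. Whenever the real effect is small relative to the sample, the test can be wrong more often than a coin toss.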

