Can the NCAA Basketball Tournament Seeding be Used to Predict Margin of Victory?

Tyler Smith; Neil C. Schwertman

The American Statistician, Vol. 53, No. 2. (May, 1999), pp. 94-98.
Stable URL: http://links.jstor.org/sici?sici=0003-1305%28199905%2953%3A2%3C94%3ACTNBTS%3E2.0.CO%3B2-Y
The American Statistician is currently published by American Statistical Association.

Following the announcement by the NCAA of the seeding and placement of men's basketball teams in the regional tournaments, there is often much discussion among basketball aficionados of the fairness.
A statistical analysis of simple regression models for the tournament games shows that indeed there is a strong association between the seed positions of the teams and the actual margin of victory; in fact, fairly reliable prediction models of actual margin of victory in tournament games can be achieved based primarily on the seed numbers alone.

KEY WORDS: Nonlinear effects; Press; Regression.

1. INTRODUCTION

One of the most popular and publicized collegiate competitions is the NCAA Men's Basketball tournament, which many call "March Madness." Several previous studies of this tournament by Schwertman, McCready, and Howard (1991); Schwertman, Schenk, and Holbrook (1996); and Carlin (1996) have focused on models for predicting the probability of each seed winning the regional tournament. Carlin (1996) used very basic regression models to predict probability of winning using seed positions and computer rankings, such as the SAGARIN ratings, as the independent variables. In this article, however, rather than focusing on the probability of winning a contest, we concentrate on building somewhat more complex regression models using the information provided by seed positions for accurately predicting the actual margin of victory.

In this study of the NCAA regional men's basketball tournament, our primary objective is to determine, using no more than second order regression models, if (1) seed positions alone provide sufficient information to accurately predict margin of victory; (2) the difference in seed positions is sufficient to provide accurate predictions of margin of victory; and (3) there is evidence that the NCAA selection committee does a good job of seeding the tournament.

To organize the NCAA men's basketball tournament, each spring the NCAA selection committee designs four regional 16-team single-elimination basketball tournaments for Division 1-A.
The committee not only selects the teams (although certain conference and conference tournament champions are included automatically), but also seeds the teams based upon a consensus of team strength. The committee attempts to have the corresponding seeds across regional tournaments approximately equal in strength.

The format for each regional tournament is given in Figure 1, where the number 1 seed (strongest team) plays the number 16 seed (weakest team), the number 2 seed (second strongest) plays the number 15 seed (second weakest), and so on. In the first round, at least, the stronger teams (lower seed numbers) have a definite advantage as they are paired with the weaker teams (higher seed numbers). If there are no upsets, this same advantage occurs in the second and subsequent rounds. Of course, upsets usually occur and may result in a middle-level team playing a much weaker team than the top seed is playing. Incidentally, in the 56 regional tournaments using the 16-team format, only in the eastern regional of the first year (1985) were there no upsets (i.e., the lower seed number defeated the higher seed number in all 15 games).

Tyler Smith is Statistician/Data Analyst, Department of Statistics, University of Kentucky, Lexington, KY 40506. Neil C. Schwertman is Professor of Statistics, Department of Mathematics and Statistics, California State University, Chico, CA 95929-0525.

2. THE MODELS

Because simple models for predicting the dependent variable, margin of victory, are more understandable and provide a clearer picture of the relationship between the dependent and independent variables, only basic regression models of no more than second order were considered. The dependent variable for our models is the actual margin of victory for the team with the lower seed number, with a negative sign if an upset occurs (that is, if the team with the higher seed number wins).
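The first-round pairing rule and the sign convention for the dependent variable can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the article: the function names are hypothetical, the pairings are listed in seed order rather than in the bracket order of Figure 1, and `second_order_terms` shows a generic full second-order polynomial in the two seed numbers, assumed here as one possible reading of "no more than second order" rather than as the model the authors fitted.

```python
def first_round_pairings(n_teams: int = 16) -> list[tuple[int, int]]:
    """Pair seed s with seed n_teams + 1 - s (1 vs 16, 2 vs 15, and so on).

    Pairs are returned in seed order, not in bracket order.
    """
    return [(s, n_teams + 1 - s) for s in range(1, n_teams // 2 + 1)]


def signed_margin(lower_seed_score: int, higher_seed_score: int) -> int:
    """Margin of victory for the team with the lower seed number.

    Positive when the lower seed number wins, negative when an upset
    occurs (the team with the higher seed number wins).
    """
    return lower_seed_score - higher_seed_score


def second_order_terms(lower_seed: int, higher_seed: int) -> list[float]:
    """Regressors for a generic full second-order model in the two seeds.

    Assumption for illustration: intercept, both linear terms, the
    cross product, and both squares. The article's actual model terms
    are not reproduced here.
    """
    a, b = float(lower_seed), float(higher_seed)
    return [1.0, a, b, a * b, a * a, b * b]


if __name__ == "__main__":
    print(first_round_pairings())
    # Hypothetical final scores: the higher seed number wins 75 to 70,
    # so the recorded margin is negative (an upset).
    print(signed_margin(70, 75))
    print(second_order_terms(1, 16))
```

Recording the margin from the perspective of the lower seed number means a single regression covers both expected results and upsets, so no separate indicator for the winner is needed.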
Our goal is to use only seed positions, or functions of seed positions, in linear regression models to predict the margin of victory. Therefore, the independent variables for the models are yearly trend and the lower and higher seed numbers, $x_1$ and $x_2$, respectively, or functions of $x_1$ and $x_2$. One function of $x_1$ and $x_2$ that Schwertman, McCready, and Howard (1991) and Schwertman, Schenk, and Holbrook (1996) found quite effective in predicting the winner in each contest is a nonlinear transformation of seed number based on the normal