Conference Reviewing Considered Harmful

Thomas Anderson
Department of Computer Science & Engineering
University of Washington

ABSTRACT

This paper develops a model of computer systems research to help prospective authors understand the often obscure workings of conference program committees. We present data to show that the variability between reviewers is often the dominant factor as to whether a paper is accepted. We argue that paper merit is likely to be Zipf distributed, making it inherently difficult for program committees to distinguish between most papers. We use game theory to show that, with noisy reviews and Zipf merit, authors have an incentive to submit papers too early and too often. These factors make conference reviewing, and systems research as a whole, less efficient and less effective. We describe some recent changes in conference design to address these issues, and we suggest some further potential improvements.

1 INTRODUCTION

Peer-to-peer systems have become a popular area of research over the past few years, reflecting the potential of these systems to provide scalable performance and a high degree of robustness for a variety of applications [5, 16, 22, 23, 24]. This line of research has resulted in substantial progress in understanding system behavior. For example, workloads, churn, and available resources are all heavy-tailed, and this is fundamental to understanding aggregate system behavior in practice [6, 19]. Modeling peers as rational, sometimes altruistic, and occasionally byzantine agents [1] is essential to building systems that are both more robust and more efficient [18, 16]. And randomness is widely used in peer-to-peer systems to improve robustness [22, 6].

In this paper, we turn our attention to another peer-to-peer system that has received less attention from the systems research community: the systems research community itself. Our approach is somewhat tongue-in-cheek, but we observe many similarities, at least on the surface, between peer-to-peer systems and the systems
research community. For one, they both lack central control. Progress occurs through the mostly independent actions of individual researchers, interacting primarily through the conference publication system. (Footnote 1: In computer systems research, peer-reviewed conferences, rather than journals, are the primary way that research results are disseminated.) Citations, and in all likelihood research reputations as well, are heavy-tailed [20]. As any program committee knows all too well, authors are often rational, sometimes altruistic, and occasionally byzantine [10]. And while randomness in conference reviewing is undesirable, some have suggested that it may dominate many decisions in practice [11].

We use concepts from peer-to-peer systems to develop a model of computer systems research conferences. In our experience, many students and even faculty find decisions made by conference program committees to be, well, inscrutable [21]. Speaking as someone who has both authored many papers and served on many program committees, the feeling is mutual: authors often think reviewers are random or biased; reviewers often worry authors are intentionally gaming the system. Both are right.

Our thesis is that conference reviewing, as it is currently practiced today, is harmful in two ways. Conference program committees spend an enormous amount of time on what ends up, for many papers, being close to a random throw of the dice. Worse, conference reviewing encourages misdirected effort by the research community that slows down research progress. By illuminating these issues, we hope to blunt their impact. We also make some suggestions to better align author and conference incentives. In devising solutions, however, we urge caution: seemingly intuitive changes to regulatory mechanisms often yield the opposite of the intended effect. We give an example of one such pitfall below.

Our model has three parts, taken directly from the peer-to-peer literature: randomness, heavy-tailed distributions, and incentives. We discuss these in turn, concluding
with a discussion of possible remedies. Since each of the elements of our model has been observed before with respect to research publications, we focus most of our discussion on the interplay between these elements.

2 RANDOMNESS

The task facing a technical conference program committee is easier said than done: under tight time constraints, select a small number of meritorious papers from among a much larger number of submissions. Authors would like a predictable and correct outcome, and they become legitimately upset when their papers are declined while obviously worse papers are accepted. While one might ascribe author complaints to the Lake Wobegon effect (everyone believes their own paper is above average) [10], authors with multiple submitted papers have a unique perspective: did the PC ranking match their own? Often the answer is no.

[Figure 1: Mean evaluation score with standard deviation for each paper submitted to four recent systems research conferences. Papers are sorted by mean review score. Panels: OSDI, Anon, NSDI, SOSP. X-axis: Paper ID. Y-axis: Overall Merit.]

How can this be? In computer systems research, individual reviewers differ significantly on the very fundamental issue of what is merit: how much to weight various factors such as likely future impact, importance of the topic, uniqueness and creativity of the solution, thoroughness of the evaluation, and effectiveness of the presentation [2, 21]. Some reviewers penalize good ideas that are incompletely evaluated, as a spur to encouraging authors to complete their work prior to publication; others do the opposite, as a way to foster follow-up work by others that may fill in the blanks. Some reviewers are willing to accept papers that take a technical approach that they personally disagree with, as long as the evaluation is
reasonable; others believe a program committee should attempt to prevent bad ideas from being disseminated.

Even if reviewers could somehow agree on all these factors, the larger the program committee, the harder it is to apply a uniform standard to all papers under review. Systems research conferences have seen a rapid increase in the number of papers submitted. Some have suggested charging authors per submission [7] as a way of reducing the flood. However, the rate of production of scientific research papers has
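
The per-paper statistics described in the Figure 1 caption (mean review score with standard deviation, papers sorted by mean) can be computed along the following lines. This is a minimal sketch, not the paper's own analysis: the function name summarize_reviews, the input layout, the descending sort order, and the scores themselves are all assumptions made for illustration.

```python
from statistics import mean, stdev


def summarize_reviews(scores_by_paper):
    """Return (paper_id, mean, stdev) rows sorted by mean score, highest first.

    scores_by_paper maps a paper ID to its list of overall-merit scores.
    A paper with a single review gets a spread of 0.0, since a sample
    standard deviation needs at least two scores.
    """
    rows = []
    for paper_id, scores in scores_by_paper.items():
        spread = stdev(scores) if len(scores) > 1 else 0.0
        rows.append((paper_id, mean(scores), spread))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Invented scores for illustration only; not data from any conference.
example_scores = {
    "paper-A": [4, 4, 4],
    "paper-B": [1, 5, 3],
    "paper-C": [2, 2],
}

for paper_id, avg, spread in summarize_reviews(example_scores):
    print(f"{paper_id}: mean={avg:.2f} stdev={spread:.2f}")
```

In this toy input, paper-B has a mean of 3 with a standard deviation of 2, so its one-standard-deviation range covers the means of both neighboring papers. That is the kind of reviewer variability this section discusses.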

