Pattern Recognition Letters 17 (1996) 529-536

A note on comparing classifiers 1

Robert P.W. Duin *
Pattern Recognition Group, Faculty of Applied Physics, Delft University of Technology, P.O. Box 5046, 2600 GA Delft, Netherlands

Received 23 June 1995; revised 22 November 1995

* Temporarily with the Vision, Speech and Signal Processing Group, Dept. of Electrical Engineering, Univ. of Surrey, United Kingdom. Email: [email protected]
1 This paper is based on a presentation at the ICMS Workshop on Statistics and Neural Networks, Edinburgh, April 19-20, 1995, titled "The Feedforward Network as an Automatic Classifier". The experiments presented here are new and more extensive.

Abstract

Recently many new classifiers have been proposed, mainly based on neural network techniques. Comparisons are needed to evaluate the performance of the new methods. It is argued that a straightforward fair comparison demands automatic classifiers with no user interaction. As this conflicts with one of the main characteristics of neural networks, their flexibility, the question whether they are better or worse than traditional techniques might be undecidable.

Keywords: Automatic classifiers; Benchmarking; Comparisons; Feedforward neural networks

1. Introduction

The large interest in artificial neural networks (ANN) during the last ten years has produced a number of interesting pattern recognition applications with sometimes surprisingly good results (Cheng and Titterington, 1994; Golomb et al., 1991; Kamata et al., 1992; Michie et al., 1994; Prechelt, 1994; Ripley, 1994; Sabourin and Drouhard, 1992; Schmidt, 1994; Sejnowski and Rosenberg, 1988). This at least suggests that neural networks are good for building pattern classifiers. It is regrettable that, originally, comparisons with traditional techniques like the nearest neighbor rule (NNR) were almost never made. More recently, the interest in such comparisons has grown; see, e.g., (Ripley, 1994; Prechelt, 1994). A study by Schmidt et al. (1994) showed that the NNR is equivalent or better in a number of applications, including NETtalk, the text-to-speech recognition problem studied by Sejnowski and Rosenberg (1988) that originally caused much enthusiasm for the possibilities of neural networks.

A very broad comparison study was organized in the STATLOG project (Michie et al., 1994), using 22 real-world datasets and 23 different classifiers, including a number based on ANN. This made clear that there is no unique best classifier. Several methods showed a good performance over a wide range of databases, among which the NNR and the feedforward neural network.

The question whether neural nets can outperform traditional techniques remains an intriguing one for many researchers. It is the topic of several discussions on the Internet, the theme of workshops and competitions (ICMS Workshop, 1995), and it enters the discussion sections of many research papers. It is the goal of this paper to discuss this question itself. What do we mean by "Can neural networks outperform traditional techniques?" How might such a question be answered? What are the pitfalls and traps that should be avoided? We realize that for some researchers our conclusions might not be that surprising.
Given the large set of publications, however, in which authors struggle with the problem of presenting fair comparisons, it is clear that the question deserves more attention.

2. Comparison problems

At first glance it may seem that comparing classifiers is as easy as error counting. There is, however, certainly more to say. Even if we set aside the issue of computing time, two other factors remain important: for what application(s) will the classifier be used, and by whom?

It is perfectly clear that performance differences are a function of class distributions and sample sizes, and therefore of the application. If one tries to draw conclusions that are application independent, and thus distribution free, then only performance bounds can be obtained, e.g. in terms of the Vapnik-Chervonenkis complexity of classifiers (Vapnik, 1982; Devroye, 1988). That is not the issue here. We are interested in the real performance for practical applications. Therefore, an application domain has to be defined. The traditional way to do this is by a diverse collection of datasets. In studying the results, however, one should keep in mind that such a collection does not represent any reality. It is an arbitrary collection, at most partially showing the diversity, and certainly not with any representative weight. It remains possible that for classifiers showing consistently bad behavior on the problem collection, somewhere an application exists for which they are perfectly suited.

A second issue, causing even more problems, is the dependence of the performance of some classifiers on the skill of the analyst who applies them. We will elaborate on that in the following. Some classifiers are very flexible, with many user-adjustable parameters; others are almost entirely automatic. There will be hardly any discussion about the performance of Fisher's linear discriminant or the 1-NN rule (for a given metric) on a particular dataset. However, if somebody states that he uses an ANN-based classifier, or, even more specifically, a multi-layer perceptron trained with the backpropagation rule, then there is still a wide range of possibilities. His results will be highly dependent on the architecture, the initialization procedure, his strategy for determining targets, step sizes, momentum terms and the use of weight decay. There is no such thing as a uniquely defined neural network classifier. Software packages and textbooks give guidelines on how to establish these parameters for a particular application. The result and its performance will be user dependent.
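To make the first point concrete, a minimal sketch of comparison by error counting, assuming a scikit-learn environment and an arbitrary public dataset (neither appears in this paper), could estimate the test error of two automatic classifiers at several training sample sizes. Which classifier comes out ahead may change with the sample size, so the ranking is a property of the application, not of the classifiers alone.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Fixed test set; classifiers are trained on subsamples of increasing size.
# Dataset and sample sizes are illustrative assumptions, not from the paper.
X, y = load_breast_cancer(return_X_y=True)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

rng = np.random.default_rng(0)
for n_train in (25, 50, 100, 250):
    idx = rng.choice(len(X_pool), size=n_train, replace=False)
    for name, clf in (("1-NN", KNeighborsClassifier(n_neighbors=1)),
                      ("Fisher", LinearDiscriminantAnalysis())):
        clf.fit(X_pool[idx], y_pool[idx])
        # Error counting: fraction of misclassified test objects.
        err = np.mean(clf.predict(X_test) != y_test)
        print(f"n_train={n_train:3d}  {name:6s}  test error = {err:.3f}")
```

The second point, user dependence, can be sketched in the same hedged way: the 1-NN rule requires no choices once the metric is fixed, whereas the error reported for a multi-layer perceptron depends on which of many plausible settings for the architecture, initialization, step size, momentum and weight decay the analyst happens to try. All settings below are illustrative assumptions, not recommendations from the paper.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

def test_error(clf):
    clf.fit(X_tr, y_tr)
    return np.mean(clf.predict(X_te) != y_te)

# "Automatic" classifier: once the metric is fixed, there is nothing to tune.
print("1-NN error:", round(test_error(KNeighborsClassifier(n_neighbors=1)), 3))

# The "same" multi-layer perceptron under a few different user choices of
# architecture, initialization seed, step size, momentum and weight decay (alpha).
user_choices = (
    dict(hidden_layer_sizes=(10,), learning_rate_init=0.1,
         momentum=0.5, alpha=0.0, random_state=1),
    dict(hidden_layer_sizes=(50,), learning_rate_init=0.01,
         momentum=0.9, alpha=1e-4, random_state=2),
    dict(hidden_layer_sizes=(50, 20), learning_rate_init=0.001,
         momentum=0.9, alpha=1e-2, random_state=3),
)
for params in user_choices:
    mlp = make_pipeline(StandardScaler(),
                        MLPClassifier(solver="sgd", max_iter=500, **params))
    print("MLP", params, "error:", round(test_error(mlp), 3))
```

A run of such a script produces one 1-NN error but a spread of MLP errors; which of the latter gets reported is exactly the user-dependent element discussed above.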

