IJDAR (2002) 4: 191–204

Performance evaluation of pattern classifiers for handwritten character recognition

Cheng-Lin Liu, Hiroshi Sako, Hiromichi Fujisawa

Central Research Laboratory, Hitachi, 1-280 Higashi-koigakubo, Kokubunji-shi, Tokyo 185-8601, Japan; e-mail: {liucl, sakou, fujisawa}@crl.hitachi.co.jp

Received: July 18, 2001 / Accepted: September 28, 2001

Abstract. This paper describes a performance evaluation study in which some efficient classifiers are tested in handwritten digit recognition. The evaluated classifiers include a statistical classifier (modified quadratic discriminant function, MQDF), three neural classifiers, and an LVQ (learning vector quantization) classifier. They are efficient in that high accuracies can be achieved at moderate memory space and computation cost. The performance is measured in terms of classification accuracy, sensitivity to training sample size, ambiguity rejection, and outlier resistance. The outlier resistance of neural classifiers is enhanced by training with synthesized outlier data. The classifiers are tested on a large data set extracted from NIST SD19. The results show that the test accuracies of the evaluated classifiers are comparable to or higher than those of the nearest neighbor (1-NN) rule and regularized discriminant analysis (RDA). It is shown that neural classifiers are more susceptible to small sample size than MQDF, although they yield higher accuracies on large sample size. The polynomial classifier (PC), a neural classifier, gives the highest accuracy and performs best in ambiguity rejection. On the other hand, MQDF is superior in outlier rejection even though it is not trained with outlier data. The results indicate that pattern classifiers have complementary advantages and that they should be appropriately combined to achieve higher performance.

Keywords: Handwritten character recognition, Pattern classification, Outlier rejection, Statistical classifiers, Neural networks, Discriminative learning, Handwritten digit recognition

1 Introduction

In optical character recognition (OCR), statistical
classifiers and neural networks are prevalently used for classification due to their learning flexibility and cheap computation. Statistical classifiers can be divided into parametric classifiers and non-parametric classifiers [1, 2]. Parametric classifiers include the linear discriminant function (LDF), the quadratic discriminant function (QDF), the Gaussian mixture classifier, etc. An improvement to QDF, named regularized discriminant analysis (RDA), was shown to be effective in overcoming inadequate sample size [3]. The modified quadratic discriminant function (MQDF) proposed by Kimura et al. was shown to improve the accuracy, memory, and computation efficiency of the QDF [4, 5]. Non-parametric classifiers include the Parzen window classifier, the nearest neighbor (1-NN) and k-NN rules, the decision tree, the subspace method, etc. Neural networks for pattern recognition include the multilayer perceptron (MLP) [6], the radial basis function (RBF) network [7], the probabilistic neural network (PNN) [8], the polynomial classifier (PC) [9, 10], etc. The LVQ (learning vector quantization) classifier [11, 12] can be viewed as a hybrid, since it takes the 1-NN rule for classification while the prototypes are designed in discriminative learning, as for neural classifiers. The recently emerged classifier, the support vector machine (SVM) [13, 14], has many unique properties compared to traditional statistical and neural classifiers.

For character recognition, the pattern classifiers are usually used for classification based on heuristic feature extraction [15], so that a relatively simple classifier can achieve high accuracy. The efficiency of feature extraction and the simplicity of classification algorithms are preferable for real-time recognition on low-cost computers. The features frequently used in character recognition include the chaincode feature (direction code histogram) [4], the K-L expansion (PCA) [16–18], the Gabor transform [19], etc. Some techniques were proposed to extract more discriminative features to achieve high accuracy [20, 21]. Neural networks can also directly work on
character bitmaps to perform recognition. This scheme needs a specially designed and rather complicated architecture to achieve high performance, such as the convolutional neural network [22].

For pattern classification, neural classifiers are generally trained in discriminative learning, i.e., the parameters are tuned to separate the examples of different classes as much as possible. Discriminative learning has the potential to yield high classification accuracy, but training is time-consuming and the generalization performance often suffers from overfitting. In contrast, for statistical classifiers, the training data of each class is used separately to build a density model or discriminant function. Neural networks can also be built in this philosophy, called the relative density approach [23]. With this approach it is possible, and usually necessary, to fit more parameters without degrading generalization performance. Instances of this approach are the subspace method [24], the mixture linear model [23], and the auto-associative neural network [25]. We can view the statistical classifiers and the relative density approach as density models or generative models, as opposed to the discriminative models.

In character field recognition, especially integrated segmentation-recognition (ISR) [26–28], we are concerned not only with the classification accuracy of the underlying classifier but also with its resistance to outliers. In this paper, we mean by outliers the patterns that do not belong to any of the classes that we aim to detect and classify. In ISR, because the characters cannot be segmented reliably prior to classification, the trial segmentation will generate some intermediate non-character patterns. The non-character patterns are outliers and should be assigned low confidence by the underlying classifier so as to be rejected. In this paper, we will evaluate the outlier rejection performance as well as the classification accuracy of some classifiers.
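
As a rough illustration of two ideas discussed above (building a density model for each class from that class's data alone, and rejecting patterns to which every class assigns low confidence), the following Python sketch may be helpful. It is not the MQDF or any other classifier evaluated in this paper: it assumes a plain Gaussian model with diagonal covariance per class, and the function names, the variance floor, the toy data, and the rejection threshold are all hypothetical choices made only for this example.

```python
import math

def fit_class_models(samples_by_class, var_floor=1e-3):
    """Fit one diagonal-Gaussian model per class, using only that class's samples."""
    models = {}
    for label, samples in samples_by_class.items():
        n, dim = len(samples), len(samples[0])
        mean = [sum(x[d] for x in samples) / n for d in range(dim)]
        # Floor the variances so that a nearly constant feature cannot cause division by ~0.
        var = [max(sum((x[d] - mean[d]) ** 2 for x in samples) / n, var_floor)
               for d in range(dim)]
        models[label] = (mean, var)
    return models

def quadratic_distance(x, model):
    """Twice the negative log-likelihood under a diagonal Gaussian, up to an additive constant."""
    mean, var = model
    return sum((xi - m) ** 2 / v + math.log(v) for xi, m, v in zip(x, mean, var))

def classify_with_rejection(x, models, reject_threshold):
    """Return the best-fitting class label, or None if even that class fits poorly."""
    label, dist = min(((c, quadratic_distance(x, m)) for c, m in models.items()),
                      key=lambda pair: pair[1])
    return label if dist <= reject_threshold else None

# Toy two-class, two-feature data (assumed values, for illustration only).
train = {
    "0": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2]],
    "1": [[3.0, 3.1], [3.2, 2.9], [2.9, 3.0], [3.1, 3.2]],
}
models = fit_class_models(train)

print(classify_with_rejection([0.1, 0.0], models, reject_threshold=10.0))     # near class "0"
print(classify_with_rejection([3.0, 3.0], models, reject_threshold=10.0))     # near class "1"
print(classify_with_rejection([10.0, -10.0], models, reject_threshold=10.0))  # far from both: rejected
```

In this sketch a pattern far from every class model receives a large distance from all classes and is rejected as an outlier, without the model ever having seen outlier data during training. In practice the threshold would have to be tuned on validation data.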
The evaluated classifiers include a statistical classifier (MQDF), three neural classifiers (MLP, RBF classifier, PC), and an LVQ classifier. We selected these classifiers as objects because they are efficient in the sense that high accuracy can be achieved at moderate memory space and computation cost. SVM does not belong to this category because it is very expensive in learning and recognition, even though it gives superior accuracy [29]. In the test case of handwritten digit recognition, we will give the results of classi

