AN AUTOMATIC CLASSIFICATION SYSTEM APPLIED IN MEDICAL IMAGES

Bo QIU, Chang Sheng XU, Qi TIAN

Institute for Infocomm Research (I2R), Singapore 119613
{qiubo, xucs, tian}@i2r.a-star.edu.sg

1-4244-0367-7/06/$20.00 ©2006 IEEE. ICME 2006.

ABSTRACT

In this paper, a multi-class classification system is developed for medical images. We have mainly explored ways to use different image features, and compared two classifiers: Principal Component Analysis (PCA) and Support Vector Machines (SVM) with RBF (radial basis function) kernels. Experimental results showed that SVM with a combination of the middle-level blob feature and low-level features (down-scaled images and their texture maps) achieved the highest recognition accuracy. Using the 9000 given training images from ImageCLEF05, our proposed method achieved a recognition rate of 88.9% in a simulation experiment. According to the evaluation result from the ImageCLEF05 organizer, our method achieved a recognition rate of 82% over its 1000 testing images.

1. INTRODUCTION

With the fast development of modern medical devices, more and more medical images are generated, so the demand for automatically indexing, comparing, analyzing, and annotating the huge volume of medical images becomes more and more urgent.

Medical images are a kind of medical evidence for patients and doctors. To interpret this evidence, doctors generally use specialist vocabulary and natural-language phrases, and relate them to specific cases. This is difficult for less experienced doctors, and automatic annotation of medical images would help them considerably. Automatic annotation is a kind of automatic machine-based reasoning on the evidence gathered, so additional interpretive semantics must be attached to the image data. Some methods have been explored for this in special domains, such as the diagnosis of breast cancer [1]. In a wider domain, however, there is still no popular method for automatic annotation, owing to the variety of medical images and the lack of relevant domain knowledge. In this paper we therefore simplify the problem into a multi-class classification problem, in which the classification labels assigned to the classes are regarded as a simple annotation.

According to [2], classification methods include parametric and nonparametric ones. With given training data, only parametric methods are considered in this paper, which include Bayesian estimation, Maximum Likelihood, Hidden Markov models, Expectation-Maximization, Fisher Linear Discriminant, Multiple Discriminant Analysis, etc.; Linear Discriminant functions, Perceptron Criterion Function, Relaxation Procedures, Minimum Squared Error Procedures, PCA, SVM, Ho-Kashyap Procedures, etc.; Multi-layer Neural Networks, Stochastic methods, Simulated Annealing, Boltzmann learning, Evolutionary methods, etc.

The methods above have been applied successfully in many fields [2]. The classification of medical images, however, remains a new and great challenge because, compared with other classification problems, medical images present some particular difficulties:

• Great unbalance between classes. Figure 1 shows the size of each class in our database (see the experiment part). Class 6 has more than 500 samples, class 12 has more than 2,500 samples, and class 34 has nearly 1,000 samples, while all the others are much smaller; the smallest class has only 9 samples. The 20 largest classes occupy nearly 80% of the whole dataset. This unbalance makes many common classification methods unsuitable.

Figure 1. Great unbalance between classes: (a) sizes vs. classes; (b) size percentages vs. classes.

• Visual similarities between some classes (see Figure 2). Unlike in other image databases, for medical images even skilled experts sometimes cannot find the differences between some classes visually. They may need to compare images from different sources and refer to other medical examinations, such as blood tests.

Figure 2. Visual similarities between some classes.

• Variety in one class and difficulty in defining discriminative visual features (see Figure 3). Too many modalities vary within one class, so finding a general visual feature for one class is often very difficult. In many cases medical similarities are far from visual similarities.

Figure 3. Variety in one class.

To address the difficulties mentioned above, and based on our former work [7], PCA and SVM are chosen as classifiers in this paper, and different features, from low level to middle level, are considered. Our contributions are:

• Constructing a multi-class classification system for medical images.
• Finding the most efficient features for classification through designed simulation experiments, in which some training data are used to simulate testing data.

2. FEATURE SETS

Feature extraction is a basic problem in the image processing field. After reviewing 56 CBIR (content-based image retrieval) systems, [3] summarizes low-level features in three main categories (color, texture, and shape) plus a single feature, layout. Similar categories of features appear in [4, 5].

The layout feature is the absolute or relative spatial position of the color. It may include the low-resolution pixel map (LRPM), which is used in our method. An LRPM is a down-scaled version of an initial image.

In our system, texture maps are calculated on both initial images and filtered images. Filtered images are generated from the initial ones by filters such as a Gaussian filter to minimize the influence of noise. Moreover, a texture histogram is calculated on these texture maps. Figure 4 shows an example of textures and LRPM.

[...] classification problem [9]. According to [10], SVM for multiple-class classification is still under development, and generally there are two types of approaches. One type incorporates multiple class labels directly into the quadratic solving algorithm. The other, more popular type combines several binary classifiers. We use SVMTorch, which belongs to the latter.

Kernel selection is a crucial issue for SVM. Different kernels accommodate different nonlinear mappings, and the performance of the resulting SVM will often hinge on the appropriate choice of the kernel [11]. There are four kernels in SVMTorch: linear, polynomial, radial basis function (RBF), and sigmoid (tanh). In our method RBF is chosen:

$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right) \qquad (1)$$

Besides the standard deviation $\sigma$, another parameter is $C$, the trade-off between training error and the margin.

To compare the effects of different methods, PCA is also applied in our experiments. A conventional PCA process starts from the construction of its generating matrix. Given a vector dataset (the training dataset) including n images, X
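As a small illustration of the RBF kernel chosen above, the following Python sketch evaluates the kernel between feature vectors and builds the corresponding kernel (Gram) matrix. It is only a minimal sketch under stated assumptions: the function names `rbf_kernel` and `gram_matrix` are hypothetical, the feature vectors are plain lists of numbers, and the code is not the SVMTorch implementation used in the experiments.

```python
import math


def rbf_kernel(x, y, sigma):
    """Return exp(-||x - y||^2 / (2 * sigma^2)) for two equal-length vectors.

    `sigma` is the kernel width (hypothetical parameter name).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance between the two feature vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(vectors, sigma):
    """Return the matrix of pairwise RBF kernel values for a list of vectors."""
    return [[rbf_kernel(u, v, sigma) for v in vectors] for u in vectors]


if __name__ == "__main__":
    # Three toy 2-D feature vectors (illustrative values, not real image data).
    samples = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
    for row in gram_matrix(samples, sigma=5.0):
        print(["%.4f" % value for value in row])
```

A small sigma makes the kernel value fall off quickly with distance, so only very similar images interact; a large sigma flattens the kernel toward 1 for all pairs. This is one reason the kernel width and C are usually tuned together.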