Boosting Nearest Neighbor Classifiers for Multiclass Recognition
Vassilis Athitsos and Stan Sclaroff
Presented by Mohan Sridharan

K-Nearest Neighbors??
- K-Nearest Neighbor (KNN) classifiers are popular for multi-class recognition in vision and pattern recognition.
- KNN approaches work well for multi-class problems, but need a distance measure.
- KNN is sensitive to the choice of distance measure, especially in higher dimensions.
- Solution: learn the best distance measure??

Boosting – what and why?
- We have seen boosting techniques used in several applications.
- Essentially: train a set of weak learners on the data.
- The final classification is a 'combination' of the weak learners.
- Good for high-dimensional data, but works best on binary decision problems.

KNN + Boosting??
- KNN is good for multi-class problems but not great for high-dimensional data.
- Boosting is great for high-dimensional data but 'happier' with binary problems.
- The combination of KNN and boosting makes sense for high-dimensional multi-class data.
- Still faced with the choice of distance measure, and with reducing the multi-class problem to a binary classification problem.

Contributions…
- A distance measure learned from data using boosting – a linear weighted average of a set of distance measures.
- A reduction of the multi-class classification problem to a binary classification problem, so that boosting can be used efficiently.

Triplet selection…
- Available: a set of $m$ training samples of objects with class labels, $(x_i, y(x_i))$, $x \in X$, $y \in Y$.
- Select triplets $(q, a, b)$ with $y(q) = y(a)$ and $y(q) \neq y(b)$; a distance measure $D$ classifies the triplet correctly when $D(q, a) < D(q, b)$.
- Design classifiers based on distance measures for this set of triplets.

Associating Distances with Classifiers
- Define a classifier for every distance measure $D$ on the input dataset of objects (see the first sketch after the slides):
  $\tilde{D}(q, a, b) = D(q, b) - D(q, a)$
  $\bar{D}(q, a, b) = \begin{cases} 1 & \text{if } D(q, a) < D(q, b) \\ 0 & \text{if } D(q, a) = D(q, b) \\ -1 & \text{if } D(q, a) > D(q, b) \end{cases}$
- If $\bar{D}$ correctly classifies all triplets, then $D$ is a good measure for the corresponding KNN classifier.

Learning Weighted Distance Measure
- Given: a training set of objects with class labels, and a set of distance measures.
- Choose a set of triplets.
- Evaluate the distance measures as weak learners using Generalized AdaBoost (see the second sketch after the slides).
- Output a linear weighted combination of weak learners – i.e., a linear weighted combination of distance measures:
  $H = \tilde{D}_{out}, \quad \tilde{D}_{out}(x_1, x_2) = \sum_{j=1}^{d} \alpha_j \tilde{D}_j(x_1, x_2), \quad D_{out}(x_1, x_2) = \sum_{j=1}^{d} \alpha_j D_j(x_1, x_2)$

Remember K in KNN
- Sufficient: a simple majority among the K closest neighbors.
- Sufficient: $D$ correctly classifies all triplets $(q, a, b)$ such that $a$ and $b$ are among the K nearest neighbors of $q$ among objects of classes $y(a)$ and $y(b)$ respectively.

Iterative Refinement
- $T(D, r)$ = set of triplets $(q, a, b)$: $q$ is an object from the available training set; $a$ is its $r$-th nearest neighbor of the same class; $b$ is its $r$-th nearest neighbor of a class $w \neq y(q)$ (see the third sketch after the slides).
- $T'(D, r)$ = the union of $T(D, 1), \ldots, T(D, r)$.
- Selection of $r$ is based on knowledge of the 'k' in KNN…
- Sample from $T'(D, r_{max})$ and iterate until a termination condition is met – whiteboard??
- Theoretical considerations, computational complexity…

Observations…
- Tested on 8 UCI datasets, including 3 visual datasets.
- Compared with AdaBoost (without distance measure learning) and naïve KNN.
- No clear winner – the paper accepts this!
- Each algorithm works well for some datasets – the current one does worse on the segmentation dataset ☺

Observations…
- Comparable performance with established algorithms – worth further analysis??
- Unclear: the choice of distance measures in the 'set'.
- Some magic numbers: the size of the training set.
- No convergence guarantees – future work…
- Issues of scaling to higher dimensions and larger sample sizes.

That's all folks
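First sketch: a minimal Python illustration of the triplet classifiers from the "Associating Distances with Classifiers" slide. The function names (`triplet_margin`, `triplet_classifier`) and the Euclidean distance used in the usage line are my own choices for illustration, not the authors' code.

```python
import numpy as np

def triplet_margin(D, q, a, b):
    # D~(q, a, b) = D(q, b) - D(q, a): positive when the same-class
    # object a is closer to q than the other-class object b.
    return D(q, b) - D(q, a)

def triplet_classifier(D, q, a, b):
    # D-bar(q, a, b): +1 for a correct ranking, -1 for an incorrect
    # one, 0 on a tie -- the three cases on the slide.
    return np.sign(triplet_margin(D, q, a, b))

# Usage with a hypothetical Euclidean distance measure:
euclidean = lambda x1, x2: np.linalg.norm(np.asarray(x1) - np.asarray(x2))
print(triplet_classifier(euclidean, q=[0, 0], a=[1, 0], b=[3, 0]))  # -> 1.0
```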
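Second sketch: a simplified boosting loop for the "Learning Weighted Distance Measure" slide. The paper uses a generalized (confidence-rated) AdaBoost over the real-valued $\tilde{D}_j$; this discrete variant over the $\bar{D}_j$ classifiers is only meant to show how the weights $\alpha_j$ and the combined measure $D_{out}$ arise. Note that every triplet's target label is +1 by construction, since $a$ is the same-class object.

```python
import numpy as np

def boost_distance_measures(measures, triplets, n_rounds=20):
    # measures: list of distance functions D_j(x1, x2).
    # triplets: list of (q, a, b) with y(q) == y(a) and y(q) != y(b),
    # so every triplet's target label is +1 (a should be closer).
    n = len(triplets)
    weights = np.full(n, 1.0 / n)             # distribution over triplets
    alphas = np.zeros(len(measures))
    # Precompute each measure's prediction on each triplet.
    preds = np.array([[np.sign(D(q, b) - D(q, a)) for (q, a, b) in triplets]
                      for D in measures])
    for _ in range(n_rounds):
        # Weighted error of each weak learner (ties count as errors).
        errors = (weights * (preds <= 0)).sum(axis=1)
        j = int(np.argmin(errors))
        eps = float(np.clip(errors[j], 1e-12, 1 - 1e-12))
        if eps >= 0.5:                        # no learner beats chance
            break
        alpha = 0.5 * np.log((1 - eps) / eps)
        alphas[j] += alpha
        weights *= np.exp(-alpha * preds[j])  # upweight misranked triplets
        weights /= weights.sum()
    # D_out(x1, x2) = sum_j alpha_j * D_j(x1, x2)
    def D_out(x1, x2):
        return sum(a * D(x1, x2) for a, D in zip(alphas, measures))
    return alphas, D_out
```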
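Third sketch: a hypothetical reading of the triplet sets $T(D, r)$ and $T'(D, r)$ from the "Iterative Refinement" slide. I am assuming `b` is drawn from every class $w \neq y(q)$, which the slide leaves ambiguous, and that `X` is an array of feature vectors with label array `y`; triplets are returned as index tuples.

```python
import numpy as np

def triplet_set(D, X, y, r):
    # T(D, r): for each training object q, pair its r-th nearest
    # same-class neighbor a with the r-th nearest neighbor b of each
    # class w != y(q).
    triplets = []
    for i in range(len(X)):
        dists = np.array([D(X[i], x) for x in X])
        same = [j for j in range(len(X)) if y[j] == y[i] and j != i]
        if len(same) < r:
            continue
        same = np.array(same)[np.argsort(dists[same])]
        a = same[r - 1]
        for w in set(y) - {y[i]}:             # every other class w
            other = np.where(y == w)[0]
            other = other[np.argsort(dists[other])]
            if len(other) >= r:
                triplets.append((i, a, other[r - 1]))
    return triplets

def triplet_union(D, X, y, r_max):
    # T'(D, r_max): the union of T(D, 1), ..., T(D, r_max).
    return [t for r in range(1, r_max + 1) for t in triplet_set(D, X, y, r)]
```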