WSU CSE 6363 - Ensembles of Classifiers

Slide 1: Ensembles of Classifiers
Larry Holder
CSE 6363 – Machine Learning
Computer Science and Engineering
University of Texas at Arlington

Slide 2: References
- Dietterich, "Machine Learning Research: Four Current Directions," AI Magazine, pp. 97-105, Winter 1997.

Slide 3: Learning Task
- Given a set S of training examples {(x1, y1), ..., (xm, ym)}
- Sampled from an unknown function y = f(x)
- Each xi is a feature vector <xi,1, ..., xi,n> of n discrete or real-valued features
- Class y ∈ {1, ..., K}
- Examples may contain noise
- Find a hypothesis h approximating f

Slide 4: Ensemble of Classifiers
- Goal: improve the accuracy of a supervised learning task
- Approach: use an ensemble of classifiers, rather than just one
- Challenges:
  - How to construct the ensemble
  - How to use the individual hypotheses of the ensemble to produce a classification

Slide 5: Ensembles of Classifiers
- Given an ensemble of L classifiers h1, ..., hL
- Decisions are based on a combination of the individual hl, e.g., weighted or unweighted voting
- How can we construct an ensemble whose accuracy is better than any individual classifier?

Slide 6: Ensembles of Classifiers
- Ensemble requirements:
  - The individual classifiers disagree
  - Each classifier's error is < 0.5
  - The classifiers' errors are uncorrelated
- THEN the ensemble will outperform any individual hl

Slide 7: Ensembles of Classifiers
[Figure 1: P(l of 21 hypotheses errant), where each hypothesis has error 0.3 and errors are independent. P(11 or more errant) = 0.026, so the majority vote errs far less often than any single hypothesis. A worked check appears after Slide 15.]

Slide 8: Constructing Ensembles
- Sub-sampling the training examples: one learning algorithm is run on different sub-samples of the training data to produce different classifiers
- Works well for unstable learners, i.e., learners whose output classifier undergoes major changes given only small changes in the training data
  - Unstable learners: decision trees, neural networks, rule learners
  - Stable learners: linear regression, nearest-neighbor, linear-threshold (perceptron)

Slide 9: Sub-sampling the Training Set
- Methods:
  - Cross-validated committees: use k-fold cross-validation to generate k different training sets, then learn k classifiers (see the sketch after Slide 15)
  - Bagging
  - Boosting

Slide 10: Bagging
- Given m training examples
- Construct L random samples of size m, drawn with replacement
- Each sample is called a bootstrap replicate
- On average, each replicate contains 63.2% of the distinct training examples, since the chance a given example is never drawn is (1 - 1/m)^m ≈ e^(-1) ≈ 0.368
- Learn a classifier hl for each of the L samples (see the bagging sketch after Slide 15)

Slide 11: Boosting
- Each of the m training examples is weighted according to its classification difficulty, giving a distribution pl(x); initially uniform: 1/m
- The training sample of size m for iteration l is drawn with replacement according to pl(x)
- The learner is biased toward higher-weight training examples (if the learner can use pl(x) directly)
- The error εl of classifier hl is used to bias pl+1(x)
- Learn L classifiers; each is used to modify the weights for the next learned classifier
- The final classifier is a weighted vote of the individual classifiers (see the AdaBoost sketch after Slide 15)

Slide 12: AdaBoost
[Figure 2: the AdaBoost algorithm.]

Slide 13: C4.5 with/without Boosting
[Figure: C4.5 error with boosting vs. without; each point represents 1 of 27 test domains.]

Slide 14: C4.5 with/without Bagging
[Figure: C4.5 error with bagging vs. without.]

Slide 15: Boosting vs. Bagging
[Figure: boosting error vs. bagging error.]
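The 0.026 in Figure 1 is a binomial tail: with 21 independent hypotheses, each wrong with probability 0.3, the unweighted majority vote errs only when 11 or more hypotheses err. A minimal check (the variable names are mine, not from the slides):

```python
# Checking the Figure 1 calculation: 21 independent hypotheses, each with
# error 0.3. The majority vote errs only if 11 or more hypotheses err.
from math import comb

L, p = 21, 0.3
p_majority_err = sum(comb(L, l) * p**l * (1 - p)**(L - l) for l in range(11, L + 1))
print(f"P(11 or more of {L} errant) = {p_majority_err:.3f}")  # prints 0.026
```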
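The cross-validated committees of Slide 9 take only a few lines given a k-fold splitter. A minimal sketch, assuming numpy arrays and scikit-learn, with a decision tree as an illustrative unstable base learner:

```python
# Cross-validated committees (Slide 9): k-fold cross-validation yields k
# overlapping training sets, each omitting one fold; learn one classifier
# per training set.
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

def cv_committee(X, y, k=10):
    committee = []
    for train_idx, _ in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
        committee.append(DecisionTreeClassifier().fit(X[train_idx], y[train_idx]))
    return committee
```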
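Bagging (Slide 10) is equally short: draw L bootstrap replicates, fit one classifier per replicate, and combine by unweighted majority vote. This sketch assumes numpy arrays and integer class labels; the decision tree base learner is again an illustrative choice:

```python
# Bagging (Slide 10): L bootstrap replicates of size m, one classifier per
# replicate, combined by unweighted majority vote over integer class labels.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging(X, y, L=25, seed=0):
    rng = np.random.default_rng(seed)
    m = len(X)
    ensemble = []
    for _ in range(L):
        idx = rng.integers(0, m, size=m)   # m draws with replacement
        ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return ensemble

def majority_vote(ensemble, X):
    preds = np.stack([h.predict(X) for h in ensemble])             # shape (L, m)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, preds)
```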
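Slide 12 gives AdaBoost only as a figure, so the sketch below follows the standard AdaBoost.M1 formulation of Freund and Schapire, which matches the weighting scheme described on Slide 11. Passing pl(x) through sample_weight, rather than resampling, assumes the learner can use the distribution directly:

```python
# AdaBoost.M1 sketch (Slides 11-12). Depth-1 trees serve as weak learners;
# sample_weight hands the distribution p_l(x) to the learner directly.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, L=50):
    m = len(X)
    p = np.full(m, 1.0 / m)               # initially uniform: 1/m
    hypotheses, betas = [], []
    for _ in range(L):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=p)
        wrong = h.predict(X) != y
        eps = p[wrong].sum()              # weighted error of h_l
        if eps <= 0.0 or eps >= 0.5:      # stop if the error leaves (0, 0.5)
            break
        beta = eps / (1.0 - eps)
        p[~wrong] *= beta                 # shrink weights of correct examples
        p /= p.sum()                      # renormalize to obtain p_{l+1}(x)
        hypotheses.append(h)
        betas.append(beta)
    return hypotheses, betas

def boosted_predict(hypotheses, betas, X, classes):
    # Final classifier: weighted vote, each h_l voting with weight log(1/beta_l).
    votes = np.zeros((len(X), len(classes)))
    for h, beta in zip(hypotheses, betas):
        pred = h.predict(X)
        for k, c in enumerate(classes):
            votes[pred == c, k] += np.log(1.0 / beta)
    return np.asarray(classes)[votes.argmax(axis=1)]
```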
Slide 16: Constructing Ensembles
- Manipulating the input features: classifiers are constructed using different subsets of the features
- Works only when there is some redundancy in the features

Slide 17: Constructing Ensembles
- Manipulating the output targets, for use when the number of classes K is large:
  - Generate L binary partitions of the K classes
  - Learn L classifiers for these two-class problems
  - Classify according to the class whose partitions received the most votes (see the sketch after Slide 26)
- Similar to error-correcting codes
- Generally improves performance

Slide 18: Constructing Ensembles
- Injecting randomness:
  - Multiple neural networks with different random initial weights
  - A randomly selected split attribute from among the top 20 in C4.5
  - A randomly selected condition from among the top 20% in FOIL (a Prolog rule learner)
  - Adding Gaussian noise to the input features
  - Making random modifications to the current hypothesis h, then using these classifiers weighted by their posterior probability (accuracy on the training set)

Slide 19: Constructing Ensembles using Neural Networks
- Train multiple neural networks, minimizing both error and correlation with the other networks' predictions
- Use a genetic algorithm to generate multiple, diverse networks
- Have the networks also predict various sub-tasks (e.g., one of the input features)

Slide 20: Constructing Ensembles
- Use several different types of learning algorithms, e.g., decision tree, neural network, nearest neighbor
- Some learners' error rates may be bad (i.e., > 0.5)
- Some learners' predictions may be correlated
- Need to check both using, e.g., cross-validation

Slide 21: Combining Classifiers
- Unweighted vote:
  - Majority vote
  - If the hl produce class probability distributions P(f(x) = k | hl), average them (see the sketch after Slide 26):
      P(f(x) = k) = (1/L) * Σ_{l=1}^{L} P(f(x) = k | hl)
- Weighted vote: classifier weights proportional to accuracy on the training data
- Learning the combination:
  - Gating function (learn the classifier weights)
  - Stacking (learn how to vote)

Slide 22: Why Ensembles Work
- Uncorrelated errors made by individual classifiers can be overcome by voting
- How difficult is it to find a set of uncorrelated classifiers?
- Why can't we find a single classifier that does as well?

Slide 23: Finding Good Ensembles
- Typical hypothesis spaces H are large
- Need a large number (ideally lg(|H|)) of training examples to narrow the search through H
- Typically, the sample S is of size m << lg(|H|)
- The subset of hypotheses in H consistent with S forms a good ensemble

Slide 24: Finding Good Ensembles
- Typical learning algorithms L employ greedy search
- Not guaranteed to find the optimal hypothesis (minimal size and/or minimal error)
- Generating hypotheses using different perturbations of L produces good ensembles

Slide 25: Finding Good Ensembles
- Typically, the hypothesis space H does not contain the target function f
- Weighted combinations of several approximations may represent classifiers outside of H
[Figure: decision surfaces defined by individual learned decision trees vs. the decision surface defined by a vote over the learned decision trees.]

Slide 26: Summary
- Advantages:
  - An ensemble of classifiers typically outperforms any one classifier
- Disadvantages:
  - Difficult to measure the correlation between classifiers from different types of learners
  - Learning time and memory constraints
  - The learned concept is difficult to interpret
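As promised under Slide 17, a minimal sketch of manipulating output targets. It assumes numpy arrays and integer class labels 0..K-1; using L independent random partitions, rather than a designed error-correcting code, is a simplification of mine:

```python
# Manipulating output targets (Slide 17): each of L random binary partitions
# of the K classes defines a two-class problem; at test time every class on
# the winning side of a partition receives one vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_partitions(X, y, K, L=15, seed=0):
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(L):
        side = rng.integers(0, 2, size=K)              # class -> side (0 or 1)
        models.append((DecisionTreeClassifier().fit(X, side[y]), side))
    return models

def classify(models, X, K):
    votes = np.zeros((len(X), K))
    for h, side in models:
        pred = h.predict(X)                            # predicted side per example
        votes += (side[None, :] == pred[:, None])      # vote for classes on that side
    return votes.argmax(axis=1)
```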
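And the unweighted combination rule from Slide 21, P(f(x) = k) = (1/L) Σ_l P(f(x) = k | hl), assuming each hl exposes a scikit-learn-style predict_proba with a shared class ordering:

```python
# Unweighted combination of class probability distributions (Slide 21):
# average P(f(x)=k | h_l) over the L classifiers and take the argmax.
import numpy as np

def combined_distribution(ensemble, X):
    probs = np.stack([h.predict_proba(X) for h in ensemble])   # (L, m, K)
    return probs.mean(axis=0)                                  # (m, K)

def classify_by_average(ensemble, X, classes):
    return np.asarray(classes)[combined_distribution(ensemble, X).argmax(axis=1)]
```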

