MSU CSE 847 - boosting
Outline:
• Inefficiency with Bagging
• Improve the Efficiency of Bagging
• Intuition: Education in China
• AdaBoost Algorithm
• AdaBoost Example: $\alpha_t = \ln 2$
• How to Choose $\alpha_t$ in AdaBoost?
• Optimization View for Choosing $\alpha_t$
• AdaBoost: A Greedy Approach to Optimize the Exponential Function
• Empirical Study of AdaBoost
• Bias-Variance Tradeoff for AdaBoost

Boosting
Rong Jin

Inefficiency with Bagging
[Diagram: bootstrap sampling draws samples $D_1, D_2, \dots, D_k$ from the training set $D$; a classifier $h_1, h_2, \dots, h_k$ is trained on each sample, and their predictions $\Pr(c \mid h_i, x)$ are combined with equal weight.]
• Inefficiency with bootstrap sampling:
  • Every example has an equal chance of being sampled
  • No distinction between "easy" examples and "difficult" examples
• Inefficiency with model combination:
  • A constant weight for each classifier
  • No distinction between accurate classifiers and inaccurate classifiers

Improve the Efficiency of Bagging
• Better sampling strategy: focus on the examples that are difficult to classify correctly
• Better combination strategy: accurate models should be assigned larger weights

Intuition: Education in China
[Diagram: training examples $(x_1,y_1), (x_2,y_2), (x_3,y_3), (x_4,y_4)$ are used to train Classifier 1, which makes mistakes on $(x_1,y_1)$ and $(x_3,y_3)$; Classifier 2 is trained on those mistakes and still errs on $(x_1,y_1)$; Classifier 3 is then trained on that example. The combined classifiers make no training mistakes, but may overfit the training data.]

AdaBoost Algorithm
[The AdaBoost pseudocode is shown as a figure in the original slides.]

AdaBoost Example: $\alpha_t = \ln 2$
• Start from the uniform distribution $D_0 = (1/5, 1/5, 1/5, 1/5, 1/5)$ over $(x_1,y_1), \dots, (x_5,y_5)$
• Sample $(x_1,y_1), (x_3,y_3), (x_5,y_5)$ and train $h_1$; update the weights: $D_1 = (2/7, 1/7, 2/7, 1/7, 1/7)$
• Sample $(x_1,y_1), (x_3,y_3)$ and train $h_2$; update the weights: $D_2 = (2/9, 1/9, 4/9, 1/9, 1/9)$
• Sample again, and repeat ...

How to Choose $\alpha_t$ in AdaBoost?
• Consider how to construct the best distribution $D_{t+1}(i)$ given $D_t(i)$ and $h_t$:
  1. $D_{t+1}(i)$ should be significantly different from $D_t(i)$
  2. $D_{t+1}(i)$ should create a situation in which classifier $h_t$ performs poorly
$$D_{t+1} = \arg\max_{D \in [0,1]^m} \mathrm{KL}(D \,\|\, D_t) = \arg\max_{D \in [0,1]^m} \sum_{i=1}^{m} D(i) \ln\frac{D(i)}{D_t(i)}$$
$$\text{s.t.} \quad \sum_{i=1}^{m} D(i) = 1, \qquad \sum_{i=1}^{m} D(i)\, I\big(h_t(x_i) \ne y_i\big) = 1/2$$
• The solution gives
$$\alpha_t = \frac{1}{2}\ln\frac{1+r_t}{1-r_t}, \qquad r_t = \sum_{i} h_t(x_i)\, y_i\, D_t(i)$$

Optimization View for Choosing $\alpha_t$
• $h_t(x): x \to \{+1, -1\}$ is a basis (weak) classifier
• $H_T(x)$ is a linear combination of basis classifiers:
$$H_T(x) = \alpha_1 h_1(x) + \alpha_2 h_2(x) + \cdots + \alpha_T h_T(x) = H_{T-1}(x) + \alpha_T h_T(x)$$
• Goal: minimize the training error
$$err = \frac{1}{N}\sum_{i=1}^{N} I\big(H_T(x_i) \ne y_i\big)$$
• Approximate the training error with an exponential function:
$$err \le \frac{1}{N}\sum_{i=1}^{N} \exp\big(-H_T(x_i)\, y_i\big)$$
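The weight update and the choice of $\alpha_t$ above can be put together into a short training loop. The sketch below is only an illustration of that loop under my own assumptions: binary labels in {-1, +1}, a simple exhaustive-search decision stump as the weak learner (the slides do not fix a particular weak learner), and the hypothetical helper names fit_stump and adaboost_train.

```python
import numpy as np

def fit_stump(X, y, D):
    """Pick the decision stump (feature, threshold, sign) with the
    smallest weighted training error under the distribution D."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] >= thr, sign, -sign)
                err = np.sum(D * (pred != y))
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    _, j, thr, sign = best
    return lambda Z: np.where(Z[:, j] >= thr, sign, -sign)

def adaboost_train(X, y, T=10):
    """Train T weak classifiers; y must take values in {-1, +1}."""
    N = len(y)
    D = np.full(N, 1.0 / N)                      # D_0(i) = 1/N
    hs, alphas = [], []
    for t in range(T):
        h = fit_stump(X, y, D)                   # train h_t on the weighted data
        pred = h(X)
        # r_t = sum_i D_t(i) h_t(x_i) y_i, clipped to avoid an infinite alpha
        r = np.clip(np.sum(D * y * pred), -1 + 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 + r) / (1 - r))  # alpha_t = (1/2) ln((1+r_t)/(1-r_t))
        D = D * np.exp(-alpha * y * pred)        # D_{t+1}(i) ~ D_t(i) exp(-alpha_t h_t(x_i) y_i)
        D = D / D.sum()                          # normalize by Z_t
        hs.append(h)
        alphas.append(alpha)
    # H_T(x) = sign( sum_t alpha_t h_t(x) )
    return lambda Z: np.sign(sum(a * h(Z) for a, h in zip(alphas, hs)))
```

Usage would look like `H = adaboost_train(X_train, y_train, T=10)` followed by `y_pred = H(X_test)`; any weak learner that accepts example weights could replace the stump.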
AdaBoost: A Greedy Approach to Optimize the Exponential Function
• Exponential cost function:
$$err_{\exp} = \frac{1}{N}\sum_{i=1}^{N} \exp\big(-H_T(x_i)\, y_i\big)$$
• Use the inductive form $H_T(x) = H_{T-1}(x) + \alpha_T h_T(x)$:
$$err_{\exp} = \frac{1}{N}\sum_{i=1}^{N} \exp\big(-H_{T-1}(x_i)\, y_i\big)\exp\big(-\alpha_T h_T(x_i)\, y_i\big)
= \frac{e^{-\alpha_T}}{N}\sum_{i=1}^{N} \exp\big(-H_{T-1}(x_i)\, y_i\big)\, I\big(h_T(x_i) = y_i\big)
+ \frac{e^{\alpha_T}}{N}\sum_{i=1}^{N} \exp\big(-H_{T-1}(x_i)\, y_i\big)\, I\big(h_T(x_i) \ne y_i\big)$$
• The first sum runs over the data points that $h_T(x)$ predicts correctly; the second over the data points that $h_T(x)$ predicts incorrectly.
• Minimizing the exponential function over $\alpha_T$ gives
$$\alpha_t = \frac{1}{2}\ln\frac{1+r_t}{1-r_t} \quad \text{and} \quad D_{t+1}(x_i) = \frac{D_t(x_i)\exp\big(-\alpha_t h_t(x_i)\, y_i\big)}{Z_t}$$
• AdaBoost is a greedy approach — does it overfit?
  • Empirical studies show that AdaBoost is robust in general
  • AdaBoost tends to overfit with noisy data

Empirical Study of AdaBoost
• AdaBoosting decision trees:
  • Generate 50 decision trees through the AdaBoost procedure
  • Linearly combine the decision trees using the weights computed by the AdaBoost algorithm
• In general: AdaBoost = Bagging > C4.5
• AdaBoost usually needs fewer classifiers than Bagging

Bias-Variance Tradeoff for AdaBoost
• AdaBoost can reduce both model variance and model bias
[Figure: bias and variance compared for a single decision tree, Bagging decision trees, and AdaBoosting decision trees.]
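As a rough way to reproduce the empirical setup above (50 boosted trees against a single tree and Bagging), one could use scikit-learn; the library, the synthetic dataset, and the settings below are my own illustrative choices and are not from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "single decision tree": DecisionTreeClassifier(random_state=0),
    "Bagging (50 trees)": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0),
    # AdaBoostClassifier boosts depth-1 trees (stumps) by default.
    "AdaBoost (50 trees)": AdaBoostClassifier(n_estimators=50, random_state=0),
}

for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```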

