CMU CS 10601 - Boosting (17 pages)

Boosting




Boosting. Machine Learning 10-601, Geoff Gordon and Miroslav Dudík (partly based on slides by Rob Schapire and Carlos Guestrin). http://www.cs.cmu.edu/~ggordon/10601/, November 9, 2009.

Ensembles of trees
- BAGGING and RANDOM FORESTS: learn many big trees; each tree aims to fit the same target concept; random training sets and randomized tree growth; combine by voting/averaging; DECREASE in VARIANCE.
- BOOSTING: learn many small trees (weak classifiers); each tree specializes in a different part of the target concept; reweight the training examples, with higher weights where there are still errors; voting increases expressivity; DECREASE in BIAS.

Boosting
- Boosting is a general method for converting rough rules of thumb (e.g., decision stumps) into a highly accurate prediction rule.
- Technically: assume we are given a weak learning algorithm that can consistently find classifiers ("rules of thumb") at least slightly better than random, say accuracy >= 55% in a two-class setting. Given sufficient data, a boosting algorithm can provably construct a single classifier with very high accuracy, say 99%.

AdaBoost (Freund and Schapire, 1995)
- Weak classifiers: decision stumps, i.e., vertical or horizontal half-planes. (A minimal code sketch appears after these notes.)

A typical run of AdaBoost
- Training error drops rapidly: combining weak learners increases expressivity.
- Test error does not increase with the number of trees T: robustness to overfitting.

Bounding the true error (Freund and Schapire, 1997)
- The bound depends on T (number of rounds), d (VC dimension of the weak learner), and m (number of training examples).
- A first guess at a bound grows with T, so a typical run, where test error keeps decreasing, contradicts this naive bound.

Finer analysis: margins (Schapire et al., 1998)
- Empirical evidence: the distribution of training margins keeps improving even after training error reaches zero.
- Theoretical evidence: large margins imply simple classifiers.
- More technically: the previous bound depends on d (VC dimension of the weak learner) and m (number of training examples); the margin bound depends instead on the entire distribution of training margins. (Standard forms of both bounds are reproduced after these notes.)

Practical advantages of AdaBoost
- Application: detecting faces (Viola and Jones, 2001).

Caveats
- Hard (non-confidence-rated) predictions can slow down learning.

Confidence-rated predictions (Schapire and Singer, 1999)
- Confidence-rated predictions help a lot.

Loss functions
- Loss in logistic regression vs. loss in AdaBoost; logistic regression vs. AdaBoost; benefits of the model-fitting view. (The two losses are written out after these notes.)

What you should know about boosting
- Weak classifiers -> strong classifiers: weak means slightly better than random on the training data; strong means eventually zero error on the training data.
- AdaBoost prevents overfitting by increasing margins.
- Regimes where AdaBoost overfits: weak learner too strong (use small trees or stop early); data noisy (stop early).
- AdaBoost vs. logistic regression: exponential loss vs. log loss; single-coordinate updates vs. full optimization.
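The slides themselves contain no code, so the following is a minimal sketch of AdaBoost with decision-stump weak learners, assuming NumPy. Function and variable names are illustrative, not taken from the lecture.

import numpy as np

def train_stump(X, y, w):
    """Pick the axis-aligned threshold (a 'vertical or horizontal half-plane')
    with the lowest weighted training error under example weights w."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = w[pred != y].sum()
                if best is None or err < best[3]:
                    best = (j, thr, sign, err)
    return best  # (feature index, threshold, sign, weighted error)

def adaboost(X, y, T=50):
    """Labels y must be in {-1, +1}. Returns a list of (alpha, stump) pairs."""
    m = len(y)
    w = np.full(m, 1.0 / m)                    # start with uniform weights
    ensemble = []
    for _ in range(T):
        j, thr, sign, err = train_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)   # guard against a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this stump
        pred = np.where(X[:, j] <= thr, sign, -sign)
        w *= np.exp(-alpha * y * pred)         # up-weight examples still in error
        w /= w.sum()
        ensemble.append((alpha, (j, thr, sign)))
    return ensemble

def predict(ensemble, X):
    """Weighted vote of the stumps; returns labels in {-1, +1}."""
    score = np.zeros(len(X))
    for alpha, (j, thr, sign) in ensemble:
        score += alpha * np.where(X[:, j] <= thr, sign, -sign)
    return np.sign(score)

A toy usage example, just to show the expected shapes:

X = np.vstack([np.random.randn(50, 2) + 2, np.random.randn(50, 2) - 2])
y = np.concatenate([np.ones(50), -np.ones(50)])
model = adaboost(X, y, T=20)
print(np.mean(predict(model, X) == y))   # training accuracy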
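The two generalization bounds referenced above did not survive text extraction. Their standard statements from the cited papers, up to constants and log factors (the exact form on the slides may differ), are:

\[
\Pr_{\mathcal{D}}[H(x)\neq y] \;\le\; \Pr_{S}[H(x)\neq y] \;+\; \tilde{O}\!\left(\sqrt{\tfrac{Td}{m}}\right)
\qquad \text{(Freund and Schapire, 1997)}
\]
\[
\Pr_{\mathcal{D}}[H(x)\neq y] \;\le\; \Pr_{S}\bigl[\mathrm{margin}(x,y)\le\theta\bigr] \;+\; \tilde{O}\!\left(\sqrt{\tfrac{d}{m\,\theta^{2}}}\right)
\quad \text{for any } \theta>0
\qquad \text{(Schapire et al., 1998)}
\]

Here H is the final combined classifier, S the training set, D the true distribution, T the number of rounds, d the VC dimension of the weak-learner class, m the number of training examples, and margin(x, y) = y f(x) for the normalized weighted vote f. The margin bound does not grow with T, which is consistent with test error not increasing as more rounds are added.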
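For the "loss in logistic regression" vs. "loss in AdaBoost" comparison, the standard forms of the two losses, written for training pairs (x_i, y_i) with y_i in {-1, +1} and a real-valued score f, are:

\[
L_{\text{exp}}(f) = \frac{1}{m}\sum_{i=1}^{m} \exp\bigl(-y_i f(x_i)\bigr)
\qquad \text{(AdaBoost)}
\]
\[
L_{\text{log}}(f) = \frac{1}{m}\sum_{i=1}^{m} \log\bigl(1+\exp(-y_i f(x_i))\bigr)
\qquad \text{(logistic regression)}
\]

Both are convex, decreasing functions of the margin y_i f(x_i), but the exponential loss penalizes badly misclassified points far more heavily, one reason AdaBoost is more sensitive to noisy data (the "data noisy: stop early" caveat above). AdaBoost can be viewed as greedy coordinate descent on the exponential loss, adding one weak learner per round, while logistic regression typically optimizes the log loss over all coordinates jointly; that is the "single-coordinate updates vs. full optimization" contrast in the summary.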

