CMU CS 10-601 (Introduction to Machine Learning): Boosting lecture notes (17 pages)
Boosting
Machine Learning 10-601, November 9, 2009
Geoff Gordon, Miroslav Dudík (partly based on slides of Rob Schapire and Carlos Guestrin)
http://www.cs.cmu.edu/~ggordon/10601/

Ensembles of trees

BAGGING and RANDOM FORESTS learn many big trees:
- each tree aims to fit the same target concept
- random training sets; randomized tree growth
- voting / averaging
- effect: a DECREASE in VARIANCE

BOOSTING learns many small trees (weak classifiers):
- each tree specializes to a different part of the target concept
- training examples are reweighted: higher weights where there are still errors
- voting increases expressivity
- effect: a DECREASE in BIAS

Boosting

Boosting is a general method for converting rough rules of thumb (e.g., decision stumps) into a highly accurate prediction rule. More technically: assume we are given a weak learning algorithm that can consistently find classifiers ("rules of thumb") at least slightly better than random, say with 55% accuracy in a two-class setting. Given sufficient data, a boosting algorithm can then provably construct a single classifier with very high accuracy, say 99%.

AdaBoost (Freund & Schapire, 1995)

The weak classifiers here are decision stumps: vertical or horizontal half-planes.

A typical run of AdaBoost:
- the training error drops rapidly; combining weak learners increases expressivity
- the test error does not increase with the number of trees T: robustness to overfitting

Bounding the true error (Freund & Schapire, 1997). The bound involves T, the number of rounds; d, the VC dimension of the weak learner; and m, the number of training examples. As a first guess at bounding the true error, this gives a naive bound; a typical run contradicts the naive bound.

Finer analysis: margins (Schapire et al., 1998)
- empirical evidence: the margin distribution
- theoretical evidence: large margins imply simple classifiers
- more technically: the previous bound depended on d (the VC dimension of the weak learner) and m (the number of training examples); the margin bound instead depends on the entire distribution of training margins

Practical advantages of AdaBoost. Application: detecting faces (Viola & Jones, 2001).

Caveats: hard predictions can slow down learning.

Confidence-rated predictions (Schapire & Singer, 1999): confidence-rated predictions help a lot.

Loss in logistic regression vs. loss in AdaBoost; benefits of the model-fitting view.

What you should know about boosting
- weak classifiers can be combined into strong classifiers: "weak" means slightly better than random on the training data; "strong" means eventually zero error on the training data
- AdaBoost prevents overfitting by increasing margins
- regimes in which AdaBoost overfits: the weak learner is too strong (use small trees, or stop early); the data are noisy (stop early)
- AdaBoost vs. logistic regression: exponential loss vs. log loss; single-coordinate updates vs. full optimization
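The AdaBoost loop described above (find a weak classifier, weight it, reweight the misclassified examples, repeat; then combine by a weighted vote) can be sketched as follows. This is a minimal illustration with one-dimensional decision stumps, not the lecture's own code; the exhaustive stump search and all function names are my own choices.

```python
import numpy as np

def adaboost(X, y, T):
    """Minimal AdaBoost sketch: X is a 1-D feature array, y has labels in {-1, +1}."""
    m = len(X)
    D = np.full(m, 1.0 / m)                      # example weights, initially uniform
    xs = np.sort(X)
    thresholds = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0))
    ensemble = []                                # list of (threshold, sign, alpha)
    for _ in range(T):
        # weak learner: exhaustively pick the stump with lowest weighted error
        best = None
        for thr in thresholds:
            for s in (+1, -1):
                pred = np.where(X > thr, s, -s)
                err = D[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, s, pred)
        err, thr, s, pred = best
        err = max(err, 1e-12)                    # guard against division by zero
        alpha = 0.5 * np.log((1.0 - err) / err)  # vote weight of this weak classifier
        ensemble.append((thr, s, alpha))
        # reweighting step: raise the weight of examples that are still wrong
        D *= np.exp(-alpha * y * pred)
        D /= D.sum()
    return ensemble

def predict(ensemble, X):
    """Weighted vote of the weak classifiers."""
    score = sum(a * np.where(X > thr, s, -s) for thr, s, a in ensemble)
    return np.sign(score)
```

On separable data the training error of the combined vote drops to zero within a few rounds, matching the "training error rapidly drops" behavior described on the slides.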
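The closing slides contrast AdaBoost's exponential loss with logistic regression's log loss. A small sketch (the function names are mine) makes the difference concrete: both losses decrease in the margin y·f(x), but the exponential loss explodes on badly misclassified points (large negative margins), which is one reason noisy data can hurt AdaBoost.

```python
import numpy as np

def exp_loss(margin):
    """Exponential loss used by AdaBoost: exp(-y f(x))."""
    return np.exp(-margin)

def log_loss(margin):
    """Log loss used by logistic regression: log(1 + exp(-y f(x)))."""
    return np.log1p(np.exp(-margin))

# Compare the two losses across a range of margins: nearly identical for
# confident correct predictions, wildly different for confident mistakes.
margins = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
for m, e, l in zip(margins, exp_loss(margins), log_loss(margins)):
    print(f"margin {m:+.1f}: exp loss {e:7.3f}, log loss {l:6.3f}")
```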