CMU CS 10601 - Boosting


Boosting
Recitation 9
Oznur Tastan

Outline
• Overview of common mistakes in the midterm
• Boosting

Sanity checks
• Entropy of a discrete variable is always non-negative, and equals zero only if the variable takes on a single value:
  H(X) = E[I(X)] = Σ_i p(x_i) I(x_i) = −Σ_i p(x_i) log2 p(x_i)
• Information gain is always non-negative.

Sanity checks, in decision trees:
• You cannot obtain a leaf that has no training examples.
• If a leaf contains examples from multiple classes, you predict the most common class.
• If multiple classes are tied for most common, you predict any one of them.

Common mistakes
• Many people stated only one of the two problems.
• 6.3 Controlling overfitting: if you increase the number of training examples in logistic regression, the bias remains unchanged; MLE is an approximately unbiased estimator.
• 11 Bayesian networks: many people forgot about the possibility of accidental independences.
• 12 Graphical model inference: entries in potential tables aren't probabilities.

Boosting
• As opposed to bagging and random forests, which learn many big trees, boosting learns many small trees (weak classifiers).
• Commonly used terms: learner = hypothesis = classifier.

Boosting
• Given a weak learner that can consistently classify the examples with error ≤ 1/2 − γ,
• a boosting algorithm can provably construct a single classifier with error ≤ ε, where ε and γ are small.

AdaBoost
• In the first round, all examples are equally weighted: D_1(i) = 1/N.
• At each run, concentrate on the hardest ones: the examples that were misclassified in the previous run are weighted more, so that the new learner focuses on them.
• At the end, take a weighted majority vote.

Formal description
Given (x_1, y_1), ..., (x_N, y_N) with y_i ∈ {−1, +1}, initialize D_1(i) = 1/N; this is a distribution over examples. For t = 1, ..., T:
• Train a weak learner h_t on distribution D_t; this is the classifier, or hypothesis.
• Compute its weighted error ε_t = Σ_{i : h_t(x_i) ≠ y_i} D_t(i). A mistake on an example with high weight costs much.
• Set α_t = 1/2 ln((1 − ε_t)/ε_t) and update the distribution as below.
Output the weighted majority vote H(x) = sign(Σ_t α_t h_t(x)).

Updating the distribution
D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t, where Z_t normalizes D_{t+1} to a distribution.
• Correctly predicted this example (y_i h_t(x_i) = +1): decrease the weight of the example.
• Mistaken (y_i h_t(x_i) = −1): increase the weight of the example.

Updating D_t
α_t depends on the weighted error of the classifier: for example, with ε_t = 0.3, α_t = 1/2 ln((1 − 0.3)/0.3) ≈ 0.42.

When the final hypothesis is too complex
Look at the margin of the classifier, y · Σ_t α_t h_t(x) / Σ_t α_t, and the cumulative distribution of the margins over the training examples. Although the final classifier is getting larger as rounds are added, the margins are increasing.

Advantages
• Fast
• Simple and easy to program
• No parameters to tune (except the number of rounds T)
• Provably effective
Caveats: performance depends on the data and the weak learner. Boosting can fail if the weak learners are too complex (overfitting) or too simple (underfitting).

References
Miroslav Dudik's lecture notes.
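The two entropy sanity checks are easy to verify numerically. Below is a minimal sketch in plain NumPy; the helper names entropy and information_gain are ours, not from the recitation, and the distributions passed in are made-up examples.

```python
import numpy as np

def entropy(p):
    """H(X) = -sum_i p(x_i) log2 p(x_i) for a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * log 0 = 0
    return -np.sum(p * np.log2(p))

def information_gain(p_x, p_y, p_y_given_x):
    """IG(Y; X) = H(Y) - sum_j p(x_j) H(Y | X = x_j)."""
    h_cond = sum(px * entropy(py) for px, py in zip(p_x, p_y_given_x))
    return entropy(p_y) - h_cond

# Entropy is non-negative, and zero only for a single-valued variable:
print(entropy([1.0]))                           # 0.0 (may print as -0.0)
print(entropy([0.5, 0.5]))                      # 1.0
# Information gain is non-negative; zero when X says nothing about Y:
print(information_gain([0.5, 0.5], [0.5, 0.5],
                       [[0.5, 0.5], [0.5, 0.5]]))   # 0.0
```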

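The AdaBoost loop from the formal description fits in a few lines of code. This is a minimal sketch, not the recitation's own code: it assumes labels in {−1, +1} and uses scikit-learn depth-1 decision trees as the weak learners, both of which are our choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T):
    """AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    N = len(y)
    D = np.full(N, 1.0 / N)                 # round 1: D_1(i) = 1/N
    stumps, alphas = [], []
    for t in range(T):
        # Train the weak learner on the current distribution D_t.
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        eps = D[pred != y].sum()            # weighted error of h_t
        if eps >= 0.5:                      # h_t is not a weak learner; stop
            break
        eps = max(eps, 1e-12)               # guard: a perfect stump gives eps = 0
        alpha = 0.5 * np.log((1 - eps) / eps)
        # Reweight: increase the weights of mistakes, decrease correct ones.
        D *= np.exp(-alpha * y * pred)
        D /= D.sum()                        # Z_t: renormalize to a distribution
        stumps.append(h)
        alphas.append(alpha)

    def H(X_new):
        """Weighted majority vote of the stumps."""
        votes = sum(a * h.predict(X_new) for a, h in zip(alphas, stumps))
        return np.sign(votes)

    return H, stumps, alphas
```

With ε_t = 0.3, the alpha line reproduces the slide's 1/2 ln((1 − 0.3)/0.3) ≈ 0.42, and the update then multiplies correctly classified examples by e^(−0.42) ≈ 0.65 and misclassified ones by e^(0.42) ≈ 1.53 before renormalizing.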

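The margin claim on the "too complex" slide can be checked empirically. A small sketch, reusing the stumps and alphas returned by the adaboost sketch above (again our names, not the recitation's) and assuming training arrays X_train, y_train:

```python
import numpy as np

def margins(X, y, stumps, alphas):
    """Normalized margin y * f(x) in [-1, 1] for each example."""
    alphas = np.asarray(alphas)
    f = sum(a * h.predict(X) for a, h in zip(alphas, stumps)) / alphas.sum()
    return y * f

# Cumulative distribution of the margins (the curve on the slide):
# the fraction of training examples whose margin is at most each value.
H, stumps, alphas = adaboost(X_train, y_train, T=100)
m = np.sort(margins(X_train, y_train, stumps, alphas))
cdf = np.arange(1, len(m) + 1) / len(m)
```

Rerunning with a larger T and comparing the two curves shows the effect described above: even as the final classifier keeps growing, the margin distribution keeps shifting to the right.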