UT Dallas CS 6375 - ADA-Boosting3_mit

BOOSTING (ADABOOST ALGORITHM)
Eric Emer

Consider a Horse-Racing Gambler
• Rules of thumb for determining Win/Loss:
  • Most favored odds
  • Fastest recorded lap time
  • Most wins recently, say, in the past month
• It is hard to determine how the gambler combines his analysis of the feature set into a single bet.

Consider MIT Admissions
• 2-class system (Admit/Deny)
• Both quantitative and qualitative data
• We treat (Y/N) answers as quantitative (-1, +1)
• Region, for instance, is qualitative.

Rules of Thumb, Weak Classifiers
• It is easy to come up with rules of thumb that classify the training data better than chance.
  • E.g. IF "GoodAtMath" == Y THEN predict "Admit".
• It is difficult to find a single, highly accurate prediction rule. This is where our weak learning algorithm, AdaBoost, helps us.

What is a Weak Learner?
• For any distribution, with high probability, given polynomially many examples and polynomial time, we can find a classifier whose generalization error is better than random guessing: error ε < 1/2, or equivalently an edge γ > 0 so that the error is (1/2 - γ).

Weak Learning Assumption
• We assume that our weak learning algorithm (weak learner) can consistently find weak classifiers (rules of thumb that classify the data correctly more than 50% of the time).
• Given this assumption, we can use boosting to generate a single weighted classifier that correctly classifies our training data at 99%-100%.

AdaBoost Specifics
• How does AdaBoost weight training examples optimally?
  • It focuses on the difficult data points: those misclassified most often by the previous weak classifiers.
• How does AdaBoost combine these weak classifiers into a comprehensive prediction?
  • It uses an optimally weighted majority vote of the weak classifiers.

AdaBoost Technical Description
• Missing details: How do we generate the distribution D_t? How do we get a single classifier?

Constructing D_t
• D_1(i) = 1/m, and given D_t and h_t:
  D_{t+1}(i) = D_t(i)/Z_t * e^(-α_t)   if y_i = h_t(x_i)
  D_{t+1}(i) = D_t(i)/Z_t * e^(α_t)    if y_i ≠ h_t(x_i)
  or, more compactly, D_{t+1}(i) = D_t(i)/Z_t * e^(-α_t y_i h_t(x_i)),
  where Z_t is a normalization constant and α_t = (1/2) ln((1 - ε_t)/ε_t) > 0.

Getting a Single Classifier
• H_final(x) = sign( Σ_t α_t h_t(x) )

Mini-Problem

Training Error Analysis
• Claim: writing ε_t = 1/2 - γ_t, the training error of H_final is at most Π_t Z_t = Π_t 2√(ε_t(1 - ε_t)) = Π_t √(1 - 4γ_t²) ≤ exp(-2 Σ_t γ_t²), so it drops exponentially fast as long as each weak classifier beats chance by some edge γ_t.
• Proof
  • Step 1: Unwrap the recurrence for D_{T+1}(i).
  • Step 2: Show training error(H_final) ≤ Π_t Z_t.
  • Step 3: Show Z_t = 2√(ε_t(1 - ε_t)).

How might test error react to AdaBoost?
We expect to encounter:
• Occam's Razor
• Overfitting

Empirical results of test error
• Test error does not increase even after 1000 rounds.
• Test error continues to drop after training error reaches zero.

Difference from Expectation: The Margins Explanation
• Training error only measures the correctness of classifications; it neglects the confidence of classifications. How can we measure confidence?
  H_final(x) = sign(f(x)), where f(x) = Σ_t α_t h_t(x) / Σ_t α_t ∈ [-1, +1] and margin(x, y) = y f(x).
• margin(x, y) close to +1: high confidence, correct.
• margin(x, y) close to -1: high confidence, incorrect.
• margin(x, y) close to 0: low confidence.

Empirical Evidence Supporting the Margins Explanation
• Cumulative distribution of margins on the training examples (figure not included in this preview).
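The weight update and final vote above translate directly into code. Below is a minimal NumPy sketch (not from the slides) of AdaBoost with decision stumps as the weak learner; the stump search, helper names, and the toy dataset are illustrative assumptions, but the update D_{t+1}(i) ∝ D_t(i) e^(-α_t y_i h_t(x_i)), the weight α_t = (1/2) ln((1 - ε_t)/ε_t), the final classifier H_final(x) = sign(Σ_t α_t h_t(x)), and the normalized margins y f(x) follow the formulas on the preceding slides.

```python
import numpy as np

def train_stump(X, y, D):
    """Weak learner (illustrative): the single-feature threshold stump
    with the lowest weighted error under the distribution D."""
    m, n = X.shape
    best = None
    for j in range(n):
        for thresh in np.unique(X[:, j]):
            for polarity in (+1, -1):
                pred = np.where(X[:, j] <= thresh, polarity, -polarity)
                err = np.sum(D * (pred != y))
                if best is None or err < best[0]:
                    best = (err, j, thresh, polarity)
    return best  # (weighted error, feature index, threshold, polarity)

def stump_predict(stump, X):
    _, j, thresh, polarity = stump
    return np.where(X[:, j] <= thresh, polarity, -polarity)

def adaboost(X, y, T=20):
    """Run T rounds of boosting; return the list of (alpha_t, h_t)."""
    m = X.shape[0]
    D = np.full(m, 1.0 / m)            # D_1(i) = 1/m
    ensemble = []
    for t in range(T):
        stump = train_stump(X, y, D)
        eps = max(stump[0], 1e-12)     # weighted error eps_t
        if eps >= 0.5:                 # weak-learning assumption violated; stop
            break
        alpha = 0.5 * np.log((1 - eps) / eps)   # alpha_t
        h = stump_predict(stump, X)
        D = D * np.exp(-alpha * y * h)          # D_{t+1} before normalization
        D /= D.sum()                            # divide by Z_t
        ensemble.append((alpha, stump))
    return ensemble

def decision_value(ensemble, X):
    """f(x) = sum_t alpha_t h_t(x) / sum_t alpha_t, a value in [-1, +1]."""
    scores = sum(a * stump_predict(s, X) for a, s in ensemble)
    return scores / sum(a for a, _ in ensemble)

def predict(ensemble, X):
    return np.sign(decision_value(ensemble, X))  # H_final(x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))                 # toy data (assumption)
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
    ens = adaboost(X, y, T=20)
    margins = y * decision_value(ens, X)          # margin(x, y) = y f(x)
    print("training error:", np.mean(predict(ens, X) != y))
    print("fraction of margins below 0.5:", np.mean(margins < 0.5))
```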
Pros/Cons of AdaBoost
Pros
• Fast
• Simple and easy to program
• No parameters to tune (except the number of rounds T)
• No prior knowledge needed about the weak learner
• Provably effective given the Weak Learning Assumption
• Versatile
Cons
• Weak classifiers that are too complex lead to overfitting.
• Weak classifiers that are too weak can lead to low margins, and can also lead to overfitting.
• From empirical evidence, AdaBoost is particularly vulnerable to uniform noise.

Predicting College Football Results
• Training data: 2009 NCAAF season
• Test data: 2010 NCAAF season
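The preview does not include the feature set or the results of the football experiment, so nothing below reproduces it; as a rough sketch of the train-on-one-season, test-on-the-next setup, here is scikit-learn's AdaBoostClassifier run on a synthetic stand-in dataset (the make_classification data and the 70/30 holdout split are assumptions, not the slides' data). Its n_estimators argument plays the role of T, the only parameter noted above, and the default weak learner is a depth-1 decision tree (a stump).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-game features; the real 2009/2010 NCAAF data
# is not part of this preview.
X, y = make_classification(n_samples=1500, n_features=20, n_informative=8,
                           random_state=0)

# Emulate "train on one season, test on the next" with a simple holdout split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# n_estimators corresponds to T, the number of boosting rounds.
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

print("train accuracy:", clf.score(X_train, y_train))
print("test accuracy: ", clf.score(X_test, y_test))
```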

