BOOSTING (ADABOOST ALGORITHM)
Eric Emer

Consider a Horse-Racing Gambler
• Rules of thumb for determining win/loss:
• Most favored odds
• Fastest recorded lap time
• Most wins recently, say, in the past month
• It is hard to determine how the gambler combines his analysis of the feature set into a single bet.

Consider MIT Admissions
• 2-class system (Admit/Deny)
• Both quantitative and qualitative data
• We consider (Y/N) answers to be quantitative (-1, +1)
• Region, for instance, is qualitative.

Rules of Thumb, Weak Classifiers
• It is easy to come up with rules of thumb that classify the training data better than chance.
• E.g., IF "GoodAtMath" == Y THEN predict "Admit".
• It is difficult to find a single, highly accurate prediction rule. This is where boosting, via the AdaBoost algorithm, helps us.

What is a Weak Learner?
• For any distribution, with high probability, given polynomially many examples and polynomial time, we can find a classifier whose generalization error is better than random guessing: $\epsilon < \frac{1}{2}$, also written as generalization error $\frac{1}{2} - \gamma$ for some $\gamma > 0$.

Weak Learning Assumption
• We assume that our weak learning algorithm (weak learner) can consistently find weak classifiers: rules of thumb that classify the data correctly more than 50% of the time.
• Given this assumption, we can use boosting to generate a single weighted classifier that correctly classifies our training data at 99%-100%.

AdaBoost Specifics
• How does AdaBoost weight training examples optimally?
• It focuses on the difficult data points: those misclassified most often by the previous weak classifiers.
• How does AdaBoost combine these weak classifiers into a comprehensive prediction?
• It uses an optimally weighted majority vote of the weak classifiers.

AdaBoost Technical Description
• Given $(x_1, y_1), \dots, (x_m, y_m)$ with $y_i \in \{-1, +1\}$: for $t = 1, \dots, T$, construct a distribution $D_t$ over the examples and find a weak classifier $h_t$ with small weighted error $\epsilon_t = \Pr_{i \sim D_t}[h_t(x_i) \neq y_i]$.
• Missing details: How do we generate the distribution? How do we get a single classifier?

Constructing $D_t$
• $D_1(i) = \frac{1}{m}$
• Given $D_t$ and $h_t$:
$$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } y_i = h_t(x_i) \\ e^{\alpha_t} & \text{if } y_i \neq h_t(x_i) \end{cases} = \frac{D_t(i)}{Z_t}\, e^{-\alpha_t y_i h_t(x_i)}$$
where $Z_t$ is a normalization constant and $\alpha_t = \frac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t} > 0$.

Getting a Single Classifier
$$H_{\text{final}}(x) = \operatorname{sign}\Big(\sum_t \alpha_t h_t(x)\Big)$$
(A Python sketch of the full procedure appears after the Pros/Cons slide below.)

Mini-Problem

Training Error Analysis
Claim: if each weak classifier has error $\epsilon_t = \frac{1}{2} - \gamma_t$, then
$$\text{training error}(H_{\text{final}}) \;\le\; \prod_t Z_t \;=\; \prod_t 2\sqrt{\epsilon_t(1-\epsilon_t)} \;=\; \prod_t \sqrt{1 - 4\gamma_t^2} \;\le\; e^{-2\sum_t \gamma_t^2}.$$

Proof
• Step 1: Unwrap the recurrence for $D_{t+1}$.
• Step 2: Show $\text{training error}(H_{\text{final}}) \le \prod_t Z_t$.
• Step 3: Show $Z_t = 2\sqrt{\epsilon_t(1-\epsilon_t)}$.
(The three steps are worked out after the Pros/Cons slide below.)

How might test error react to AdaBoost?
We expect to encounter:
• Occam's Razor
• Overfitting

Empirical Results on Test Error
• Test error does not increase even after 1000 rounds.
• Test error continues to drop even after training error reaches zero.

Difference from Expectation: The Margins Explanation
• Our training error only measures the correctness of classifications; it neglects the confidence of classifications. How can we measure the confidence of classifications?
$$H_{\text{final}}(x) = \operatorname{sign}(f(x)), \qquad f(x) = \frac{\sum_t \alpha_t h_t(x)}{\sum_t \alpha_t} \in [-1, +1], \qquad \operatorname{margin}(x, y) = y\, f(x)$$
• margin(x, y) close to +1: high confidence, correct.
• margin(x, y) close to -1: high confidence, incorrect.
• margin(x, y) close to 0: low confidence.
(A short code sketch for computing margins appears after the Pros/Cons slide below.)

Empirical Evidence Supporting the Margins Explanation
• Figure: cumulative distribution of margins on the training examples.

Pros/Cons of AdaBoost
Pros
• Fast
• Simple and easy to program
• No parameters to tune (except the number of rounds T)
• No prior knowledge needed about the weak learner
• Provably effective given the Weak Learning Assumption
• Versatile
Cons
• Weak classifiers that are too complex lead to overfitting.
• Weak classifiers that are too weak can lead to low margins, and can also lead to overfitting.
• From empirical evidence, AdaBoost is particularly vulnerable to uniform noise.
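To make the algorithm slides concrete, here is a minimal Python sketch of the loop described above: construct $D_t$, fit a weak classifier, compute $\alpha_t$, reweight, and take the sign of the weighted vote. The decision-stump weak learner and all function names are assumptions of this sketch; the slides do not fix a particular weak learner.

```python
import numpy as np

def fit_stump(X, y, D):
    """Weak learner (an assumption of this sketch): picks the single-feature
    threshold rule with the lowest weighted error under the distribution D."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] >= thresh, 1, -1)
                err = D[pred != y].sum()          # weighted error under D
                if err < best_err:
                    best_err, best = err, (j, thresh, sign)
    j, thresh, sign = best
    return (lambda Z: sign * np.where(Z[:, j] >= thresh, 1, -1)), best_err

def adaboost(X, y, T=50):
    """Returns the weights alpha_t and weak classifiers h_t of H_final."""
    m = len(y)
    D = np.full(m, 1.0 / m)                       # D_1(i) = 1/m
    alphas, hs = [], []
    for t in range(T):
        h, eps = fit_stump(X, y, D)               # weak classifier, error eps_t
        if eps >= 0.5:                            # weak learning assumption fails
            break
        eps = max(eps, 1e-12)                     # guard against a perfect stump
        alpha = 0.5 * np.log((1 - eps) / eps)     # alpha_t
        D = D * np.exp(-alpha * y * h(X))         # D_t(i) * e^{-alpha_t y_i h_t(x_i)}
        D = D / D.sum()                           # ... / Z_t (normalize)
        alphas.append(alpha)
        hs.append(h)
    return alphas, hs

def H_final(alphas, hs, X):
    """Weighted majority vote: sign(sum_t alpha_t h_t(x))."""
    return np.sign(sum(a * h(X) for a, h in zip(alphas, hs)))
```

For the admissions example, each stump plays the role of one rule of thumb, e.g. a threshold on a ±1-coded "GoodAtMath" answer.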
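The three proof steps from the Training Error Analysis slide can be written out as follows. This is a reconstruction of the standard argument the slide sketches, using $f(x) = \sum_t \alpha_t h_t(x)$ from the "Getting a Single Classifier" slide.

```latex
% Step 1: unwrap the recurrence D_{t+1}(i) = D_t(i) e^{-alpha_t y_i h_t(x_i)} / Z_t:
\[
  D_{T+1}(i)
  = \frac{1}{m} \cdot \frac{\exp\!\big(-y_i \sum_t \alpha_t h_t(x_i)\big)}{\prod_t Z_t}
  = \frac{e^{-y_i f(x_i)}}{m \prod_t Z_t}.
\]
% Step 2: a mistake H_final(x_i) != y_i means y_i f(x_i) <= 0, so e^{-y_i f(x_i)} >= 1;
% the 0-1 training error is therefore bounded by the exponential loss:
\[
  \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\{H_{\mathrm{final}}(x_i) \neq y_i\}
  \le \frac{1}{m} \sum_{i=1}^{m} e^{-y_i f(x_i)}
  = \sum_{i=1}^{m} D_{T+1}(i) \prod_t Z_t
  = \prod_t Z_t.
\]
% Step 3: evaluate the normalizer with alpha_t = (1/2) ln((1 - eps_t)/eps_t)
% and eps_t = 1/2 - gamma_t, using 1 - x <= e^{-x}:
\[
  Z_t = (1-\epsilon_t)\, e^{-\alpha_t} + \epsilon_t\, e^{\alpha_t}
      = 2\sqrt{\epsilon_t (1-\epsilon_t)}
      = \sqrt{1 - 4\gamma_t^2}
      \le e^{-2\gamma_t^2}.
\]
```

So the training error decreases exponentially in T whenever every $\gamma_t \ge \gamma > 0$, which is exactly the Weak Learning Assumption.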
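Finally, a short sketch of the margin computation from the margins slides, reusing the hypothetical adaboost output from the sketch above:

```python
def margins(alphas, hs, X, y):
    """margin(x, y) = y * f(x), with f the alpha-normalized vote,
    so every margin lies in [-1, +1]."""
    f = sum(a * h(X) for a, h in zip(alphas, hs)) / sum(alphas)
    return y * f
```

Sorting these values and plotting their empirical CDF gives the cumulative distribution of margins referenced on the evidence slide.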
Predicting College Football Results
• Training data: 2009 NCAAF season
• Test data: 2010 NCAAF season