STEVENS CS 559 - CS 559 11th Set of Notes

CS 559: Machine Learning Fundamentals and Applications
11th Set of Notes
Instructor: Philippos Mordohai
Webpage: www.cs.stevens.edu/~mordohai
E-mail: [email protected]
Office: Lieb 215

Pop-Up Quiz 8

Announcement
• No office hours on Tuesday, April 20
  – Email me for Monday or Wednesday appointments

Project Logistics
• Present project in class on April 29
• Submit brief report by May 5 (midnight)
  – 6-8 pages
• Present poster in CS Department event (optional, 5% bonus)

Outline
• Ensemble methods
  – Bagging
  – Boosting
  – Random forests

Resources
• Slides based on:
  – Tutorial by Rob Schapire
  – Tutorial by Yoav Freund
  – Slides by Carlos Guestrin
  – Tutorial by Paul Viola
  – Tutorial by Ron Meir
  – Slides by Aurélie Lemmens
  – Slides by Zhuowen Tu
• Not in DHS book

Code
• Antonio Torralba (MIT)
  – http://people.csail.mit.edu/torralba/shortCourseRLOC/boosting/boosting.html
• GML AdaBoost
  – http://graphics.cs.msu.ru/ru/science/research/machinelearning/adaboosttoolbox
• Guy Leshem (AdaBoost and more)
  – http://shum.huji.ac.il/~gleshem/

Ensemble Methods
• Bagging (Breiman 1994, …)
• Boosting (Freund and Schapire 1995, Friedman et al. 1998, …)
• Random forests (Breiman 2001, …)
• Predict the class label for unseen data by aggregating a set of predictions (classifiers learned from the training data)

General Idea
[Diagram: training data S → multiple data sets S1, S2, …, Sn → multiple classifiers C1, C2, …, Cn → combined classifier H]

Ensemble Classifiers
• Basic idea: build different "experts" and let them vote
• Advantages:
  – Improve predictive performance
  – Different types of classifiers can be directly included
  – Easy to implement
  – Not too much parameter tuning
• Disadvantages:
  – The combined classifier is not transparent (black box)
  – Not a compact representation

Why Do They Work?
• Suppose there are 25 base classifiers
• Each classifier has error rate $\varepsilon = 0.35$
• Assume independence among the classifiers
• The ensemble errs only when a majority (13 or more) of the base classifiers err, so the probability that it makes a wrong prediction is
  $\sum_{i=13}^{25} \binom{25}{i} \varepsilon^{i} (1-\varepsilon)^{25-i} \approx 0.06$ (checked numerically in the sketch below)

Bagging
• Training
  – Given a dataset S, at each iteration i, a training set Si is sampled with replacement from S (i.e., bootstrapping)
  – A classifier Ci is learned for each Si
• Classification: given an unseen sample X
  – Each classifier Ci returns its class prediction
  – The bagged classifier H counts the votes and assigns the class with the most votes to X

Bagging
• Bagging works because it reduces variance by voting/averaging
  – In some pathological hypothetical situations the overall error might increase
  – Usually, the more classifiers the better
• Problem: we only have one dataset
• Solution: generate new ones of size n by bootstrapping, i.e., sampling with replacement
• Can help a lot if the data is noisy
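A quick numeric check of the 25-classifier example above (not from the original slides; a minimal Python sketch of the binomial sum):

```python
from math import comb

eps = 0.35   # error rate of each base classifier
n = 25       # number of independent base classifiers

# The majority vote errs only when 13 or more of the 25 classifiers err.
ensemble_error = sum(comb(n, i) * eps**i * (1 - eps)**(n - i)
                     for i in range(13, n + 1))
print(round(ensemble_error, 3))  # 0.06, matching the slide
```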
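The bagging procedure on the slides above translates almost line for line into code. A minimal sketch, assuming numpy arrays, binary labels in {-1, +1}, and scikit-learn decision trees as the base learner (the slides do not fix a particular base classifier):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # assumed base learner

def bagging_fit(X, y, B=50, seed=0):
    """Learn one classifier C_i per bootstrap sample S_i drawn from S."""
    rng = np.random.default_rng(seed)
    n = len(X)
    classifiers = []
    for _ in range(B):
        idx = rng.choice(n, size=n, replace=True)  # sample with replacement
        classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, X):
    """Plurality vote: c_bag(x) = sign((1/B) * sum_b c_b(x))."""
    mean_vote = np.mean([c.predict(X) for c in classifiers], axis=0)
    return np.where(mean_vote >= 0.0, 1, -1)  # ties broken toward +1
```

The hard vote here is the aggregate classifier written out later on "The Bagging Algorithm" slide; averaging predicted probabilities instead of hard votes gives the p_bag variant.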
The Challenge: Reducing Churn
• U.S. wireless telecom company
• Goal: discover whether the customer will churn or not during the next few months
• Churn is the response variable, such that:
  – Churn = 1 if the customer is churning
  – Churn = −1 if the customer is not churning

Opportunity for this Challenge
• Bagging and boosting: aggregating classifiers into a final rule
• Breiman (1996) found gains in accuracy by aggregating predictors built from reweighed versions of the learning set

Bagging
• Bagging = Bootstrap Aggregating
• Reweighing of the learning sets is done by drawing at random with replacement from the learning sets
• Predictors are aggregated by plurality voting

The Bagging Algorithm
• B bootstrap samples, from which we derive:
  – B classifiers $c_1, c_2, c_3, \ldots, c_B$, each mapping $x$ to $\{-1, 1\}$
  – B estimated probabilities $p_1, p_2, p_3, \ldots, p_B$, each mapping $x$ to $[0, 1]$
• The aggregate classifier becomes
  $c_{\mathrm{bag}}(x) = \mathrm{sign}\left(\frac{1}{B}\sum_{b=1}^{B} c_b(x)\right)$ or $p_{\mathrm{bag}}(x) = \frac{1}{B}\sum_{b=1}^{B} p_b(x)$

[Diagram: the initial set is weighted and drawn from with replacement; each draw yields a bootstrap sample from which a classifier is trained (Classifier 1, Classifier 2, …)]

Aggregation
[Diagram: Classifiers 1, 2, 3, …, T, each trained on the initial set, are combined through a sign function into the final rule]

Boosting

Boosting
• Invented independently by Schapire (1989) and Freund (1990)
  – Later joined forces
• Main idea: train a strong classifier by combining weak classifiers
  – Practically useful
  – Theoretically interesting

Boosting
• Idea: given a set of weak learners, run them multiple times on (reweighted) training data, then let the learned classifiers vote
• At each iteration t:
  – Weight each training example by how incorrectly it was classified
  – Learn a hypothesis h_t
  – Choose a strength for this hypothesis, α_t
• Final classifier: weighted combination of the weak learners

Learning from Weighted Data
• Sometimes not all data points are equal
  – Some data points are more equal than others
• Consider a weighted dataset
  – D(i): the weight of the i-th training example (xi, yi)
  – Interpretations:
    • The i-th training example counts as D(i) examples
    • If I were to "resample" the data, I would get more samples of "heavier" data points
• Now, in all calculations, the i-th training example counts as D(i) "examples"

Definition of Boosting
• Given training …
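To make the two interpretations of D(i) concrete, here is a small illustrative snippet (the toy labels and weights are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = np.array([1, -1, 1, 1])    # toy labels
y_pred = np.array([1, 1, 1, -1])    # some hypothesis's predictions
D = np.array([0.1, 0.2, 0.3, 0.4])  # example weights, summing to 1

# Interpretation 1: example i counts as D(i) examples in every calculation,
# e.g. the weighted error of the hypothesis:
weighted_error = np.sum(D * (y_pred != y_true))  # 0.2 + 0.4 = 0.6

# Interpretation 2: resampling -- "heavier" points are drawn more often.
resampled_idx = rng.choice(len(D), size=10, p=D)
```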
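The preview cuts off before the formal definition, so the sketch below fills in the loop described on the boosting slides with the standard AdaBoost choices: weighted error ε_t, strength α_t = ½ ln((1 − ε_t)/ε_t), and multiplicative weight updates. The decision-stump weak learner is an assumption, not something the slides specify:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=50):
    """y in {-1, +1}; returns a list of (alpha_t, h_t) pairs."""
    n = len(X)
    D = np.full(n, 1.0 / n)                      # uniform weights to start
    ensemble = []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1)  # decision stump (assumption)
        h.fit(X, y, sample_weight=D)
        pred = h.predict(X)
        eps = np.sum(D * (pred != y))            # weighted training error
        if eps == 0.0 or eps >= 0.5:             # perfect fit / weak-learner failure
            break
        alpha = 0.5 * np.log((1 - eps) / eps)    # strength of this hypothesis
        D *= np.exp(-alpha * y * pred)           # up-weight misclassified examples
        D /= D.sum()                             # renormalize to a distribution
        ensemble.append((alpha, h))
    return ensemble

def adaboost_predict(ensemble, X):
    """Final classifier: sign of the alpha-weighted vote of the weak learners."""
    score = sum(alpha * h.predict(X) for alpha, h in ensemble)
    return np.where(score >= 0.0, 1, -1)
```

Each pass does exactly the three steps listed on the boosting slide: reweight the examples by how they were classified, learn a hypothesis h_t, and pick its strength α_t; the final classifier is the weighted vote.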

