What’s learning? Point Estimation
Machine Learning – 10701/15781
Carlos Guestrin
Carnegie Mellon University
January 17th, 2007
http://www.cs.cmu.edu/~guestrin/Class/10701/

What is Machine Learning?

Machine Learning
  - Study of algorithms that improve their performance at some task with experience

Object detection
  - Example training images for each orientation (Prof. H. Schneiderman)

Text classification
  - Company home page vs Personal home page vs University home page vs …

Reading a noun (vs verb)
  - [Rustandi et al., 2005]

Modeling sensor data
  - Measure temperatures at some locations
  - Predict temperatures throughout the environment
  - [Guestrin et al. ’04]

Learning to act
  - Reinforcement learning
  - An agent
    - Makes sensor observations
    - Must select action
    - Receives rewards
      - positive for “good” states
      - negative for “bad” states
  - [Ng et al. ’05]

Growth of Machine Learning
  - Machine learning is the preferred approach to
    - Speech recognition, Natural language processing
    - Computer vision
    - Medical outcomes analysis
    - Robot control
    - …
  - This trend is accelerating
    - Improved machine learning algorithms
    - Improved data capture, networking, faster computers
    - Software too complex to write by hand
    - New sensors / IO devices
    - Demand for self-customization to user, environment

Syllabus
  - Covers a wide range of Machine Learning techniques – from basic to state-of-the-art
  - You will learn about the methods you heard about:
    - Naïve Bayes, logistic regression, nearest-neighbor, decision trees, boosting, neural nets, overfitting, regularization, dimensionality reduction, PCA, error bounds, VC dimension, SVMs, kernels, margin bounds, K-means, EM, mixture models, semi-supervised learning, HMMs, graphical models, active learning, reinforcement learning…
  - Covers algorithms, theory and applications
  - It’s going to be fun and hard work

Prerequisites
  - Probabilities
    - Distributions, densities, marginalization…
  - Basic statistics
    - Moments, typical distributions, regression…
  - Algorithms
    - Dynamic programming, basic data structures,
      complexity…
  - Programming
    - Mostly your choice of language, but Matlab will be very useful
  - We provide some background, but the class will be fast paced
  - Ability to deal with “abstract mathematical concepts”

Review Sessions
  - Very useful!
    - Review material
    - Present background
    - Answer questions
  - Thursdays, 5:30-6:50 in Wean Hall 5409
  - First recitation is tomorrow
    - Review of probabilities
  - Special recitation on Matlab
    - Jan. 24, Wed., 5:30-6:50pm, NSH 1305

Staff
  - Four Great TAs: Great resource for learning, interact with them!
    - Andy Carlson, acarlson@cs
    - Jonathan Huang, jch1@cs
    - Purna Sarkar, psarkar@cs
    - Brian Ziebart, bziebart@cs
  - Administrative Assistant
    - Monica Hopes, x8-5527, meh@cs

First Point of Contact for HWs
  - To facilitate interaction, a TA will be assigned to each homework question; this will be your “first point of contact” for that question
    - But, you can always ask any of us
  - For e-mailing instructors, always use:
    - [email protected]
  - For announcements, subscribe to:
    - 10701-announce@cs
    - https://mailman.srv.cs.cmu.edu/mailman/listinfo/10701-announce

Text Books
  - Required Textbook:
    - Pattern Recognition and Machine Learning; Chris Bishop
  - Optional Books:
    - Machine Learning; Tom Mitchell
    - The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani, Jerome Friedman
    - Information Theory, Inference, and Learning Algorithms; David MacKay

Grading
  - 5 homeworks (30%)
    - First one goes out 1/24
    - Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early
  - Final project (20%)
    - Details out Feb 26th
  - Midterm (20%)
    - March 7th in class
  - Final (30%)
    - May 15th, 1-4 p.m.

Homeworks
  - Homeworks are hard, start early
  - Due at the beginning of class
  - 3 late days for the semester
  - After late days are used up:
    - Half credit within 48 hours
    - Zero credit after 48 hours
  - All homeworks must be handed in, even for zero credit
  - Late homeworks are handed in to Monica Hopes, WEH 4619
  - Collaboration
    - You may discuss the questions
    - Each student writes
      their own answers
    - Write on your homework the names of anyone with whom you collaborate

Sitting in & Auditing the Class
  - Due to new departmental rules, every student who wants to sit in the class (not take it for credit) must register officially for auditing
  - To satisfy the auditing requirement, you must either:
    - Do *two* homeworks, and get at least 75% of the points in each; or
    - Take the final, and get at least 50% of the points; or
    - Do a class project and do *one* homework, and get at least 75% of the points in the homework
      - For the project, you only need to submit the project proposal and present the poster, and get at least 80% of the points in the poster
  - Please send us an email saying that you will be auditing the class and what you plan to do
  - If you are not a student and want to sit in the class, please get authorization from the instructor

Enjoy!
  - ML is becoming ubiquitous in science, engineering and beyond
  - This class should give you the basic foundation for applying ML and developing new methods
  - The fun begins…

Your first consulting job
  - A billionaire from the suburbs of Seattle asks you a question:
    - He says: I have a thumbtack. If I flip it, what’s the probability it will fall with the nail up?
    - You say: Please flip it a few times:
    - You say: The probability is:
    - He says: Why???
    - You say: Because…

Thumbtack – Binomial Distribution
  - P(Heads) = θ, P(Tails) = 1 − θ
  - Flips are i.i.d.:
    - Independent events
    - Identically distributed according to Binomial distribution
  - Sequence D of α_H Heads and α_T Tails:
      $P(D \mid \theta) = \theta^{\alpha_H} (1 - \theta)^{\alpha_T}$

Maximum Likelihood Estimation
  - Data: Observed set D of α_H Heads and α_T Tails
  - Hypothesis: Binomial distribution
  - Learning θ is an optimization problem
    - What’s the objective function?
  - MLE: Choose θ that maximizes the probability of observed data:
      $\hat{\theta} = \arg\max_{\theta} P(D \mid \theta) = \arg\max_{\theta} \ln P(D \mid \theta)$

Your first learning algorithm
  - Set derivative to zero:
      $\frac{d}{d\theta} \ln P(D \mid \theta) = \frac{\alpha_H}{\theta} - \frac{\alpha_T}{1 - \theta} = 0 \;\Rightarrow\; \hat{\theta} = \frac{\alpha_H}{\alpha_H + \alpha_T}$

How many flips do I need?
  - Billionaire says: I flipped 3 heads and 2 tails.
  - You say: θ = 3/5, I can prove it!
  - He says: What if I flipped 30 heads and 20 tails?
  - You say: Same answer, I can prove it!
  - He says: What’s better?
  - You say: Hmm… The more the merrier???
  - He says: Is this why I am paying
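
The thumbtack estimate from the slides above can be checked numerically. The following is a minimal Python sketch, not code from the lecture: the function names (`log_likelihood`, `mle_theta`, `grid_argmax`) and the grid resolution are assumptions made for illustration. It compares the closed-form estimate, heads divided by total flips, against a brute-force search over the log-likelihood, for both data sets mentioned in the dialogue (3 heads and 2 tails, 30 heads and 20 tails).

```python
import math


def log_likelihood(theta, n_heads, n_tails):
    """Log-probability of one i.i.d. sequence with n_heads heads and n_tails tails."""
    return n_heads * math.log(theta) + n_tails * math.log(1.0 - theta)


def mle_theta(n_heads, n_tails):
    """Closed-form maximizer: the fraction of flips that came up heads."""
    return n_heads / (n_heads + n_tails)


def grid_argmax(n_heads, n_tails, steps=999):
    """Brute-force check: best theta on an evenly spaced grid strictly inside (0, 1).

    The grid size is an illustrative assumption, not part of the lecture.
    """
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, n_heads, n_tails))


# The two data sets from the dialogue.
small = mle_theta(3, 2)
large = mle_theta(30, 20)
print("closed form:", small, large)
print("grid search:", grid_argmax(3, 2), grid_argmax(30, 20))
```

Both data sets give the same point estimate, which is the "same answer" in the dialogue. The point estimate by itself does not say how much to trust each answer, and that is the question the billionaire raises next.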