What’s learning?
Point Estimation

Machine Learning 10701/15781
Carlos Guestrin
Carnegie Mellon University
September 10th, 2007
http://www.cs.cmu.edu/~guestrin/Class/10701/
©2005-2007 Carlos Guestrin

What is Machine Learning?

Machine Learning
- Study of algorithms that improve their performance at some task with experience

Object detection
- Example training images for each orientation (Prof. H. Schneiderman)

Text classification
- Company home page vs. personal home page vs. university home page vs. …

Reading a noun (vs. verb)
- [Rustandi et al., 2005]

Modeling sensor data
- Measure temperatures at some locations
- Predict temperatures throughout the environment
- [Guestrin et al. ’04]

Learning to act
- Reinforcement learning
- An agent
  - Makes sensor observations
  - Must select action
  - Receives rewards
    - positive for “good” states
    - negative for “bad” states
- [Ng et al. ’05]

Growth of Machine Learning
- Machine learning is the preferred approach to
  - Speech recognition, natural language processing
  - Computer vision
  - Medical outcomes analysis
  - Robot control
  - Computational biology
  - Sensor networks
  - …
- This trend is accelerating
  - Improved machine learning algorithms
  - Improved data capture, networking, faster computers
  - Software too complex to write by hand
  - New sensors / IO devices
  - Demand for self-customization to user, environment

Syllabus
- Covers a wide range of Machine Learning techniques, from basic to state-of-the-art
- You will learn about the methods you heard about:
  - Naïve Bayes, logistic regression, nearest-neighbor, decision trees, boosting, neural nets, overfitting, regularization, dimensionality reduction, PCA, error bounds, VC dimension, SVMs, kernels, margin bounds, K-means, EM, mixture models, semi-supervised learning, HMMs, graphical models, active learning, reinforcement learning…
- Covers algorithms, theory and applications
- It’s going to be fun and hard work

Prerequisites
- Probabilities
  - Distributions, densities, marginalization…
- Basic statistics
  - Moments, typical distributions, regression…
- Algorithms
  - Dynamic programming, basic data structures, complexity…
- Programming
  - Mostly your choice of language, but Matlab will be very useful
- We provide some background, but the class will be fast paced
- Ability to deal with “abstract mathematical concepts”

Recitations
- Very useful!
  - Review material
  - Present background
  - Answer questions
- Thursdays, 5:00-6:20 in Wean Hall 5409
- Special recitation 1: tomorrow, Wean 5409, 5:00-6:20
  - Review of probabilities
- Special recitation 2 on Matlab
  - Tuesday, Sept. 18th, 4:30-5:50pm, NSH 3002

Staff
- Four Great TAs: Great resource for learning, interact with them!
  - Joseph Gonzalez, Wean 5117, x8-3046, jegonzal@cs, Office hours: Tuesdays 7-9pm
  - Steve Hanneke, Doherty 4301H, x8-7375, shanneke@cs, Office hours: Fridays 1-3pm
  - Jingrui He, Wean 8102, x8-1299, jingruih@cs, Office hours: Wednesdays 11-1pm
  - Sue Ann Hong, Wean 4112, x8-3047, sahong@cs, Office hours: Tuesdays 3-5pm
- Administrative Assistant
  - Monica Hopes, x8-5527, meh@cs

First Point of Contact for HWs
- To facilitate interaction, a TA will be assigned to each homework question. This will be your “first point of contact” for this question
  - But, you can always ask any of us
- For e-mailing instructors, always use:
  - [email protected]
- For announcements, subscribe to:
  - 10701-announce@cs
  - https://mailman.srv.cs.cmu.edu/mailman/listinfo/10701-announce

Text Books
- Required Textbook:
  - Pattern Recognition and Machine Learning; Chris Bishop
- Optional Books:
  - Machine Learning; Tom Mitchell
  - The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani, Jerome Friedman
  - Information Theory, Inference, and Learning Algorithms; David MacKay

Grading
- 5 homeworks (35%)
  - First one goes out 9/12
  - Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early
- Final project (25%)
  - Details out around Oct. 1st
  - Projects done individually, or groups of two students
- Midterm (15%)
  - Thu., Oct 25, 5-6:30pm, location: MM A14
- Final (25%)
  - TBD by registrar

Homeworks
- Homeworks are hard, start early
- Due at the beginning of class
- 3 late days for the semester
- After late days are used up:
  - Half credit within 48 hours
  - Zero credit after 48 hours
- All homeworks must be handed in, even for zero credit
- Late homeworks handed in to Monica Hopes, WEH 4619
- Collaboration
  - You may discuss the questions
  - Each student writes their own answers
  - Write on your homework anyone with whom you collaborate
  - Each student must write their own code for the programming part
  - Please don’t search for answers on the web, Google, previous years’ homeworks, etc.
    - Please ask us if you are not sure if you can use a particular reference

Sitting in & Auditing the Class
- Due to new departmental rules, every student who wants to sit in the class (not take it for credit) must register officially for auditing
- To satisfy the auditing requirement, you must either:
  - Do *two* homeworks, and get at least 75% of the points in each; or
  - Take the final, and get at least 50% of the points; or
  - Do a class project and do *one* homework, and get at least 75% of the points in the homework
    - Only need to submit project proposal and present poster, and get at least 80% of the points in the poster
- Please send us an email saying that you will be auditing the class and what you plan to do
- If you are not a student and want to sit in the class, please get authorization from the instructor

Enjoy!
- ML is becoming ubiquitous in science, engineering and beyond
- This class should give you the basic foundation for applying ML and developing new methods
- The fun begins…
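
As a quick check on the grading breakdown (homeworks 35%, final project 25%, midterm 15%, final 25%), the weights can be combined into a single course score. The sketch below is only illustrative: the slides give the weights but not how the final grade is actually computed, so the weighted average, the 0-100 scale, and the names `WEIGHTS` and `course_score` are all assumptions.

```python
# Weights as stated on the Grading slide. How they are combined into a
# final grade is an assumption here (a plain weighted average).
WEIGHTS = {
    "homeworks": 0.35,
    "project": 0.25,
    "midterm": 0.15,
    "final": 0.25,
}


def course_score(scores):
    """Return the weighted average of component scores (each on a 0-100 scale).

    `scores` is a hypothetical dict keyed by the same names as WEIGHTS.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    example = {"homeworks": 80, "project": 90, "midterm": 70, "final": 60}
    print(round(course_score(example), 2))
```

Under this reading, the homeworks carry the largest single share of the score, which fits the slides' repeated advice to start them early.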