Machine Learning: Intro
Aarti Singh
Machine Learning 10-701/15-781
Sept 8, 2010

What is Machine Learning?
  You tell me! This class is going to be interactive.
  Study of algorithms that improve their performance at some task with experience.
  Learning algorithm: (experience), (task), (performance)

From Data to Understanding

Machine Learning in Action
  Decoding thoughts from brain scans (Rob a bank ...)
  Stock market prediction (figure axes: Y, X)
  Document classification: Sports, Science, News
  Spam filtering: spam vs. not spam
  Cars navigating on their own: Boss, the self-driving SUV, 1st place in the DARPA Urban Challenge (photo courtesy of Tartan Racing)
  The best helicopter pilot is now a computer: it runs a program that learns how to fly and make acrobatic maneuvers by itself; no taped instructions, joysticks, or things like that (http://heli.stanford.edu)
  Robot assistant (http://stair.stanford.edu)
  Many, many more: speech recognition, natural language processing, computer vision, web forensics, medical outcomes analysis, computational biology, sensor networks, social networks
  ML students and postdocs at the G-20 Pittsburgh Summit 2009 (courtesy A. Gretton)

ML is trending
  Wide applicability
  Very large-scale complex systems
    Internet (billions of nodes), sensor networks (new multi-modal sensing devices), genetics (human genome)
  Huge multi-dimensional data sets
    30,000 genes x 10,000 drugs x 100 species x ...
  Software too complex to write by hand
  Improved machine learning algorithms
  Improved data capture (terabytes, petabytes of data), networking, faster computers
  Demand for self-customization to user, environment

ML has a long way to go
  Speech recognition gone awry

What this course is about
  Covers a wide range
  of Machine Learning techniques, from basic to state of the art
  You will learn about the methods you heard about:
    Naïve Bayes, logistic regression, nearest-neighbor, decision trees, boosting, neural nets, overfitting, regularization, dimensionality reduction, PCA, error bounds, VC dimension, SVMs, kernels, margin bounds, K-means, EM, mixture models, semi-supervised learning, HMMs, graphical models, active learning, reinforcement learning
  Covers algorithms, theory, and applications
  It's going to be fun and hard work

Machine Learning Tasks
  Broad categories:
    Supervised learning: classification, regression
    Unsupervised learning: density estimation, clustering, dimensionality reduction
    Semi-supervised learning
    Active learning
    Reinforcement learning
    Many more ...

Supervised Learning
  Feature space: words in a document; market information up to time t
  Label space: "Sports", "News", "Science"; share price "$24.50"
  Task: predict the label from the features

Supervised Learning: Classification
  Feature space: words in a document; cell properties
  Label space: "Sports", "News", "Science"; "Anemic cell", "Healthy cell"
  Discrete labels

Supervised Learning: Regression
  Feature space: market information up to time t; (gene, drug)
  Label space: share price "$24.50"; expression level "0.01"
  Continuous labels

Supervised Learning problems
  For each problem: what are the features, what are the labels, and is it classification or regression?
    Weather prediction (temperature)
    Face detection
    Environmental mapping
    Robotic control

Unsupervised Learning
  Aka "learning without a teacher"
  Feature space: words in a document
  Task: learn the word distribution (the probability of a word)

Unsupervised Learning: Density Estimation
  Population density

Unsupervised Learning: Clustering
  Group similar things, e.g. images [Goldberger et al.]
  Clustering web search results

Unsupervised Learning: Embedding (Dimensionality Reduction) [Saul & Roweis '03]
  Images
  have thousands or millions of pixels. Can we give each image a coordinate, such that similar images are near each other?

Unsupervised Learning: Embedding (Dimensionality Reduction) of words [Joseph Turian]

Machine Learning Tasks
  Broad categories:
    Supervised learning: classification, regression
    Unsupervised learning: density estimation, clustering, dimensionality reduction
    Semi-supervised learning
    Active learning
    Reinforcement learning
    Many more ...

Machine Learning Class webpage
  http://www.cs.cmu.edu/~aarti/Class/10701/index.html

Auditing
  To satisfy the auditing requirement, you must either:
    Do two homeworks, and get at least 75% of the points in each; or
    Take the final, and get at least 50% of the points; or
    Do a class project. You only need to submit the project proposal and present the poster, and get at least 80 points on the poster.
  Please send the instructors an email saying that you will be auditing the class and what you plan to do.

Prerequisites
  Probabilities: distributions, densities, marginalization
  Basic statistics: moments, typical distributions, regression
  Algorithms: dynamic programming, basic data structures, complexity
  Programming: mostly your choice of language, but Matlab will be very useful
  We provide some background, but the class will be fast paced
  Ability to deal with "abstract mathematical concepts"

Recitations
  Strongly recommended
    Brush up on prerequisites
    Review material (difficult topics, clear misunderstandings, extra new topics)
    Ask questions
  Basics of Probability: Thursday, Sept 9 (tomorrow), NSH 3305, Rob Hall

Textbooks
  Recommended textbook:
    Pattern Recognition and Machine Learning; Chris Bishop
  Secondary textbooks:
    The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani, Jerome Friedman (see online link)
    Machine Learning; Tom Mitchell
    Information Theory, Inference, and Learning Algorithms; David MacKay

Grading
  5 homeworks (35%)
    First one goes out next week (watch email)
    Start early!
  Final project (25%)
    Details out around Sept 30th
    Projects done individually, or in groups of two students
  Midterm (20%)
    Wed, Oct 20, in class
  Final exam (20%)
    TBD by registrar

Homeworks
  Homeworks are hard, start early
  Due at the beginning of class
  2 late days for the semester
  After late days are used up:
    Half credit within 48 hours
    Zero credit after 48 hours
  At least 4 homeworks must be handed
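
The grading weights and the late-homework rule above are simple arithmetic, so a minimal Python sketch may help make them concrete. This is only an illustration, not the course's official grading procedure. The names `WEIGHTS`, `late_credit`, and `course_score` are hypothetical. The sketch assumes that each component is scored as a percentage, that a homework exactly 48 hours late still counts as "within 48 hours", and that the student's 2 late days are already used up.

```python
# Component weights from the Grading slide, as percent of the course grade.
WEIGHTS = {"homeworks": 35, "project": 25, "midterm": 20, "final": 20}


def late_credit(hours_late: float) -> float:
    """Credit multiplier for a homework once the late days are used up."""
    if hours_late <= 0:
        return 1.0   # handed in on time: full credit
    if hours_late <= 48:
        return 0.5   # assumption: exactly 48 hours still counts as "within"
    return 0.0       # more than 48 hours late: zero credit


def course_score(scores: dict[str, float]) -> float:
    """Weighted course score; each component score is a percentage (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0
```

For example, `course_score({"homeworks": 90, "project": 80, "midterm": 70, "final": 60})` combines the four components using the 35/25/20/20 split, and `late_credit(24)` gives the multiplier for a homework handed in one day late after the late days are gone.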