ICS 235: Machine Learning Methods
Department of Information and Computer Sciences
University of Hawaiʻi at Mānoa
Kyungim Baek

Reminder: Homework assignment 1
Due: 11:55 PM, Wednesday, September 18
Upload your notebook file (.ipynb) to your Drop Box folder in Laulima.
Do not upload the data file.
Do not make any subfolders in your Drop Box.
Do NOT modify the provided code.
K-NN classifier implementation: Part 2, Question 3
Test your own implementation from Question 1 of Part 3
Use the implementation provided in scikit-learn, Part 2

Lecture 4
Introduction to supervised learning with nearest neighbors
Binary classification metrics: Accuracy, Precision, Recall, Specificity, F1 score

Recall: Supervised learning
The dataset contains input-output pairs $(x_i, y_i)$.
$x_i$: $d$-dimensional input, or feature vector
$y_i$: desired output
Learn $f: x_i \mapsto y_i$
[Diagram: X -> Model / Algorithm -> Y]
The learned model $f$ is used to predict target values of new examples.

Food delivery problem
Abby loves to order food online for home and work.
She wants to predict whether an order will be on time or late.
She logged the previous 45 orders.
Columns: BadWeather, RushHour, UrbanAddress, Late, MilesFromRestaurant
[Table rows flattened in extraction: 10 78 14 58 82 1 0 1 1 0 5 7 2 4 2 7 8 0 1 0 1 0]
Two classes: 1 (late) and 0 (on time)
Example from Amazon Machine Learning Univ.: https://github.com/aws-samples/aws-machine-learning-university-accelerated-tab/tree/master/slides
Slide credit: P. Sadowski

Food delivery problem
What class does the new point belong to?
[Figure: scatter plot with axes Bad Weather and Miles from Restaurant; legend: Late, On time]

Food delivery problem
What class does the new point belong to? Look at the closest data point.
Calculate the distances from the new point to all data points.
Find the nearest neighbor.
Predict that class: $f(x) = y_i$ where $i = \arg\min_j \lVert x_j - x \rVert$
[Figures: two scatter plots with axes Bad Weather and Miles from Restaurant; legend: Late, On time]
Adapted slide credit: P. Sadowski

Food delivery problem: K Nearest Neighbors
K-NN predicts new data points based on the K most similar records in a dataset.
What class does the new point belong to? Look at the K closest data points. For example, let K = 3.
Calculate the distances from the new point to all data points.
Find the K nearest neighbors.
Predict the majority class.

K-NN classifier
Memorize the training data.
Predict the class label based on the K nearest training examples.
How long does it take to train a NN classification model? No time! Just remember all the data.
What is the prediction time of a NN classifier for a single sample? It scales with the size of the training data.

Questions
1. Assume that it takes T amount of time to train a K-NN classifier with n samples. How long would it take to train the same model with 2n samples?
2. Assume that it takes T amount of time for a K-NN classifier to predict the class label for a test sample given n training samples. How long would it take for this classifier to make a prediction with 2n training samples?

Limitations of K-NN classifier
Considers every training point in the space to make a decision: computing the distance between every pair of points can be slow.
Curse of dimensionality: points tend to be far apart as they are taken to higher dimensions.
Figure credit: T. Hastie

Metrics for binary classification
                   Predicted Positive     Predicted Negative
Actual Positive    True Positive (18)     False Negative (1)
Actual Negative    False Positive (3)     True Negative (15)
True Positive: predicted Positive when the actual is Positive
False Positive: predicted Positive when the actual is Negative
False Negative: predicted Negative when the actual is Positive
True Negative: predicted Negative when the actual is Negative
Positive: star. Negative: not star.
Slide credit: Amazon Machine Learning Univ.

Metrics for binary classification: Accuracy
Accuracy: the percentage (ratio) of cases classified correctly.
(Same confusion matrix as above: 18, 1, 3, 15.)

Metrics for binary classification: Accuracy
                   Predicted Positive     Predicted Negative
Actual Positive    True Positive (2)      False Negative (2)
Actual Negative    False Positive (8)     True Negative (88)
High Accuracy Paradox: accuracy is misleading when dealing with imbalanced datasets, e.g., few True Positives (the rare class) and many True Negatives (the dominant class).
Accuracy is high even when there are few True Positives.

Metrics for binary classification: Precision
Precision: accuracy of a predicted positive outcome.
(Confusion matrix counts on this slide: 2, 2, 8, 88.)

Metrics for binary classification: Recall
Recall: measures the model's ability to predict a positive outcome, a.k.a. sensitivity or the true positive rate (TPR).
(Confusion matrix counts on this slide: 2, 2, 8, 88.)

What metric do we care about?
Of course we would like to have high precision and high recall, but the balance depends on our domain.
Example: cancer screening test
Assume positive means the person has cancer and negative means they don't.
Recall (accuracy of detecting cancer when the person actually has cancer) is really important.
Precision (accuracy of detected cancer) is less critical.
[Confusion matrix: Prediction (Pos, Neg) against True State …]
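The two ideas in this lecture, K-NN majority voting and the confusion-matrix metrics, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the homework's provided code: the function names knn_predict and binary_metrics are hypothetical, the six toy orders and their feature order (miles from restaurant, bad weather) are made up rather than taken from Abby's log, and the counts 2, 2, 8, 88 are read from the flattened accuracy-paradox slide on the assumption that they are TP, FN, FP, TN in that order.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # "Training" is only storing train_X and train_y. All the work happens here,
    # so prediction cost grows with the number of stored training points.
    neighbors = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(tp, fn, fp, tn):
    """Compute the lecture's metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (at least one actual positive,
    one actual negative, and one predicted positive).
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # sensitivity, true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical toy orders: (miles_from_restaurant, bad_weather).
# Labels: 1 = late, 0 = on time. Features are left unscaled for simplicity.
train_X = [(1.0, 0), (2.0, 0), (2.5, 1), (7.0, 1), (8.0, 0), (9.0, 1)]
train_y = [0, 0, 0, 1, 1, 1]

far_order = (6.5, 1)
near_order = (2.0, 1)
print("K=3, far order:", knn_predict(train_X, train_y, far_order, k=3))
print("K=1, near order:", knn_predict(train_X, train_y, near_order, k=1))

# Counts assumed from the accuracy-paradox slide: TP=2, FN=2, FP=8, TN=88.
paradox = binary_metrics(tp=2, fn=2, fp=8, tn=88)
for name, value in paradox.items():
    print(f"{name}: {value:.3f}")
```

Setting k=1 gives the single nearest-neighbor rule from the earlier slide. With the assumed paradox counts, accuracy comes out high while precision and recall stay low, which is the imbalance effect the slide describes. For the homework itself, the scikit-learn implementation named in the reminder is the one to use.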


UH Manoa ICS 235 - Lecture 4
