ILLINOIS CS 446 - 090517.1 (5 pages)


CS 446 Machine Learning, Fall 2017
Lecture 3: Overfitting, Bayes Optimal, and Naive Bayes
Lecturer: Sanmi Koyejo    Scribe: Shirdon Gorse    Sep 5th, 2017

Today's Objectives:
- Generalization
- Bayes Optimal Classifier
- Naive Bayes

Generalization

Generalization Error. Generalization is the ability of a learned model or classifier to fit unseen data instances. Generalization error is a measure of how accurately an algorithm is able to predict outcome values for previously unseen data.

Suppose we have a classifier h_n, where an instance x relates to a label y. This classifier h_n depends on the dataset D_n = {(x_i, y_i)}, where each (x_i, y_i) is drawn from a distribution P. The generalization error is

    G_n = R(h_n, P) - R(h_n, D_n)

In other words, the generalization error equals the risk of the classifier h_n over the distribution P minus the risk of h_n over the data set D_n. As G_n goes to 0, we say that the classifier h_n generalizes.

In practice we never see the full distribution P, so there are two tricks to approximate R(h_n, P):

1. Train/test split, where we evaluate the out-of-sample performance:

       \hat{G}_n = R(h_n, D_TEST) - R(h_n, D_TRAIN)

2. Cross-validation. The goal is to hold out a dataset to test the model during the training phase, in order to limit problems like overfitting and to give insight into how the model will generalize to an independent dataset:

       \hat{G}_n = E_D [ R(h_n, TestingDataset) - R(h_n, TrainingDataset) ]

   where E_D is the average over all of the held-out datasets. The reason we average is to reduce the variance of the generalization-error estimate: D_n is random, h_n is random since it depends on D_n, and therefore G_n is a random variable.

Since you get a new classifier h_n for every cross-validation fold, which do you pick?
- Majority vote
- Re-train on the entire data set
- Just pick one

Is Generalization Enough?

No. But why is it not good enough? Example: suppose the random variable X lies in the range [-B, B], the label y = sign(X), and X is uniformly distributed across [-B, B].
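As an aside before working the example: the two gap estimates discussed above (train/test split and k-fold cross-validation) can be sketched in code. This is a minimal illustration using a toy 1-D nearest-centroid classifier; every helper name here (make_data, fit_centroids, risk) is illustrative and not from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Toy data: two overlapping 1-D classes with labels -1 and +1."""
    y = rng.choice([-1, 1], size=n)
    x = y + rng.normal(scale=1.5, size=n)
    return x, y

def fit_centroids(x, y):
    """Train: one centroid per class; predict whichever centroid is nearer."""
    mu_neg, mu_pos = x[y == -1].mean(), x[y == 1].mean()
    return lambda xs: np.where(np.abs(xs - mu_pos) < np.abs(xs - mu_neg), 1, -1)

def risk(h, x, y):
    """Empirical 0-1 risk R(h, D): fraction of misclassified points."""
    return np.mean(h(x) != y)

# --- Train/test split: G_hat = R(h, D_TEST) - R(h, D_TRAIN) ---
x, y = make_data(400)
x_tr, y_tr, x_te, y_te = x[:300], y[:300], x[300:], y[300:]
h = fit_centroids(x_tr, y_tr)
gap_split = risk(h, x_te, y_te) - risk(h, x_tr, y_tr)

# --- k-fold cross-validation: average the gap over the folds ---
k = 5
folds = np.array_split(rng.permutation(len(x)), k)
gaps = []
for fold in folds:
    mask = np.ones(len(x), dtype=bool)
    mask[fold] = False                       # train on everything but this fold
    h_fold = fit_centroids(x[mask], y[mask])
    gaps.append(risk(h_fold, x[fold], y[fold]) - risk(h_fold, x[mask], y[mask]))
gap_cv = float(np.mean(gaps))  # averaging over folds reduces the variance
```

Averaging over folds is exactly the variance-reduction point above: each fold yields a noisy gap estimate, and their mean is a steadier estimator of the generalization gap.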
Consider the training accuracy when our classifier is the constant function h_n ≡ 1:

    TrainingAccuracy = (1/N) \sum_{n=1}^{N} 1[y_n = h(x_n)]
                     = (1/N) \sum_{n=1}^{N} 1[y_n = 1]        (since h(x) = 1 for all x)

With that logic, we know that the training accuracy equals the proportion of +1's in our data. The generalization accuracy is Acc(h_n, P) = P(y = 1) = 1/2. Our classifier h_n ≡ 1 has good generalization error, since Acc(h_n, P) - Acc(h_n, D_n) gets small. But we know the best possible accuracy is 1, if we set the classifier h(X) = sign(X).

Goal of Supervised ML

The goal of supervised machine learning is to find
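Returning to the constant-classifier example above, a quick simulation confirms the point that generalizing well is not the same as predicting well. The choice of B and the sample size here are arbitrary, not from the notes.

```python
import numpy as np

# X ~ Uniform[-B, B], y = sign(X).  The constant classifier h(x) = 1
# has accuracy ~ 1/2 both in and out of sample (so it "generalizes"),
# while h(x) = sign(x) achieves the best possible accuracy of 1.
rng = np.random.default_rng(1)
B = 2.0
x = rng.uniform(-B, B, size=100_000)
y = np.sign(x)

acc_const = np.mean(y == 1.0)        # accuracy of h ≡ 1: proportion of +1 labels, ~ 0.5
acc_sign = np.mean(np.sign(x) == y)  # accuracy of h(x) = sign(x): exactly 1.0
```

The small gap between the constant classifier's train and test accuracies is exactly why a small generalization gap alone says nothing about whether the classifier is any good.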


