# ILLINOIS CS 446 - 090517.2 (4 pages)


## 090517.2



Pages: 4

School: University of Illinois - Urbana

Course: CS 446 - Machine Learning

CS446: Machine Learning, Fall 2017

Lecture 3: Bias-Variance Tradeoff, Overfitting, Bayes optimal, Probabilistic Models, Naive Bayes

Lecturer: Sanmi Koyejo. Scribe: Xinchen Pan. Sep. 5th, 2017

### 1 Bias-Variance Trade-off

Suppose that we have a data matrix $x$ and a set of values $y$ associated with it. We model the relationship with a function $f$:

$$y = f(x) + \epsilon, \qquad E[\epsilon] = 0, \quad Var(\epsilon) = \sigma^2.$$

Because of the noise term, the predicted values will not be exactly the same as the true values. We denote the prediction by $\hat{f}(x)$ and define the mean squared error as $E[(y - \hat{f}(x))^2]$. We then decompose it:

$$
\begin{aligned}
E[(y - \hat{f}(x))^2] &= E[y^2 + \hat{f}(x)^2 - 2y\hat{f}(x)] \\
&= E[y^2] + E[\hat{f}(x)^2] - 2E[y\hat{f}(x)] \\
&= Var(y) + E[y]^2 + Var(\hat{f}(x)) + E[\hat{f}(x)]^2 - 2E[y\hat{f}(x)] \\
&= Var(y) + Var(\hat{f}(x)) + (E[y - \hat{f}(x)])^2 \\
&= \sigma^2 + \text{Variance} + \text{bias}^2
\end{aligned}
$$

The fourth line uses $E[y\hat{f}(x)] = E[y]\,E[\hat{f}(x)]$, which holds when the noise $\epsilon$ is independent of the prediction $\hat{f}(x)$. The last line uses $Var(y) = \sigma^2$ for a fixed $x$.

Thus

$$\text{Variance} = E[\hat{f}(x)^2] - E[\hat{f}(x)]^2$$

$$\text{Bias} = E[y - \hat{f}(x)]$$

**Example.** For a binary classification problem, let every prediction be 1:

$$\hat{f}(x) = 1$$

Then the variance is 0, because the prediction never changes with the training data. However, the bias is large: the prediction ignores the input, so it is no better than a fixed guess and cannot be accurate.

### 2 Overfitting and Underfitting

#### 2.1 Overfitting

Overfitting occurs when we have a low training error but a high testing error. In other words, the learning algorithm does not generalize.

Denote $h_n$ as a learning algorithm, $D_{\text{train}}$ as the training dataset, $D_{\text{test}}$ as the testing dataset, and $R$ as a risk function. When overfitting, we have

$$R(h_n, D_{\text{train}}) \ll R(h_n, D_{\text{test}})$$

That is, the training error is much smaller than the testing error.

##### 2.1.1 Example

Consider a $k$-nearest-neighbour classifier with $k = 1$. The training error is 0, because the nearest neighbour of any training point within the training set is the point itself. A point in the testing set, however, is not part of the training set and is not used when computing distances, so its nearest neighbour is always a different point and the prediction can be wrong.

#### 2.2 Underfitting

Underfitting occurs when we have both a high training error and a high testing error. It means that the learning algorithm cannot represent the ...
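The 1-nearest-neighbour example from Section 2.1.1 can be reproduced in a few lines. The following is a minimal sketch, not code from the lecture: the toy 1-D dataset, the 20% label-noise rate, and the helper names `sample`, `nearest_label`, and `error_rate` are all assumptions made for illustration. It computes the empirical risk (misclassification rate) of a 1-NN classifier on $D_{\text{train}}$ and on a separate $D_{\text{test}}$.

```python
import random


def sample(n, rng, flip=0.2):
    """Hypothetical toy data: label is 1 if x > 0.5, then flipped with probability `flip`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < flip:
            y = 1 - y  # label noise
        data.append((x, y))
    return data


def nearest_label(train, x):
    """Return the label of the training point closest to x (1-NN)."""
    _, label = min(train, key=lambda point: abs(point[0] - x))
    return label


def error_rate(train, data):
    """Fraction of points in `data` misclassified by 1-NN using `train` as neighbours."""
    wrong = sum(nearest_label(train, x) != y for x, y in data)
    return wrong / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
d_train = sample(200, rng)
d_test = sample(200, rng)

# Each training point is its own nearest neighbour, so the training error is 0.
train_error = error_rate(d_train, d_train)
# Test points are not in the training set, so noisy neighbours cause mistakes.
test_error = error_rate(d_train, d_test)

print("training error:", train_error)
print("testing error:", test_error)
```

The training error is zero by construction, while the testing error is not, which is the gap $R(h_n, D_{\text{train}}) \ll R(h_n, D_{\text{test}})$ used above to characterize overfitting. The size of the gap depends on the assumed noise rate.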
