CMU CS 10701 - Computational Learning Theory and Model Selection - D1337862

Home> Schools> Carnegie Mellon University> Computer Science (CS) > CS 10701> Computational Learning Theory and Model Selection

CMU CS 10701 - Computational Learning Theory and Model Selection

School name Carnegie Mellon University

Course Cs 10701- Introduction to Machine Learning

Pages 19

Download Save

Unformatted text preview:

Computational Learning Theory and Model Selection Bin Zhao binzhao andrew cmu edu Outline True vs Empirical Risk Learning Theory The case of finite H The case of infinite H VC dimension Outline True vs Empirical Risk Learning Theory The case of finite H The case of infinite H VC dimension Slide from Aarti Singh Overfitting If we allow very complicated predictors we could overfit the training data Model Space with Increasing Complexity Nearest Neighbor classifiers with varying neighborhood sizes k 1 2 3 Small neighborhood Higher complexity Decision Trees with depth k or with k leaves Higher depth More leaves Higher complexity Regression with polynomials of order k 0 1 2 Higher degree Higher complexity Effect of Model Complexity Behavior of True Risk Behavior of True Risk Outline True vs Empirical Risk Learning Theory The case of finite H The case of infinite H VC dimension Preliminaries Hypothesis Class H We define the hypothesis class H used by a learning algorithm to be the set of all classifiers considered by it Linear classification classifier whose decision boundary is linear Neural networks classifier representable by some NN architecture remember HW 1 question on NN Empirical Risk Minimization Preliminaries Using just these two lemmas we will be able to prove some of the deepest and most important results in learning theory Finite Hypothesis Space Infinite Hypothesis Space Many hypothesis class including any parameterized by real numbers like linear classification actually contain an infinite number of functions Recall for finite hypothesis space VC H is like a substitute for k H Vapnik Chervonenkis Dimension A measure of the power or the complexity of the hypothesis space Higher VC dimension implies a more expressive hypothesis space Shattering A set of N points is shattered if there exists a hypothesis that is consistent with every classification of the N points VC Dimension VC Dimension of Linear Classifier d 2 Yes find a set of data points that can be shattered d 3 Yes d 4 No need to show there does not exist any data set with 4 points that can be shattered VC Dimension Key Thank you

View Full Document

CMU CS 10701 - Computational Learning Theory and Model Selection

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

CMU CS 10701 - Computational Learning Theory and Model Selection

Sign up for free to view:

Please select your school