CMU CS 10601 - Recitation - D275233

School name Carnegie Mellon University

Course CS 10601 - Introduction to Machine Learning

Pages 21

This preview shows page 1-2-20-21 out of 21 pages.


10601 Machine Learning
October 12, 2011
Mladen Kolar

Outline
• Bias – Variance tradeoff
• Linear regression
• Bayes networks

BIAS – VARIANCE TRADEOFF

Applet for least squares
http://mste.illinois.edu/users/exner/java.f/leastsquares/

Decomposition of error
Assume
$Y = f(x) + \epsilon$
Generalization error
$\mathrm{err}(x_0) = E[(Y - \hat{f}(X))^2 \mid X = x_0]$
$\mathrm{err}(x_0) = \sigma^2 + (E_D[\hat{f}(x_0) - f(x_0)])^2 + \mathrm{Var}_D(\hat{f}(x_0))$
The three terms are, in order: unavoidable error, bias (squared), variance.

Bias
• Suppose that we have multiple datasets with n samples
• On each data set we learn $\hat{f}(x)$
• On average (over different datasets) we learn $E[\hat{f}(x)]$
• Bias measures the difference between what you expect to learn and the truth; it decreases with the complexity of the model

Variance
• Measures the difference between what you expect to learn and what you learn on a particular dataset
• Measures how sensitive the learner is to a specific dataset
• Decreases as we have simpler models

Model complexity
[Figure from Hastie, Tibshirani, Friedman]

LINEAR REGRESSION

Linear regression model
$y = f(x) + \epsilon, \quad \epsilon \sim N(0, \sigma^2)$
$p(y \mid x) = N(f(x), \sigma^2)$
$f(x) = w_0 + \sum_i w_i \phi_i(x)$

Maximum conditional likelihood estimation
$\hat{w} = \arg\min_w \sum_l (y_l - \sum_i w_i \phi_i(x_l))^2$

Matrix of transformed features
[Matrix not included in the text preview]

Linear regression (matrix equation)
$\hat{w} = \arg\min_w (y - Xw)'(y - Xw)$

Final solution
$\hat{w} = (X'X)^{-1} X'y$

BAYES NETS

Head to Tail
P(a,b|c) = P(a|c)P(b|c)

Tail to Tail
P(a,b|c) = P(a|c)P(b|c)
Where have we seen this?

Head to head
P(a,b) = P(a)P(b)

What is the Bayes Network for $X_1, \ldots, X_4$ with no assumed conditional independencies?

Bayes Network for a Hidden Markov Model
Implies the future is conditionally independent of the past, given the present

Bias – Variance tradeoff in Bayes nets
• Give an example of a very biased Bayes network?
  Network with no edges
  Naïve Bayes
• Give an example of a network that has high variance?
  Network of a distribution with no conditional independence
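The head-to-head statement in the Bayes nets section can be checked by brute-force enumeration. The sketch below is not from the recitation: it assumes a three-node network a -> c <- b over binary variables, and the probability tables (`p_a`, `p_b`, `p_c1_given_ab`) are hypothetical numbers chosen only for illustration. It checks that a and b are independent marginally, and that they stop being independent once c is observed.

```python
from itertools import product

# Hypothetical CPTs for a head-to-head network a -> c <- b (binary variables).
p_a = {0: 0.7, 1: 0.3}
p_b = {0: 0.6, 1: 0.4}
p_c1_given_ab = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}


def joint(a, b, c):
    """P(a, b, c) = P(a) P(b) P(c | a, b), the factorization of this network."""
    pc1 = p_c1_given_ab[(a, b)]
    return p_a[a] * p_b[b] * (pc1 if c == 1 else 1.0 - pc1)


def prob(**fixed):
    """Sum the joint over all assignments consistent with the fixed values."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        assignment = {"a": a, "b": b, "c": c}
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(a, b, c)
    return total


def marginally_independent(tol=1e-12):
    """Check P(a, b) = P(a) P(b) for every value of a and b."""
    return all(
        abs(prob(a=a, b=b) - prob(a=a) * prob(b=b)) < tol
        for a, b in product((0, 1), repeat=2)
    )


def conditionally_independent_given_c(c, tol=1e-12):
    """Check P(a, b | c) = P(a | c) P(b | c) for every value of a and b."""
    pc = prob(c=c)
    return all(
        abs(prob(a=a, b=b, c=c) / pc
            - (prob(a=a, c=c) / pc) * (prob(b=b, c=c) / pc)) < tol
        for a, b in product((0, 1), repeat=2)
    )


if __name__ == "__main__":
    print("a, b independent marginally:", marginally_independent())
    print("a, b independent given c=1:", conditionally_independent_given_c(1))
```

Enumeration is only practical for a handful of variables, but it makes the factorization explicit: the joint is built from exactly the tables the graph prescribes, and every independence claim is read off that joint.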

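The least-squares solution and the bias-variance decomposition from the notes can also be tried numerically. The following is a minimal sketch under stated assumptions, not the recitation's own code: the true function `true_f`, the noise level, the sample size, and the query point `x0` are all hypothetical, and polynomial features stand in for the transformed features $\phi_i(x)$. Each fit solves the normal equations $X'X w = X'y$, and repeating the fit over many simulated datasets estimates the squared bias and the variance of the prediction at `x0`.

```python
import random


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit(xs, ys, degree):
    """Least-squares weights for features 1, x, ..., x^degree (normal equations)."""
    X = [[x ** k for k in range(degree + 1)] for x in xs]
    d = degree + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(d)]
    return solve(XtX, Xty)


def predict(w, x):
    return sum(wk * x ** k for k, wk in enumerate(w))


def true_f(x):
    """Hypothetical true function, assumed only for this simulation."""
    return x * x


def bias_variance(degree, x0=0.8, n=20, sigma=0.3, trials=1000, seed=0):
    """Estimate squared bias and variance of the prediction at x0 over datasets."""
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0.0, sigma) for x in xs]
        preds.append(predict(fit(xs, ys, degree), x0))
    mean = sum(preds) / trials
    bias_sq = (mean - true_f(x0)) ** 2
    variance = sum((p - mean) ** 2 for p in preds) / trials
    return bias_sq, variance


if __name__ == "__main__":
    for degree in (0, 2):
        b2, var = bias_variance(degree)
        print(f"degree={degree}  bias^2={b2:.4f}  variance={var:.4f}")
```

In this setup the constant model (degree 0) should show the larger squared bias and the quadratic model the larger variance, which is the tradeoff the notes describe. For larger or ill-conditioned problems a dedicated least-squares routine would be a better choice than solving the normal equations directly.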