10601 Machine Learning
October 12, 2011
Mladen Kolar

Outline
• Bias-variance tradeoff
• Linear regression
• Bayes networks

BIAS-VARIANCE TRADEOFF

Applet for least squares
http://mste.illinois.edu/users/exner/java.f/leastsquares/

Decomposition of error
Assume
$Y = f(x) + \epsilon$
Generalization error
$\mathrm{err}(x_0) = E[(Y - \hat{f}(X))^2 \mid X = x_0]$
$\mathrm{err}(x_0) = \underbrace{\sigma^2}_{\text{unavoidable error}} + \big(\underbrace{E_D[\hat{f}(x_0) - f(x_0)]}_{\text{bias}}\big)^2 + \underbrace{\mathrm{Var}_D(\hat{f}(x_0))}_{\text{variance}}$

Bias
• Suppose that we have multiple datasets with n samples
• On each dataset we learn $\hat{f}(x)$
• On average (over different datasets) we learn $E[\hat{f}(x)]$
• Bias measures the difference between what you expect to learn and the truth
  – decreases with complexity of the model

Variance
• Measures the difference between what you expect to learn and what you learn on a particular dataset
• Measures how sensitive the learner is to a specific dataset
• Decreases as we have simpler models

Model complexity
[Figure: Hastie, Tibshirani, Friedman]

LINEAR REGRESSION

Linear regression model
$y = f(x) + \epsilon, \quad \epsilon \sim N(0, \sigma^2)$
$p(y \mid x) = N(f(x), \sigma^2)$
$f(x) = w_0 + \sum_i w_i \phi_i(x)$
Maximum conditional likelihood estimation
$\hat{w} = \arg\min_w \sum_l \big(y_l - \sum_i w_i \phi_i(x_l)\big)^2$

Linear regression (matrix equation)
X: matrix of transformed features
$\hat{w} = \arg\min_w (y - Xw)'(y - Xw)$
Final solution
$\hat{w} = (X'X)^{-1} X'y$

BAYES NETS

Head to tail
$P(a, b \mid c) = P(a \mid c)\,P(b \mid c)$

Tail to tail
$P(a, b \mid c) = P(a \mid c)\,P(b \mid c)$
Where have we seen this?

Head to head
$P(a, b) = P(a)\,P(b)$

What is the Bayes network for $X_1, \ldots, X_4$ with no assumed conditional independencies?

Bayes network for a Hidden Markov Model
Implies the future is conditionally independent of the past, given the present

Bias-variance tradeoff in Bayes nets
• Give an example of a very biased Bayes network.
  – Network with no edges
  – Naïve Bayes
• Give an example of a network that has high variance.
  – Network of a distribution with no conditional independence
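
Worked sketch: least squares and the error decomposition

The least squares solution and the bias and variance terms of the error decomposition can be tried out numerically. The following is only a minimal sketch in Python with NumPy, not code from the lecture. It assumes a polynomial basis $\phi_i(x) = x^i$, a hypothetical true function $f(x) = \sin(\pi x)$, inputs drawn uniformly from [-1, 1], and arbitrary choices of noise level and sample sizes. The function names (`design_matrix`, `fit_least_squares`, `bias_variance_at`) are made up for illustration.

```python
import numpy as np


def design_matrix(x, degree):
    """Matrix of transformed features: column i holds phi_i(x) = x**i (phi_0 = 1)."""
    return np.vander(x, degree + 1, increasing=True)


def fit_least_squares(X, y):
    """Minimize (y - Xw)'(y - Xw); this equals (X'X)^{-1} X'y when X'X is invertible."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def bias_variance_at(x0, degree, f, sigma, n=30, n_datasets=500, seed=0):
    """Estimate squared bias and variance of f_hat(x0) over many simulated datasets."""
    rng = np.random.default_rng(seed)
    phi0 = design_matrix(np.array([x0]), degree)
    preds = np.empty(n_datasets)
    for d in range(n_datasets):
        # Draw a fresh dataset of n samples: y = f(x) + eps, eps ~ N(0, sigma^2)
        x = rng.uniform(-1.0, 1.0, size=n)
        y = f(x) + rng.normal(0.0, sigma, size=n)
        w = fit_least_squares(design_matrix(x, degree), y)
        preds[d] = (phi0 @ w)[0]
    bias_sq = (preds.mean() - f(x0)) ** 2  # (E_D[f_hat(x0)] - f(x0))^2
    variance = preds.var()                 # Var_D(f_hat(x0))
    return bias_sq, variance


def true_f(x):
    # Hypothetical "truth" chosen only for this illustration
    return np.sin(np.pi * x)


for degree in (1, 3, 9):
    b2, var = bias_variance_at(x0=0.5, degree=degree, f=true_f, sigma=0.3)
    print(f"degree={degree}: bias^2={b2:.4f}, variance={var:.4f}")
```

Using `np.linalg.lstsq` rather than explicitly inverting $X'X$ is a design choice for numerical stability. The two agree whenever $X'X$ is well conditioned. Running the loop with different degrees gives a rough feel for how the squared bias and variance estimates move as the model gets more complex.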