Machine Learning – 10-601
Linear Regression and Bias-Variance Tradeoff
Geoff Gordon, Miroslav Dudík
http://www.cs.cmu.edu/~ggordon/10601/
September 21, 2009

Last time: linear regression
• Goal: predict a continuous response from continuous/categorical inputs
• Input: (x1, y1), …, (xN, yN)
• Model: y ≈ Σ_j w_j φ_j(x)
• Performance measure: squared error, Σ_i (y_i − Σ_j w_j φ_j(x_i))²

Linear regression
• Goal: predict a continuous response
• Input: (x1, y1), …, (xN, yN)
• Assume: y ≈ Σ_j w_j φ_j(x)

Linear regression with a polynomial basis
• y ≈ w0 + w1 x + w2 x²
• y ≈ w0 + w1 x + w2 x² + w3 x³ + w4 x⁴ + … + w10 x¹⁰
[Plots: a quadratic fit and a degree-10 fit to the same data]

Training & test error don't match
[Plot: training error keeps falling with the degree of the polynomial, while test error eventually rises]

Training & test error don't match: why?
[Plot: error_train vs. error_true as model complexity grows]
• Training error: overly optimistic
• Test error: approximation of the prediction error, as long as the test data set is never touched during training

Sweet spot for model complexity
[Plot: test error vs. degree of polynomial is U-shaped; its minimum is the sweet spot]

Bias-variance tradeoff
• Bias: faithfulness to the truth
• Variance: sensitivity to randomness in the training data
• true error = bias² + variance + necessary evil (irreducible noise)
• Bias decreases with model complexity
• Variance increases with model complexity and decreases with the size of the data set

true error = bias² + variance + necessary evil
• Model: Y = f(X) + ε, with E[ε] = 0 and Var(ε) = σ²
• Prediction ŷ(x0) at a point x0:
  error(x0) = E[(Y − ŷ(x0))²] = σ² + (E[ŷ(x0)] − f(x0))² + Var(ŷ(x0))
            = noise + bias² + variance

Announcements
• Project proposals due this Wednesday at 10:30am

Least squares fit = max likelihood for Gaussians
• If y = Σ_j w_j φ_j(x) + ε with ε ~ N(0, σ²), maximizing the likelihood of the training data is the same as minimizing the sum of squared errors.

Beyond max likelihood for Gaussians
• Add a prior over w
• Replace the Gaussian by a different model
  – different noise model
  – different support for Y

Regression with a Gaussian prior
[Graphical model: w → Y]
• A Gaussian prior on w penalizes large weights; the MAP estimate is ridge regression.

Bias-variance tradeoff for ridge regression
[Plot: training and test error vs. strength of regularization; test error is minimized at an intermediate strength]
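The slide "Least squares fit = max likelihood for Gaussians" follows from one line of algebra; sketched in the lecture's notation:

```latex
% Gaussian noise model: y_i = \sum_j w_j \phi_j(x_i) + \epsilon_i,
% with \epsilon_i \sim \mathcal{N}(0, \sigma^2), independent.
-\log p(y_1, \dots, y_N \mid w)
  = \frac{1}{2\sigma^2} \sum_{i=1}^{N}
      \Bigl( y_i - \sum_j w_j \phi_j(x_i) \Bigr)^2
    + \frac{N}{2} \log\bigl( 2\pi\sigma^2 \bigr)
```

Since the second term does not depend on w, maximizing the likelihood over w is exactly minimizing the sum of squared errors.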
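The quadratic vs. degree-10 comparison in the slides can be reproduced in a few lines. A minimal sketch, assuming a toy sin target with Gaussian noise (the target function, noise level, and sample sizes are illustrative choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def design(x, degree):
    # Polynomial features phi_j(x) = x**j for j = 0..degree
    return np.vander(x, degree + 1, increasing=True)

def sample(n):
    # Assumed toy data: y = sin(2*pi*x) + Gaussian noise
    x = rng.uniform(0.0, 1.0, n)
    return x, np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=n)

x_tr, y_tr = sample(15)       # small training set
x_te, y_te = sample(1000)     # large held-out test set

errors = {}
for degree in (2, 10):
    # Least-squares fit: w minimizes ||Phi w - y||^2
    w, *_ = np.linalg.lstsq(design(x_tr, degree), y_tr, rcond=None)
    train_mse = np.mean((design(x_tr, degree) @ w - y_tr) ** 2)
    test_mse = np.mean((design(x_te, degree) @ w - y_te) ** 2)
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train {train_mse:.4f}, test {test_mse:.4f}")
```

With only 15 training points, the degree-10 fit drives training error down while its test error typically grows much larger, which is the train/test gap the slides' plots illustrate.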
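The decomposition error(x0) = noise + bias² + variance can be checked empirically by refitting on many independently drawn training sets and recording the prediction at a single point x0. A sketch under the same assumed toy setup (the target function and all constants are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    return np.sin(2 * np.pi * x)   # assumed "true" function f(X)

sigma, degree, n_train, trials = 0.2, 3, 30, 2000
x0 = 0.3                           # point where we measure bias and variance

preds = np.empty(trials)
phi0 = np.array([x0 ** j for j in range(degree + 1)])
for t in range(trials):
    # Fresh training set each trial: Y = f(X) + eps
    x = rng.uniform(0.0, 1.0, n_train)
    y = f(x) + sigma * rng.normal(size=n_train)
    Phi = np.vander(x, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    preds[t] = phi0 @ w

bias_sq = (preds.mean() - f(x0)) ** 2   # squared bias of the estimator at x0
variance = preds.var()                  # variance of the estimator at x0
err0 = np.mean((preds - f(x0)) ** 2)    # mean squared error against f(x0)
print(f"bias^2 {bias_sq:.5f}  variance {variance:.5f}  err {err0:.5f}")
# Against a fresh noisy observation Y at x0, the expected error adds the
# irreducible sigma**2 (the "necessary evil") on top of bias^2 + variance.
```

The mean squared error against f(x0) splits exactly into bias² + variance; against a noisy Y it gains the extra σ² noise term, matching the slide's decomposition.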
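The Gaussian prior w ~ N(0, τ²I) turns the least-squares fit into ridge regression with penalty λ = σ²/τ², which has a closed-form solution. A sketch sweeping the regularization strength (the λ grid, data, and polynomial degree are illustrative choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
DEGREE = 10

def design(x):
    return np.vander(x, DEGREE + 1, increasing=True)

def ridge_fit(x, y, lam):
    # MAP / ridge solution: w = (Phi^T Phi + lam * I)^{-1} Phi^T y
    Phi = design(x)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(DEGREE + 1), Phi.T @ y)

def sample(n):
    # Assumed toy data: y = sin(2*pi*x) + Gaussian noise
    x = rng.uniform(0.0, 1.0, n)
    return x, np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=n)

x_tr, y_tr = sample(20)
x_te, y_te = sample(1000)

errors = {}
for lam in (1e-8, 1e-4, 1e-1, 10.0):
    w = ridge_fit(x_tr, y_tr, lam)
    errors[lam] = (
        np.mean((design(x_tr) @ w - y_tr) ** 2),   # training error
        np.mean((design(x_te) @ w - y_te) ** 2),   # test error
    )
    print(f"lambda {lam:g}: train {errors[lam][0]:.4f}, test {errors[lam][1]:.4f}")
```

Training error can only grow as λ increases, while test error typically traces the U-shape from the slides: high variance at tiny λ, high bias at large λ.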