15-381: Artificial Intelligence
Regression and Cross Validation

Where we are
• Inputs → Classifier → predict a category ✓
• Inputs → Density Estimator → predict a probability ✓
• Inputs → Regressor → predict a real number ← today

Linear regression
• Given an input x we would like to compute an output y.
• For example:
  - predict height from age
  - predict Google's stock price from Yahoo's stock price
  - predict distance from a wall from sensor readings
• In linear regression we assume that y and x are related by the equation
    y = wx + ε
  where w is a parameter and ε represents measurement or other noise.
  [Figure: scatter of observed (x, y) values; the line y = wx is what we are trying to predict.]

Least squares
• Our goal is to estimate w from training data of <x_i, y_i> pairs.
• This can be done using a least squares approach:
    w = argmin_w Σ_i (y_i − w x_i)^2
• Why least squares?
  - It minimizes the squared distance between the measurements and the predicted line.
  - It has a nice probabilistic interpretation: if the noise ε is Gaussian with mean 0, then least squares is also the maximum likelihood estimate of w.
  - It is easy to compute.

Solving linear regression
• You should be familiar with this by now …
• We just take the derivative with respect to w and set it to 0:
    ∂/∂w Σ_i (y_i − w x_i)^2 = −2 Σ_i x_i (y_i − w x_i) = 0
    ⟹ Σ_i x_i y_i = w Σ_i x_i^2
    ⟹ w = Σ_i x_i y_i / Σ_i x_i^2

Regression examples
• Generated with w = 2, noise std = 1: recovered w = 2.03
• Generated with w = 2, noise std = 2: recovered w = 2.05
• Generated with w = 2, noise std = 4: recovered w = 2.08
• The estimate drifts slightly further from the true value as the noise grows, but least squares remains accurate.

Affine regression
• So far we assumed that the line passes through the origin. What if it does not?
• No problem: simply change the model to
    y = w_0 + w_1 x + ε
• We can still use least squares to determine w_0 and w_1:
    w_1 = Σ_i x_i (y_i − w_0) / Σ_i x_i^2
    w_0 = (Σ_i y_i − w_1 Σ_i x_i) / n
• Just a second, we will soon give a simpler solution.

Multivariate regression
• What if we have several inputs? For example, the stock prices of Yahoo, Microsoft, and eBay for the Google prediction task.
• This becomes a multivariate regression problem.
• Again, it is easy to model:
    y = w_0 + w_1 x_1 + … + w_k x_k + ε

Notation:
• lower case: variable or parameter (w_0)
• lower case bold: vector (w)
• upper case bold: matrix (X)

Multivariate regression: least squares
• We are now interested in a vector w^T = [w_0, w_1, …, w_k].
• It is useful to represent the problem in matrix notation:

        [ 1  x_11  …  x_1k ]        [ y_1 ]
    X = [ 1  x_21  …  x_2k ],   y = [ y_2 ]
        [ ⋮    ⋮         ⋮ ]        [  ⋮  ]
        [ 1  x_n1  …  x_nk ]        [ y_n ]

• We can thus re-write our model as y = Xw + ε.
• The solution turns out to be
    w = (X^T X)^{-1} X^T y
• This is an instance of a larger set of computational solutions usually referred to as "generalized least squares".
• X^T X is a (k+1) × (k+1) matrix and X^T y is a vector with k+1 entries (the extra dimension accounts for the intercept column of ones).
• Why is (X^T X)^{-1} X^T y the right solution? Hint: multiply both sides of the original equation y = Xw by (X^T X)^{-1} X^T.

Beyond linear regression
• We can also generalize these classes of functions to be non-linear functions of the inputs x while remaining linear in the parameters w. For example, polynomial regression:
    f(x, w) = w_0 + w_1 x + w_2 x^2 + … + w_m x^m
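The closed-form solutions above are easy to check numerically. Below is a minimal NumPy sketch, not part of the lecture itself; the function names and synthetic data are illustrative. It fits both the through-the-origin model and the affine model via the normal equations.

```python
# A minimal NumPy sketch of the closed-form solutions above; the function
# names and synthetic data are illustrative, not from the lecture.
import numpy as np

rng = np.random.default_rng(0)

def fit_through_origin(x, y):
    """Least-squares slope of y = w*x:  w = sum(x_i * y_i) / sum(x_i^2)."""
    return np.sum(x * y) / np.sum(x * x)

def fit_least_squares(X, y):
    """Solve the normal equations (X^T X) w = X^T y for w."""
    # Solving the linear system is cheaper and numerically safer than
    # explicitly forming the inverse (X^T X)^{-1}.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Recreate the slides' experiment: y = 2x plus Gaussian noise.
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 * x + rng.normal(0.0, 1.0, size=100)
print(fit_through_origin(x, y))    # close to 2 for noise std = 1

# Affine case: prepend a column of ones so w[0] plays the role of w_0.
X = np.column_stack([np.ones_like(x), x])
print(fit_least_squares(X, y))     # approximately [0, 2]
```

Note the design choice in fit_least_squares: rather than computing (X^T X)^{-1} and multiplying, it solves the linear system directly, which is the standard numerically stable way to evaluate the same formula.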
Polynomial regression examples
[Figure: fits of increasing polynomial order to the same data set; higher orders track the training points more and more closely.]

Overfitting
• With too few training examples our polynomial regression model may achieve zero training error but nevertheless have a large generalization error:
    (1/n) Σ_{i=1}^n (y_i − f(x_i; w))^2 ≈ 0   while   E_{(x,y)~P} (y − f(x; w))^2 >> 0
• When the training error no longer bears any relation to the generalization error we say that the function overfits the (training) data.
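The following small sketch, again assuming NumPy, reproduces the effect described above: a degree-9 polynomial fit to 10 points from a noisy linear relationship interpolates the training set almost exactly yet predicts fresh samples poorly. The degree and sample sizes are arbitrary demo choices.

```python
# A small illustration of the overfitting effect described above, assuming
# NumPy; the polynomial degree and sample sizes are arbitrary demo choices.
import numpy as np

rng = np.random.default_rng(1)

def poly_design(x, m):
    """Design matrix with columns 1, x, x^2, ..., x^m."""
    return np.vander(x, m + 1, increasing=True)

def fit_poly(x, y, m):
    """Least-squares polynomial fit; lstsq tolerates the near-singular
    design matrices that high orders produce."""
    return np.linalg.lstsq(poly_design(x, m), y, rcond=None)[0]

def mse(x, y, w):
    """Mean squared error of the polynomial with coefficients w."""
    return np.mean((y - poly_design(x, len(w) - 1) @ w) ** 2)

# The true relationship is linear; we fit a degree-9 polynomial to 10 points.
x_train = rng.uniform(-1, 1, size=10)
y_train = 2.0 * x_train + rng.normal(0.0, 0.5, size=10)
w = fit_poly(x_train, y_train, m=9)

x_test = rng.uniform(-1, 1, size=1000)
y_test = 2.0 * x_test + rng.normal(0.0, 0.5, size=1000)

print("training error:", mse(x_train, y_train, w))   # essentially zero
print("test error:    ", mse(x_test, y_test, w))     # typically much larger
```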
Cross validation
• Cross-validation allows us to estimate the generalization error based on training examples alone.
• We learn a model using a subset of the training data and estimate the generalization error using the rest of the data.
• We choose the model (for example, the polynomial order) that minimizes the error on the held-out data.

Cross validation: common strategies
- leave-one-out cross validation
- leave out a bigger subset
- separate train and test sets
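As a concrete illustration of the held-out-data idea, here is a minimal leave-one-out cross validation sketch for choosing the polynomial order. It assumes NumPy, and the quadratic data-generating process and the range of candidate orders are demo choices, not from the lecture.

```python
# A minimal sketch of leave-one-out cross validation for choosing the
# polynomial order; the quadratic data and order range are demo choices.
import numpy as np

rng = np.random.default_rng(2)

def loo_error(x, y, m):
    """Average squared error over n folds, each holding out one point."""
    n = len(x)
    errs = []
    for i in range(n):
        keep = np.arange(n) != i
        X = np.vander(x[keep], m + 1, increasing=True)
        w = np.linalg.lstsq(X, y[keep], rcond=None)[0]
        pred = np.vander(x[i:i + 1], m + 1, increasing=True) @ w
        errs.append((y[i] - pred[0]) ** 2)
    return np.mean(errs)

# Data generated from a noisy quadratic.
x = rng.uniform(-1, 1, size=30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.3, size=30)

scores = {m: loo_error(x, y, m) for m in range(1, 8)}
best = min(scores, key=scores.get)
print(scores)
print("selected polynomial order:", best)
```

With data generated from a quadratic, orders below 2 underfit and high orders overfit, so the leave-one-out error is typically minimized at order 2.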
