# MIT 9.520 - Bayesian Interpretations of Regularization (48 pages)

## Bayesian Interpretations of Regularization

Lecture Notes

- Pages: 48
- School: Massachusetts Institute of Technology
- Course: 9.520 - Statistical Learning Theory and Applications

Charlie Frogner, 9.520 Class 20, April 21, 2010

### The Plan

Regularized least squares maps $\{(x_i, y_i)\}_{i=1}^n$ to a function that minimizes the regularized loss:

$$f_S = \arg\min_{f \in \mathcal{H}} \frac{1}{2} \sum_{i=1}^n (y_i - f(x_i))^2 + \frac{\lambda}{2} \|f\|_{\mathcal{H}}^2$$

Can we interpret RLS from a probabilistic point of view?

### Some notation

- $S = \{(x_i, y_i)\}_{i=1}^n$ is the set of observed input/output pairs in $\mathbb{R}^d \times \mathbb{R}$ (the training set).
- $X$ and $Y$ denote the matrices $[x_1, \ldots, x_n]^T \in \mathbb{R}^{n \times d}$ and $[y_1, \ldots, y_n]^T \in \mathbb{R}^n$, respectively.
- $\theta$ is a vector of parameters in $\mathbb{R}^p$.
- $p(Y \mid X, \theta)$ is the joint distribution over outputs $Y$ given inputs $X$ and the parameters.

### Where do probabilities show up?

$$\frac{1}{2} \sum_{i=1}^n V(y_i, f(x_i)) + \frac{\lambda}{2} \|f\|_{\mathcal{H}}^2$$

becomes

$$p(Y \mid f, X) \cdot p(f)$$

- Likelihood, a.k.a. noise model: $p(Y \mid f, X)$
  - Gaussian: $y_i \sim \mathcal{N}(f(x_i), \sigma_i^2)$
  - Poisson: $y_i \sim \mathrm{Pois}(f(x_i))$
- Prior: $p(f)$

### Estimation

The estimation problem:

- Given data $\{(x_i, y_i)\}_{i=1}^N$ and model $p(Y \mid f, X)$, $p(f)$.
- Find a good $f$ to explain the data.

### The Plan

- Maximum likelihood estimation for ERM
- MAP estimation for linear RLS
- MAP estimation for kernel RLS
- Transductive model
- Infinite dimensions get more complicated

### Maximum likelihood estimation

- Given data $\{(x_i, y_i)\}_{i=1}^N$ and model $p(Y \mid f, X)$, $p(f)$.
- A good $f$ is one that maximizes $p(Y \mid f, X)$.

### Maximum likelihood and least squares

For least squares, the noise model is

$$y_i \mid f, x_i \sim \mathcal{N}(f(x_i), \sigma^2)$$

a.k.a.

$$Y \mid f, X \sim \mathcal{N}(f(X), \sigma^2 I)$$

So

$$p(Y \mid f, X) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^N (y_i - f(x_i))^2 \right\}$$
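The link between the Gaussian noise model and least squares can be checked numerically. The sketch below is illustrative only and is not taken from the slides: it assumes a linear model $f(x) = x^T w$, synthetic data, and a known noise level, and the names `gaussian_log_likelihood`, `least_squares_fit`, and `w_true` are hypothetical. Under these assumptions the log-likelihood is a constant minus the sum of squared residuals divided by $2\sigma^2$, so the least squares solution should also be the maximum likelihood solution.

```python
import numpy as np


def gaussian_log_likelihood(w, X, Y, sigma):
    """log p(Y | f, X) for f(x) = x @ w with noise y_i ~ N(f(x_i), sigma^2)."""
    N = X.shape[0]
    residuals = Y - X @ w
    # Log of the Gaussian density: normalizing constant minus scaled squared error.
    return (-0.5 * N * np.log(2 * np.pi * sigma**2)
            - residuals @ residuals / (2 * sigma**2))


def least_squares_fit(X, Y):
    """Minimizer of sum_i (y_i - x_i @ w)^2, i.e. ERM with the square loss."""
    w, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return w


# Synthetic data (an assumption made for illustration only).
rng = np.random.default_rng(0)
N, d, sigma = 200, 3, 0.5
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(N, d))
Y = X @ w_true + sigma * rng.normal(size=N)

w_ls = least_squares_fit(X, Y)
ll_at_ls = gaussian_log_likelihood(w_ls, X, Y, sigma)

# Random perturbations of the least squares solution should never
# achieve a higher likelihood than the solution itself.
perturbed = [
    gaussian_log_likelihood(w_ls + 0.1 * rng.normal(size=d), X, Y, sigma)
    for _ in range(100)
]
ls_is_ml = all(ll_at_ls >= ll for ll in perturbed)
print("least squares solution maximizes the likelihood:", ls_is_ml)
```

The value of `sigma` shifts and rescales the log-likelihood but does not move its maximizer in `w`, which is why the unregularized least squares fit needs no knowledge of the noise level.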
