MIT 6.867 - Lecture Notes

Machine learning: lecture 12
Tommi S. Jaakkola
MIT AI Lab
[email protected]

• Density estimation
  – Parametric, mixture models
  – Estimation via the EM algorithm
  – Examples

Why density estimation

The digits again...
• Possible uses:
  – understanding the generation process of examples
  – clustering
  – classification via class-conditional densities
  – inference based on incomplete observations

Parametric density models

• Probability model = a parameterized family of probability distributions
• Example: a simple multivariate Gaussian model

  p(x|\mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right\}

• This is a generative model in the sense that we can generate x's

Sampling from a Gaussian

• 1-dimensional Gaussian probability density function (pdf) p(x|\mu, \sigma^2) and the corresponding cumulative distribution function (cdf) F_{\mu,\sigma^2}(x)

  [Figure: the pdf p(X = x) and the cdf P(X \le x) of a 1-dimensional Gaussian.]

• To draw a sample from a Gaussian, we can invert the cumulative distribution function

  F_{\mu,\sigma^2}(x) = \int_{-\infty}^{x} p(z|\mu, \sigma^2)\, dz

  u \sim \mathrm{Uniform}(0, 1) \;\Rightarrow\; x = F_{\mu,\sigma^2}^{-1}(u) \sim p(x|\mu, \sigma^2)

Multi-variate Gaussian samples

• A multivariate sample can be constructed from multiple independent one-dimensional Gaussian samples:

  z_i \sim p(z_i|\mu = 0, \sigma^2 = 1), \quad z = [z_1, \ldots, z_d]^T

  x = \Sigma^{1/2} z + \mu

  In this case x \sim p(x|\mu, \Sigma).
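To make the two sampling steps above concrete (inverse-CDF sampling of independent standard normals, then the affine map x = \Sigma^{1/2} z + \mu), here is a minimal NumPy/SciPy sketch. The function name sample_gaussian and the use of a Cholesky factor as the matrix square root are illustrative choices, not part of the original notes.

```python
import numpy as np
from scipy.stats import norm

def sample_gaussian(mu, Sigma, n_samples, rng=None):
    """Draw n_samples points from N(mu, Sigma) following the slides:
    z_i ~ N(0, 1) via inverse-CDF sampling, then x = Sigma^{1/2} z + mu."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    d = mu.shape[0]
    # u ~ Uniform(0, 1), z = F^{-1}(u) gives independent standard normals.
    u = rng.uniform(size=(n_samples, d))
    z = norm.ppf(u)
    # Any matrix square root of Sigma works; the Cholesky factor L
    # (Sigma = L L^T) is the usual choice.
    L = np.linalg.cholesky(Sigma)
    return z @ L.T + mu        # each row is one sample x ~ N(mu, Sigma)

# Example: 500 samples from a correlated 2-d Gaussian.
mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])
X = sample_gaussian(mu, Sigma, 500)
```

In practice one would usually call rng.multivariate_normal(mu, Sigma, n_samples) directly; the explicit version above just mirrors the two steps described on the slides.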
Multi-variate density estimation

• A mixture of Gaussians model

  p(x|\theta) = \sum_{j=1}^{k} p_j\, p(x|\mu_j, \Sigma_j)

  where \theta = \{p_1, \ldots, p_k, \mu_1, \ldots, \mu_k, \Sigma_1, \ldots, \Sigma_k\} contains all the parameters of the mixture model. \{p_j\} are known as mixing proportions or coefficients.

Mixture density

• Data generation process:

  [Figure: the generative process, in which a component y \in \{1, 2\} is drawn from P(y) and then x is drawn from the component density p(x|y).]

  p(x|\theta) = \sum_{j=1,2} P(y = j) \cdot p(x|y = j)   (generic mixture)
              = \sum_{j=1,2} p_j \cdot p(x|\mu_j, \Sigma_j)   (mixture of Gaussians)

• Any data point x could have been generated in two ways
• If we are given just x, we don't know which mixture component this example came from

  p(x|\theta) = \sum_{j=1,2} p_j\, p(x|\mu_j, \Sigma_j)

• We can evaluate the posterior probability that an observed x was generated from the first mixture component

  P(y = 1|x, \theta) = \frac{P(y = 1) \cdot p(x|y = 1)}{\sum_{j=1,2} P(y = j) \cdot p(x|y = j)} = \frac{p_1\, p(x|\mu_1, \Sigma_1)}{\sum_{j=1,2} p_j\, p(x|\mu_j, \Sigma_j)}

• This solves a credit assignment problem

Mixture density: posterior sampling

• Consider sampling x from the mixture density, then y from the posterior over the components given x, and finally x' from the component density indicated by y:

  x \sim p(x|\theta)
  y \sim P(y|x, \theta)
  x' \sim p(x'|y, \theta)

  Is y a fair sample from the prior distribution P(y)?
  Is x' a fair sample from the mixture density p(x'|\theta)?

Mixture density estimation

• Suppose we want to estimate a two-component mixture of Gaussians model

  p(x|\theta) = p_1\, p(x|\mu_1, \Sigma_1) + p_2\, p(x|\mu_2, \Sigma_2)

• If each example x_i in the training set were labeled y_i = 1, 2 according to which mixture component (1 or 2) had generated it, then the estimation would be easy.
• Labeled examples ⇒ no credit assignment problem
• When examples are already assigned to mixture components (labeled), we can estimate each Gaussian independently. If \hat{n}_j is the number of examples labeled j, then for each j = 1, 2 we set

  \hat{p}_j \leftarrow \frac{\hat{n}_j}{n}

  \hat{\mu}_j \leftarrow \frac{1}{\hat{n}_j} \sum_{i: y_i = j} x_i

  \hat{\Sigma}_j \leftarrow \frac{1}{\hat{n}_j} \sum_{i: y_i = j} (x_i - \hat{\mu}_j)(x_i - \hat{\mu}_j)^T

Mixture density estimation: credit assignment

• Of course we don't have such labels, but we can guess what the labels might be based on our current mixture distribution.
• We get soft labels, or posterior probabilities of which Gaussian generated which example:

  \hat{p}(j|i) \leftarrow P(y_i = j|x_i, \theta), \quad \text{where } \sum_{j=1,2} \hat{p}(j|i) = 1 \text{ for all } i = 1, \ldots, n

  [Figure: two nearly identical Gaussians fit to the same 2-d data set.]

• When the Gaussians are almost identical (as in the figure), \hat{p}(1|i) \approx \hat{p}(2|i) for almost any available point x_i. Even slight differences can help us determine how we should modify the Gaussians.

The EM algorithm

E-step: softly assign examples to mixture components

  \hat{p}(j|i) \leftarrow P(y_i = j|x_i, \theta), \quad \text{for all } j = 1, 2 \text{ and } i = 1, \ldots, n

M-step: re-estimate the parameters (separately for the two Gaussians) based on the soft assignments.

  \hat{n}_j \leftarrow \sum_{i=1}^{n} \hat{p}(j|i) = \text{the soft number of examples labeled } j

  \hat{p}_j \leftarrow \frac{\hat{n}_j}{n}

  \hat{\mu}_j \leftarrow \frac{1}{\hat{n}_j} \sum_{i=1}^{n} \hat{p}(j|i)\, x_i

  \hat{\Sigma}_j \leftarrow \frac{1}{\hat{n}_j} \sum_{i=1}^{n} \hat{p}(j|i)\, (x_i - \hat{\mu}_j)(x_i - \hat{\mu}_j)^T

Mixture density estimation: example

  [Figures: successive EM iterations on a 2-d data set; the two Gaussian components gradually separate and tighten around the clusters.]

The EM algorithm cont'd

• Each iteration of the EM algorithm monotonically increases the (log-)likelihood of the n training examples x_1, \ldots, x_n:

  \log p(\text{data}|\theta) = \sum_{i=1}^{n} \log \overbrace{\left[ p_1\, p(x_i|\mu_1, \Sigma_1) + p_2\, p(x_i|\mu_2, \Sigma_2) \right]}^{p(x_i|\theta)}

  where \theta = \{p_1, p_2, \mu_1, \mu_2, \Sigma_1, \Sigma_2\} contains all the parameters of the mixture model.

  [Figure: the log-likelihood as a function of the EM iteration; it increases monotonically and levels off.]

Demo

Classification example

• A digit recognition problem (8x8 binary digits)
  Training set: n = 100 (50 examples of each digit).
  Test set: n = 400 (200 examples of each digit).
• We estimate a mixture of Gaussians model separately for each type of digit
  Class 1: P(x|\theta_1)  (e.g., a 3-component mixture density)
  Class 0: P(x|\theta_0)  (e.g., a 3-component mixture density)
• Assuming the examples in each class are equally likely a priori, we will classify new examples x according to

  \text{Class} = 1 \text{ if } \log \frac{P(x|\hat{\theta}_1)}{P(x|\hat{\theta}_0)} > 0, \text{ and Class} = 0 \text{ otherwise}

Classification example cont'd

• The figure gives the number of misclassified examples on the test set as a function of the number of mixture components in each class-conditional model.

  [Figure: test-set misclassification count vs. number of mixture components per class.]

• Anything wrong with this figure?
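To tie the E-step and M-step updates given earlier into one loop, here is a minimal NumPy/SciPy sketch of EM for a two-component Gaussian mixture on an n-by-d data matrix. The function name em_two_gaussians, the initialization, the small ridge on the covariances, and the fixed iteration count are illustrative assumptions, not anything prescribed in the notes.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_two_gaussians(X, n_iter=30, seed=0):
    """Fit p(x) = p1 N(x|mu1, Sigma1) + p2 N(x|mu2, Sigma2) by EM."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)

    # Initialization: equal mixing proportions, two random data points as
    # means, and the (slightly regularized) overall data covariance for both.
    p = np.array([0.5, 0.5])
    mu = X[rng.choice(n, size=2, replace=False)].copy()
    Sigma = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * 2)

    for _ in range(n_iter):
        # E-step: soft assignments p_hat(j|i) = P(y_i = j | x_i, theta).
        lik = np.column_stack([
            p[j] * multivariate_normal.pdf(X, mean=mu[j], cov=Sigma[j])
            for j in range(2)
        ])                                           # shape (n, 2)
        post = lik / lik.sum(axis=1, keepdims=True)

        # M-step: re-estimate the parameters from the soft assignments.
        n_hat = post.sum(axis=0)                     # soft counts n_hat_j
        p = n_hat / n
        for j in range(2):
            mu[j] = post[:, j] @ X / n_hat[j]
            diff = X - mu[j]
            # Small ridge keeps Sigma_j well conditioned.
            Sigma[j] = (post[:, j, None] * diff).T @ diff / n_hat[j] \
                       + 1e-6 * np.eye(d)
    return p, mu, Sigma
```

In practice one would also monitor the log-likelihood \sum_i \log p(x_i|\theta), which should increase monotonically across iterations as on the convergence plot above, and stop once it levels off.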

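For the class-conditional classification rule above, here is a small sketch of how fitted mixtures might be used to label a new example; the helper names mixture_loglik and classify are hypothetical, and any EM routine (such as the two-component sketch above, or one with more components per class as in the slides) could supply the fitted parameters.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_loglik(x, p, mu, Sigma):
    """log p(x|theta) for a fitted Gaussian mixture with parameters (p, mu, Sigma)."""
    density = sum(p[j] * multivariate_normal.pdf(x, mean=mu[j], cov=Sigma[j])
                  for j in range(len(p)))
    return np.log(density)

def classify(x, theta1, theta0):
    """Class = 1 if log P(x|theta1) - log P(x|theta0) > 0, else Class = 0."""
    return 1 if mixture_loglik(x, *theta1) > mixture_loglik(x, *theta0) else 0

# Hypothetical usage: theta1 = em_two_gaussians(X_class1) and
# theta0 = em_two_gaussians(X_class0), then label a new example x
# with classify(x, theta1, theta0).
```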
