Expectation Maximization Algorithm
Rong Jin

Outline: A Mixture Model Problem; Gaussian Mixture Model (GMM); EM Algorithm for GMM; Start with A Random Guess; E-step; M-Step; At the 5-th/10-th/20-th/50-th/100-th Iteration; EM as A Bound Optimization; Logarithm Bound Algorithm; Log-Likelihood of EM Alg.; Maximize GMM Model; Identify Hidden Variables; EM Algorithm for A Translation Model; Compute Pr(e|c); Bound Optimization for A Translation Model; Iterative Scaling; Faster Iterative Scaling; Bad News; Comparing Improved Iterative Scaling to Newton's Method

A Mixture Model Problem
- Apparently, the dataset consists of two modes.
- How can we automatically identify the two modes?
[Figure: histogram of the binned data (x from 0 to 25, counts up to 20) with two visible modes]

Gaussian Mixture Model (GMM)
- Assume the dataset is generated by two mixed Gaussian distributions:
  Gaussian model 1: $\theta_1 = \{\mu_1, \sigma_1; p_1\}$
  Gaussian model 2: $\theta_2 = \{\mu_2, \sigma_2; p_2\}$
- If we knew the membership of each bin, estimating the two Gaussian models would be easy.
- How can we estimate the two Gaussian models without knowing the memberships of the bins?

EM Algorithm for GMM
- Let the memberships be hidden variables:
  $\{x_1, x_2, \ldots, x_n\} \rightarrow \{(x_1, m_1), (x_2, m_2), \ldots, (x_n, m_n)\}$
- Unknown memberships: $m_1, m_2, \ldots, m_n$
- Unknown Gaussian models: $\theta_1 = \{\mu_1, \sigma_1; p_1\}$ and $\theta_2 = \{\mu_2, \sigma_2; p_2\}$
- Learn these two sets of parameters iteratively.

Start with A Random Guess
- Randomly assign memberships to each bin.
- Estimate the mean and variance of each Gaussian model from that assignment.
[Figure: random memberships (values between 0 and 1) and the resulting Gaussian fits over the histogram]

E-step
- Fix the two Gaussian models.
- Estimate the posterior for each data point:

$$p(m=1 \mid x) = \frac{p(x, m=1)}{p(x)} = \frac{p(x, \theta_1)}{p(x, \theta_1) + p(x, \theta_2)} = \frac{p_1\, p(x \mid \mu_1, \sigma_1)}{p_1\, p(x \mid \mu_1, \sigma_1) + p_2\, p(x \mid \mu_2, \sigma_2)}$$

$$p(m=2 \mid x) = \frac{p(x, m=2)}{p(x)} = \frac{p_2\, p(x \mid \mu_2, \sigma_2)}{p_1\, p(x \mid \mu_1, \sigma_1) + p_2\, p(x \mid \mu_2, \sigma_2)}$$

where

$$p(x \mid \mu_k, \sigma_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\!\left( -\frac{(x - \mu_k)^2}{2\sigma_k^2} \right), \qquad k = 1, 2.$$

M-Step
- Fix the memberships (the posteriors from the E-step).
- Re-estimate the two Gaussian models, weighted by the posteriors, by maximizing

$$l = \sum_{i=1}^n \left\{ \hat{p}(m=1 \mid x_i) \log p(x_i, \theta_1) + \hat{p}(m=2 \mid x_i) \log p(x_i, \theta_2) \right\}
  = \sum_{i=1}^n \left\{ \hat{p}(m=1 \mid x_i) \left[ \log p_1 + \log p(x_i \mid \mu_1, \sigma_1) \right] + \hat{p}(m=2 \mid x_i) \left[ \log p_2 + \log p(x_i \mid \mu_2, \sigma_2) \right] \right\}$$

which gives the closed-form updates

$$p_1 = \frac{1}{n} \sum_{i=1}^n \hat{p}(m=1 \mid x_i), \qquad
  \mu_1 = \frac{\sum_{i=1}^n \hat{p}(m=1 \mid x_i)\, x_i}{\sum_{i=1}^n \hat{p}(m=1 \mid x_i)}, \qquad
  \sigma_1^2 = \frac{\sum_{i=1}^n \hat{p}(m=1 \mid x_i)\, x_i^2}{\sum_{i=1}^n \hat{p}(m=1 \mid x_i)} - \mu_1^2$$

and likewise for $p_2$, $\mu_2$, $\sigma_2^2$ with $\hat{p}(m=2 \mid x_i)$.

EM Algorithm for GMM
- Re-estimate the memberships for each bin (E-step).
- Re-estimate the models (M-step), and repeat.
[Figure: updated memberships and Gaussian fits after one iteration]
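The slides show this procedure only graphically. Below is a minimal NumPy sketch of the loop just described (random memberships, then alternating M- and E-steps); the function name `em_gmm`, the initialization details, and the synthetic two-mode data are illustrative assumptions, not from the deck.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Gaussian density p(x | mu, sigma)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def em_gmm(x, n_iter=100, seed=0):
    """EM for a two-component 1-D Gaussian mixture.

    Starts from random memberships (as in the slides), then alternates
    the M-step (re-estimate p, mu, sigma from the posteriors) and the
    E-step (re-estimate the posteriors from the models).
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    post = rng.random(n)                  # random guess for p(m=1 | x_i)
    post = np.stack([post, 1.0 - post])   # shape (2, n): one row per component
    for _ in range(n_iter):
        # M-step: posterior-weighted mixing weights, means, and variances.
        w = post.sum(axis=1)              # effective counts per component
        p = w / n
        mu = post @ x / w
        sigma = np.sqrt(post @ x ** 2 / w - mu ** 2)
        # E-step: posterior membership of each point under the current models.
        joint = p[:, None] * normal_pdf(x[None, :], mu[:, None], sigma[:, None])
        post = joint / joint.sum(axis=0)
    return p, mu, sigma

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data with two modes, loosely mimicking the slides' histogram.
    x = np.concatenate([rng.normal(7.0, 1.5, 300), rng.normal(17.0, 2.0, 200)])
    print(em_gmm(x))
```

The iteration snapshots that follow trace exactly this loop on the slides' dataset.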
At the 5-th Iteration
- The red Gaussian component slowly shifts toward the left end of the x axis.

At the 10-th Iteration
- The red Gaussian component is still slowly shifting toward the left end of the x axis.

At the 20-th Iteration
- The red Gaussian component makes a more noticeable shift toward the left end of the x axis.

At the 50-th Iteration
- The red Gaussian component is close to the desired location.

At the 100-th Iteration
- The results are almost identical to those at the 50-th iteration.
[Figures: at each of these iterations, the histogram with the two fitted Gaussians and the membership curves]

EM as A Bound Optimization
- The EM algorithm in fact maximizes the log-likelihood function of the training data.
- Likelihood of a data point $x$:

$$p(x) = p(x, \theta_1) + p(x, \theta_2) = p_1\, p(x \mid \mu_1, \sigma_1) + p_2\, p(x \mid \mu_2, \sigma_2)$$

with $p(x \mid \mu_k, \sigma_k)$ the Gaussian density defined above.
- Log-likelihood of the training data:

$$l = \sum_{i=1}^n \log p(x_i) = \sum_{i=1}^n \log \left\{ p_1\, p(x_i \mid \mu_1, \sigma_1) + p_2\, p(x_i \mid \mu_2, \sigma_2) \right\}$$

…
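The "Logarithm Bound Algorithm" slides listed in the outline are cut off in this extract, but the bound-optimization claim rests on a standard step worth stating: for any distribution $q(m \mid x_i)$ over the hidden membership, Jensen's inequality turns the log of a sum into a tractable lower bound. The derivation below is a sketch of that standard argument, not text recovered from the missing slides.

```latex
% Lower bound on the GMM log-likelihood via Jensen's inequality.
% q(m | x_i) is any distribution over the hidden membership m of point x_i.
\begin{align*}
l(\theta)
  &= \sum_{i=1}^n \log \sum_{m=1}^2 p_m\, p(x_i \mid \mu_m, \sigma_m) \\
  &= \sum_{i=1}^n \log \sum_{m=1}^2 q(m \mid x_i)\,
       \frac{p_m\, p(x_i \mid \mu_m, \sigma_m)}{q(m \mid x_i)} \\
  &\ge \sum_{i=1}^n \sum_{m=1}^2 q(m \mid x_i)\,
       \log \frac{p_m\, p(x_i \mid \mu_m, \sigma_m)}{q(m \mid x_i)}
  \qquad \text{(Jensen, since $\log$ is concave).}
\end{align*}
% The bound is tight when q(m | x_i) equals the posterior p(m | x_i)
% computed in the E-step; the M-step maximizes the bound over
% (p_m, mu_m, sigma_m), so no EM iteration can decrease l(theta).
```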
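As a quick numeric check of that monotonicity claim (my own illustration, reusing the synthetic data and update rules from the earlier sketch, not code from the deck):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(7.0, 1.5, 300), rng.normal(17.0, 2.0, 200)])

def log_likelihood(x, p, mu, sigma):
    """l = sum_i log { p1 N(x_i|mu1,s1) + p2 N(x_i|mu2,s2) }."""
    dens = np.exp(-(x - mu[:, None]) ** 2 / (2 * sigma[:, None] ** 2)) \
           / (np.sqrt(2 * np.pi) * sigma[:, None])
    return float(np.log((p[:, None] * dens).sum(axis=0)).sum())

post = rng.random(len(x))              # random initial memberships
post = np.stack([post, 1.0 - post])
prev = -np.inf
for it in range(1, 101):
    # M-step, then evaluate l; a full E+M cycle must not decrease l.
    w = post.sum(axis=1)
    p, mu = w / len(x), post @ x / w
    sigma = np.sqrt(post @ x ** 2 / w - mu ** 2)
    ll = log_likelihood(x, p, mu, sigma)
    assert ll >= prev - 1e-6, "EM decreased the log-likelihood"
    prev = ll
    # E-step: posteriors under the updated models.
    joint = p[:, None] * np.exp(-(x - mu[:, None]) ** 2 / (2 * sigma[:, None] ** 2)) \
            / (np.sqrt(2 * np.pi) * sigma[:, None])
    post = joint / joint.sum(axis=0)
    if it in (5, 10, 20, 50, 100):
        print(f"iteration {it:3d}: log-likelihood = {ll:.3f}")
```

The printed values should be non-decreasing and eventually plateau, echoing the slides' observation that the fit barely changes between the 50-th and 100-th iterations.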