CMU CS 10701 - EM for HMMs a.k.a. The Baum-Welch Algorithm

Recommended reading: Z. Ghahramani, "An Introduction to HMMs and Bayesian Networks", Int. Journal of Pattern Recognition and AI 15(1):9-42, 2001. Especially Section 4.

EM for HMMs, a.k.a. the Baum-Welch Algorithm
Machine Learning 10-701/15-781, Carlos Guestrin, Carnegie Mellon University, April 12th, 2006

Learning HMMs from fully observable data is easy
[Chain diagram: hidden states X_1, ..., X_5, each with domain {a, ..., z}; observations O_1, ..., O_5]
- Learn 3 distributions: the starting state probability P(X_1), the transition probability P(X_t | X_{t-1}), and the observation probability P(O_t | X_t). With fully observed data, each is estimated by simple counting.
- What if O is observed but X is hidden?

Log-likelihood for HMMs when X is hidden
- Marginal likelihood: O is observed, X is missing.
  \ell(\theta; o) = \log P(o \mid \theta) = \log \sum_x P(x, o \mid \theta)
- For simplicity of notation, the training data consists of only one sequence. If there were m sequences:
  \ell(\theta) = \sum_{j=1}^{m} \log \sum_x P(x, o^{(j)} \mid \theta)

Computing the log-likelihood when X is hidden
- The sum over x ranges over exponentially many state sequences, but along the chain it can be computed efficiently with variable elimination.

EM for HMMs when X is hidden
- E step: use inference (the forwards-backwards algorithm).
- M step: recompute the parameters with weighted data.

The E step
- The E step computes the probability of the hidden variables x given o: Q^{(t+1)}(x) = P(x | o, \theta^{(t)}).
- This corresponds to inference; use the forward-backward algorithm.

The M step
- Maximization step: use expected counts instead of counts.
- If learning with observed x would require Count(x, o), use E_{Q^{(t+1)}}[Count(x, o)] instead.

Decomposition of likelihood revisited
- The likelihood optimization decomposes: the expected complete log-likelihood splits into one term per distribution, so P(X_1), P(X_t | X_{t-1}), and P(O_t | X_t) are each maximized separately.

Starting state probability P(X_1)
- Using expected counts:
  P(X_1 = a) = ExCount^{(t+1)}(X_1 = a) / m

Transition probability P(X_t | X_{t-1})
- Using expected counts:
  P(X_t = a | X_{t-1} = b) = ExCount^{(t+1)}(X_t = a, X_{t-1} = b) / ExCount^{(t+1)}(X_{t-1} = b)

Observation probability P(O_t | X_t)
- Using expected counts:
  P(O_t = a | X_t = b) = ExCount^{(t+1)}(O_t = a, X_t = b) / ExCount^{(t+1)}(X_t = b)

The E step revisited
- The E step computes the probability of the hidden variables x given o, but the updates above need only two kinds of quantities:
  - Q(x_t = a | o), the marginal probability of each position;
  - Q(x_{t+1} = a, x_t = b | o), the joint distribution over pairs of adjacent positions.

The forwards-backwards algorithm
- Forwards pass. Initialization:
  \alpha_1(x_1) = P(x_1) P(o_1 \mid x_1)
  For i = 2 to n, generate a forwards factor by eliminating X_{i-1}:
  \alpha_i(x_i) = \sum_{x_{i-1}} \alpha_{i-1}(x_{i-1}) P(x_i \mid x_{i-1}) P(o_i \mid x_i)
- Backwards pass. Initialization:
  \beta_n(x_n) = 1
  For i = n-1 to 1, generate a backwards factor by eliminating X_{i+1}:
  \beta_i(x_i) = \sum_{x_{i+1}} P(x_{i+1} \mid x_i) P(o_{i+1} \mid x_{i+1}) \beta_{i+1}(x_{i+1})
- The position-i probability is:
  Q(x_i \mid o) \propto \alpha_i(x_i) \beta_i(x_i)

The E step revisited
- Q(x_t = a | o), the marginal probability of each position: just forwards-backwards.
- Q(x_{t+1} = a, x_t = b | o), the joint distribution over pairs of adjacent positions: homework.
- A runnable sketch of one full E/M iteration follows below.
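As an illustration, here is a minimal NumPy sketch of one Baum-Welch iteration on a single training sequence. The array conventions (pi, A, B, alpha, beta, gamma, xi) are my own choices, not the slides', and the usual log-space or rescaling tricks needed for long sequences are omitted for readability.

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One EM (Baum-Welch) iteration for a discrete HMM.

    pi:  (K,)   starting state distribution P(X_1)
    A:   (K,K)  transitions, A[b, a] = P(X_t = a | X_{t-1} = b)
    B:   (K,V)  observations, B[a, o] = P(O_t = o | X_t = a)
    obs: length-n sequence of observation-symbol indices
    """
    n, K = len(obs), len(pi)

    # E step, forwards pass: alpha[i, a] = P(o_1..o_i, X_i = a)
    alpha = np.zeros((n, K))
    alpha[0] = pi * B[:, obs[0]]
    for i in range(1, n):
        alpha[i] = (alpha[i - 1] @ A) * B[:, obs[i]]

    # E step, backwards pass: beta[i, a] = P(o_{i+1}..o_n | X_i = a)
    beta = np.zeros((n, K))
    beta[-1] = 1.0
    for i in range(n - 2, -1, -1):
        beta[i] = A @ (B[:, obs[i + 1]] * beta[i + 1])

    # Position marginals Q(x_i | o) and adjacent-pair marginals
    like = alpha[-1].sum()                      # P(o), the marginal likelihood
    gamma = alpha * beta / like                 # (n, K)
    xi = (alpha[:-1, :, None] * A[None] *       # (n-1, K, K): Q(x_i=b, x_{i+1}=a | o)
          (B[:, obs[1:]].T * beta[1:])[:, None, :]) / like

    # M step: expected counts in place of counts, then normalize
    pi_new = gamma[0]                                        # P(X_1)
    A_new = xi.sum(0) / gamma[:-1].sum(0)[:, None]           # P(X_t | X_{t-1})
    B_new = np.zeros_like(B)
    for i, o in enumerate(obs):
        B_new[:, o] += gamma[i]
    B_new /= gamma.sum(0)[:, None]                           # P(O_t | X_t)
    return pi_new, A_new, B_new, np.log(like)
```

For m sequences, one would accumulate the expected counts (gamma[0], the xi sums, and the emission sums) across sequences before normalizing, matching the \sum_j form of the likelihood above; iterating baum_welch_step until the returned log-likelihood stops improving is the full Baum-Welch algorithm.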
What can you do with EM for HMMs? (1) Clustering sequences
- Independent clustering treats each observation on its own; sequence clustering uses the hidden chain, so the transition model ties neighboring assignments together.

What can you do with EM for HMMs? (2) Exploiting unlabeled data
- Labeling data is hard work: save graduate-student time by using both labeled and unlabeled data.
- Labeled data: pairs (x, o). Unlabeled data: o alone.

Exploiting unlabeled data in clustering
- A few data points are labeled (x, o); most points are unlabeled (o).
- In the E step of EM:
  - if the i-th point is unlabeled, compute Q(X | o_i) as usual;
  - if the i-th point is labeled, set Q(X = x | o_i) = 1 and Q(X != x | o_i) = 0.
- M step as usual.

[Figure: 20 Newsgroups data, advantage of adding unlabeled data]
[Figure: 20 Newsgroups data, effect of additional unlabeled data]

Exploiting unlabeled data in HMMs
- Same recipe: a few sequences are labeled (x, o); most are unlabeled (o).
- In the E step, compute Q as usual for the unlabeled sequences and clamp it for the labeled ones; M step as usual.
- Speed up by remembering the counts for the labeled data: their contribution never changes across iterations. (A sketch of the clamped E step follows this section.)
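The clamping is a one-line change to the E step. Below is a minimal sketch, not code from the slides: `posterior` stands for whatever inference routine the model provides (forwards-backwards for an HMM, the mixture posterior for plain clustering), and the labels encoding is a hypothetical convention.

```python
import numpy as np

def e_step_with_labels(points, labels, n_states, posterior):
    """Semi-supervised E step: run inference only for the unlabeled points.

    labels[i] is the known hidden value for point i, or None if unknown;
    posterior(o) returns Q(X | o) under the current parameters.
    """
    Q = np.zeros((len(points), n_states))
    for i, (o, x) in enumerate(zip(points, labels)):
        if x is None:
            Q[i] = posterior(o)   # unlabeled: inference as usual
        else:
            Q[i, x] = 1.0         # labeled: Q(X = x | o) = 1, else 0
    return Q  # the M step consumes Q exactly as in ordinary EM
```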
What you need to know
- Baum-Welch = EM for HMMs:
  - E step: inference using forwards-backwards;
  - M step: use weighted (expected) counts.
- Exploiting unlabeled data:
  - some unlabeled data can help classification;
  - small change to the EM algorithm: in the E step, only run inference for the unlabeled data.

Acknowledgements
- Experiments combining labeled and unlabeled data provided by Tom Mitchell.

EM for Bayes Nets
Machine Learning 10-701/15-781, Carlos Guestrin, Carnegie Mellon University, April 12th, 2006

Data likelihood for BNs
[Example network: Flu -> Sinus <- Allergy, with Sinus -> Nose and Sinus -> Headache]
- Given the structure, the log-likelihood of fully observed data decomposes into one term per CPT, each maximized by simple counting.

Marginal likelihood
- What if S (Sinus) is hidden?

Log-likelihood for BNs with hidden data
- Marginal likelihood: O is observed, H is hidden.
  \ell(\theta; o) = \log P(o \mid \theta) = \log \sum_h P(h, o \mid \theta)

E step for BNs
- The E step computes the probability of the hidden variables h given o: Q^{(t+1)}(h) = P(h | o, \theta^{(t)}).
- This corresponds to inference in the BN.

The M step for BNs
- Maximization step: use expected counts instead of counts.
- If learning requires Count(h, o), use E_{Q^{(t+1)}}[Count(h, o)].

M step for each CPT
- The M step decomposes per CPT.
- Standard M step: MLE using the expected counts. (A one-function sketch follows.)
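As a minimal illustration (my array layout, not the slides'): each CPT's M step simply normalizes its table of expected counts over the child dimension.

```python
import numpy as np

def m_step_cpt(excount):
    """MLE for one CPT from expected counts.

    excount[..., v] = ExCount(parents = ..., child = v); the result is
    P(child = v | parents), obtained by normalizing over the child axis.
    """
    return excount / excount.sum(axis=-1, keepdims=True)
```

For the HMM, this same normalization produced the transition and observation updates above.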
Computing expected counts
- The M step requires expected counts. For a set of variables A, we must compute ExCount(A = a).
- Some of the variables in A will be observed in example j: denote these by A_O = a_O^{(j)}. Some of A will be hidden: denote these by A_H.
- Use inference; the E step computes the expected counts:
  ExCount^{(t+1)}(A_O = a_O^{(j)}, A_H = a_H) += P(A_H = a_H | A_O = a_O^{(j)}, \theta^{(t)})

Data need not be hidden in the same way
- When data is fully observed, a data point is an assignment to every variable.
- When data is partially observed, a data point assigns only the observed variables, e.g., everything but Sinus.
- But the unobserved variables can be different for different data points. Same framework, just change the definition of the expected counts:
  ExCount^{(t+1)}(A_O = …
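To make the expected counts concrete, here is a minimal sketch for the slides' example network with Sinus hidden, assuming all variables are binary; the `theta` dictionary of CPT arrays is a hypothetical parameterization, and inference is brute-force enumeration over the single hidden variable rather than general BN inference.

```python
import numpy as np

def joint(theta, f, a, s, n, h):
    """P(F=f, A=a, S=s, N=n, H=h) under the factorization
    Flu -> Sinus <- Allergy, Sinus -> Nose, Sinus -> Headache."""
    return (theta["F"][f] * theta["A"][a] * theta["S"][f, a, s]
            * theta["N"][s, n] * theta["H"][s, h])

def excount_sinus_cpt(theta, data):
    """Expected counts ExCount(F, A, S) for the Sinus CPT, from data
    points (f, a, n, h) in which Sinus is unobserved."""
    ex = np.zeros((2, 2, 2))                          # indexed [f, a, s]
    for f, a, n, h in data:
        p = np.array([joint(theta, f, a, s, n, h) for s in (0, 1)])
        p /= p.sum()                                  # P(S | observed, theta)
        ex[f, a, :] += p                              # add the soft count
    return ex
```

A fully observed point would instead contribute a hard count of 1 to its (f, a, s) cell, which is how the same framework covers data hidden in different ways; m_step_cpt(ex) from the sketch above then yields the updated P(S | F, A).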