Purdue CS 59000 - Statistical Machine Learning

Statistical Machine Learning
Lecture 25
Yuan (Alan) Qi

Outline
• Review of Hidden Markov Models, forward-backward algorithm
• EM for learning HMM parameters
• Viterbi Algorithm, Kalman filtering and smoothing
• Rejection Sampling, Importance Sampling, Metropolis-Hastings algorithm, Gibbs sampling

Hidden Markov Models
Many applications, e.g., speech recognition, natural language processing, handwriting recognition, bio-sequence analysis.

From Mixture Models to HMMs
By turning a mixture model into a dynamic model, we obtain the HMM. We model the dependence between two consecutive latent variables by a transition probability p(z_n | z_{n-1}).

HMMs
Prior on the initial latent variable: p(z_1).
Emission probabilities: p(x_n | z_n).
Joint distribution: p(X, Z) = p(z_1) [prod_{n=2..N} p(z_n | z_{n-1})] [prod_{n=1..N} p(x_n | z_n)].

Samples from an HMM
(a) Contours of constant probability density for the emission distributions corresponding to each of the three states of the latent variable. (b) A sample of 50 points drawn from the hidden Markov model, with lines connecting the successive observations.

Inference: Forward-Backward Algorithm
Goal: compute marginals for the latent variables.
The forward-backward algorithm performs exact inference as a special case of the sum-product algorithm on the HMM.
Factor graph representation: the emission density and the transition probability are grouped into a single factor per time step.

Forward-Backward Algorithm as a Message-Passing Method (1)
Forward messages: alpha(z_n) = p(x_1, ..., x_n, z_n), computed recursively from alpha(z_{n-1}) by summing over z_{n-1} against the transition probability and multiplying by the emission probability.

Forward-Backward Algorithm as a Message-Passing Method (2)
Backward messages (Q: how do we compute them?): beta(z_n) = p(x_{n+1}, ..., x_N | z_n).
Note that the messages actually involve the observations X.
Similarly, we can compute the remaining marginal quantities (Q: why?).

Rescaling to Avoid Numerical Underflow
When a sequence is long, the forward message becomes too small to be represented within the dynamic range of the computer. We therefore redefine the forward message as the normalized quantity alpha_hat(z_n) = p(z_n | x_1, ..., x_n), and similarly redefine the backward message as a rescaled version beta_hat(z_n). Then we can compute the required marginals, and the likelihood, from these scaled messages and their scaling factors. See the detailed derivation in the textbook. (A NumPy sketch of the scaled recursions appears at the end of these notes.)

Viterbi Algorithm
• Finds the most probable sequence of states.
• Special case of the max-sum algorithm on the HMM.
What if we want to find the most probable individual states? (A sketch of Viterbi decoding also appears at the end of these notes.)

Maximum Likelihood Estimation for HMMs
Goal: maximize the likelihood p(X | theta) with respect to the parameters theta.
Looks familiar? Remember EM for mixtures of Gaussians... Indeed, the updates are similar.

EM for HMMs
E step: compute the posterior distributions over the latent variables, obtained from the forward-backward / sum-product algorithm.
M step: re-estimate the initial, transition, and emission parameters from these posteriors.

Linear Dynamical Systems
The transition and emission distributions are linear-Gaussian. Equivalently, we have z_n = A z_{n-1} + w_n and x_n = C z_n + v_n, where w_n and v_n are zero-mean Gaussian noise terms.

Kalman Filtering and Smoothing
Inference in linear-Gaussian systems.
Kalman filtering: sequentially update the scaled forward message.
Kalman smoothing: sequentially update the state beliefs based on the scaled forward and backward messages.

Learning in LDS
EM again...

Extensions of HMMs and LDS
Discrete latent variables: factorial HMMs.
Continuous latent variables: switching Kalman filter models.

Sampling Methods
Goal: compute expectations of the form E[f] = integral f(z) p(z) dz.
Challenge: we cannot compute this expectation analytically.
Sampling methods: draw random samples z^(l) from p(z) so that the expectation can be approximated by the sample average of f(z^(l)).

Importance Sampling (1)
Draw samples from a proposal distribution q(z) and correct for the mismatch with importance weights r_l = p(z^(l)) / q(z^(l)).
Discussion: what would be an ideal proposal distribution q?

Importance Sampling (2)
When the normalizing constant of p is unknown, i.e., p(z) = p_tilde(z) / Z_p with Z_p unknown, we use self-normalized weights: each unnormalized weight r_l = p_tilde(z^(l)) / q(z^(l)) is divided by the sum of all the weights. (A sketch appears at the end of these notes.)

Importance Sampling (3)

Sampling and EM
The M step in EM maximizes the expected complete-data log likelihood, an integral over the latent variables. What if we cannot even evaluate this integral? One idea: use a sampling method. This is known as the Monte Carlo EM algorithm.
Imputation-Posterior (IP) algorithm: a modification of EM for Bayesian estimation.

Markov Chain Monte Carlo
Goal: use Markov chains to draw samples from a given distribution.
Idea: set up a Markov chain that converges to the target distribution and draw samples from the chain.
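The lecture presents the scaled forward-backward recursions as equations; as a companion, here is a minimal NumPy sketch for a discrete HMM with K states. The names pi, A, and B (initial distribution, transition matrix, and per-observation emission likelihoods) are illustrative choices, not notation taken from the slides.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Scaled forward-backward for a discrete HMM.

    pi : (K,)   initial state distribution p(z_1)
    A  : (K, K) transition matrix, A[j, k] = p(z_n = k | z_{n-1} = j)
    B  : (N, K) emission likelihoods, B[n, k] = p(x_n | z_n = k)

    Returns the smoothed posteriors gamma[n, k] = p(z_n = k | x_1..x_N),
    the scaling factors c[n] = p(x_n | x_1..x_{n-1}), and log p(X).
    """
    N, K = B.shape
    alpha_hat = np.zeros((N, K))   # p(z_n | x_1..x_n)
    beta_hat = np.zeros((N, K))    # rescaled backward messages
    c = np.zeros(N)                # scaling factors

    # Forward pass, rescaled at every step to avoid numerical underflow.
    a = pi * B[0]
    c[0] = a.sum()
    alpha_hat[0] = a / c[0]
    for n in range(1, N):
        a = B[n] * (alpha_hat[n - 1] @ A)
        c[n] = a.sum()
        alpha_hat[n] = a / c[n]

    # Backward pass, rescaled by the same factors.
    beta_hat[-1] = 1.0
    for n in range(N - 2, -1, -1):
        beta_hat[n] = (A @ (B[n + 1] * beta_hat[n + 1])) / c[n + 1]

    gamma = alpha_hat * beta_hat          # p(z_n | X)
    log_likelihood = np.sum(np.log(c))    # log p(X) = sum_n log c_n
    return gamma, c, log_likelihood
```

From the same quantities, the pairwise posteriors needed in the EM (Baum-Welch) M step can be formed as alpha_hat[n-1][:, None] * A * B[n] * beta_hat[n] / c[n].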

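In the same spirit, a minimal sketch of the Viterbi algorithm (the max-sum recursion, run in log space for numerical stability), reusing the illustrative pi / A / B conventions from the sketch above.

```python
import numpy as np

def viterbi(pi, A, B):
    """Most probable state sequence for a discrete HMM (max-sum in log space).

    pi : (K,)   initial state distribution
    A  : (K, K) transition matrix
    B  : (N, K) emission likelihoods for each observation
    """
    N, K = B.shape
    log_A = np.log(A)
    omega = np.log(pi) + np.log(B[0])      # max-sum messages at step 1
    backptr = np.zeros((N, K), dtype=int)  # argmax bookkeeping for backtracking

    for n in range(1, N):
        scores = omega[:, None] + log_A            # scores[j, k]: best via state j
        backptr[n] = np.argmax(scores, axis=0)
        omega = np.log(B[n]) + np.max(scores, axis=0)

    # Backtrack from the best final state.
    path = np.zeros(N, dtype=int)
    path[-1] = int(np.argmax(omega))
    for n in range(N - 1, 0, -1):
        path[n - 1] = backptr[n, path[n]]
    return path, float(np.max(omega))      # states and log of the max joint probability
```

Taking the per-step argmax of the smoothed marginals gamma from the forward-backward sketch answers the slide's question about the most probable individual states; that answer can differ from, and need not even be a valid path under, the jointly most probable sequence returned here.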

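Finally, a sketch of importance sampling with self-normalized weights for the case discussed in the Importance Sampling slides, where the target is known only up to its normalizing constant. The function names and the Gaussian target/proposal in the usage example are made up purely for illustration.

```python
import numpy as np

def importance_sampling(f, p_tilde, q_sample, q_pdf, num_samples=10000, rng=None):
    """Estimate E_p[f] when p(z) = p_tilde(z) / Z_p and Z_p is unknown.

    f        : function whose expectation under p we want
    p_tilde  : unnormalized target density
    q_sample : draws samples from the proposal q
    q_pdf    : evaluates the proposal density q
    """
    rng = np.random.default_rng() if rng is None else rng
    z = q_sample(num_samples, rng)
    r = p_tilde(z) / q_pdf(z)          # unnormalized importance weights
    w = r / r.sum()                    # self-normalized weights
    return np.sum(w * f(z))

# Illustrative example: unnormalized standard Gaussian target and a wider
# Gaussian proposal; estimate E[z^2], whose true value is 1.
p_tilde = lambda z: np.exp(-0.5 * z**2)
q_sample = lambda n, rng: rng.normal(0.0, 2.0, size=n)
q_pdf = lambda z: np.exp(-0.5 * (z / 2.0)**2) / (2.0 * np.sqrt(2 * np.pi))
print(importance_sampling(lambda z: z**2, p_tilde, q_sample, q_pdf))
```

On the slide's discussion question: the weights behave well when q is close to p and has heavier tails; a poorly matched proposal leaves a few samples dominating the sum of the weights.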