Purdue CS 59000 - Statistical Machine Learning

Statistical Machine Learning
Lecture 25
Yuan (Alan) Qi

Outline
• Review of Hidden Markov Models, forward-backward algorithm
• EM for learning HMM parameters
• Viterbi Algorithm, Kalman filtering and smoothing
• Rejection Sampling, Importance Sampling, Metropolis-Hastings algorithm, Gibbs sampling

Hidden Markov Models
Many applications, e.g., speech recognition, natural language processing, handwriting recognition, bio-sequence analysis.

From Mixture Models to HMMs
By turning a mixture model into a dynamic model, we obtain the HMM. We model the dependence between two consecutive latent variables by a transition probability p(z_n | z_{n-1}).

HMMs
Prior on the initial latent variable: p(z_1).
Emission probabilities: p(x_n | z_n).
Joint distribution: p(X, Z) = p(z_1) [prod_{n=2..N} p(z_n | z_{n-1})] [prod_{n=1..N} p(x_n | z_n)].

Samples from an HMM
(a) Contours of constant probability density for the emission distributions corresponding to each of the three states of the latent variable. (b) A sample of 50 points drawn from the hidden Markov model, with lines connecting the successive observations.

Inference: Forward-Backward Algorithm
Goal: compute marginals for the latent variables.
The forward-backward algorithm performs exact inference as a special case of the sum-product algorithm on the HMM.
Factor graph representation: the emission density and the transition probability are grouped into a single factor per time step.

Forward-Backward Algorithm as a Message-Passing Method (1)
Forward messages: alpha(z_n) = p(x_1, ..., x_n, z_n), computed recursively from alpha(z_{n-1}) by summing over z_{n-1} against the transition probability and multiplying by the emission probability.

Forward-Backward Algorithm as a Message-Passing Method (2)
Backward messages (Q: how do we compute them?): beta(z_n) = p(x_{n+1}, ..., x_N | z_n).
Note that the messages actually involve the observations X.
Similarly, we can compute the remaining marginal quantities (Q: why?).

Rescaling to Avoid Numerical Underflow
When a sequence is long, the forward message becomes too small to be represented within the dynamic range of the computer. We therefore redefine the forward message as the normalized quantity alpha_hat(z_n) = p(z_n | x_1, ..., x_n), and similarly redefine the backward message as a rescaled version beta_hat(z_n). Then we can compute the required marginals, and the likelihood, from these scaled messages and their scaling factors. See the detailed derivation in the textbook. (A NumPy sketch of the scaled recursions appears at the end of these notes.)

Viterbi Algorithm
• Finds the most probable sequence of states.
• Special case of the max-sum algorithm on the HMM.
What if we want to find the most probable individual states? (A sketch of Viterbi decoding also appears at the end of these notes.)

Maximum Likelihood Estimation for HMMs
Goal: maximize the likelihood p(X | theta) with respect to the parameters theta.
Looks familiar? Remember EM for mixtures of Gaussians... Indeed, the updates are similar.

EM for HMMs
E step: compute the posterior distributions over the latent variables, obtained from the forward-backward / sum-product algorithm.
M step: re-estimate the initial, transition, and emission parameters from these posteriors.

Linear Dynamical Systems
The transition and emission distributions are linear-Gaussian. Equivalently, we have z_n = A z_{n-1} + w_n and x_n = C z_n + v_n, where w_n and v_n are zero-mean Gaussian noise terms.

Kalman Filtering and Smoothing
Inference in linear-Gaussian systems.
Kalman filtering: sequentially update the scaled forward message.
Kalman smoothing: sequentially update the state beliefs based on the scaled forward and backward messages.

Learning in LDS
EM again...

Extensions of HMMs and LDS
Discrete latent variables: factorial HMMs.
Continuous latent variables: switching Kalman filter models.

Sampling Methods
Goal: compute expectations of the form E[f] = integral f(z) p(z) dz.
Challenge: we cannot compute this expectation analytically.
Sampling methods: draw random samples z^(l) from p(z) so that the expectation can be approximated by the sample average of f(z^(l)).

Importance Sampling (1)
Draw samples from a proposal distribution q(z) and correct for the mismatch with importance weights r_l = p(z^(l)) / q(z^(l)).
Discussion: what would be an ideal proposal distribution q?

Importance Sampling (2)
When the normalizing constant of p is unknown, i.e., p(z) = p_tilde(z) / Z_p with Z_p unknown, we use self-normalized weights: each unnormalized weight r_l = p_tilde(z^(l)) / q(z^(l)) is divided by the sum of all the weights. (A sketch appears at the end of these notes.)

Importance Sampling (3)

Sampling and EM
The M step in EM maximizes the expected complete-data log likelihood, an integral over the latent variables. What if we cannot even evaluate this integral? One idea: use a sampling method. This is known as the Monte Carlo EM algorithm.
Imputation-Posterior (IP) algorithm: a modification of EM for Bayesian estimation.

Markov Chain Monte Carlo
Goal: use Markov chains to draw samples from a given distribution.
Idea: set up a Markov chain that converges to the target distribution and draw samples from the chain.
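The lecture presents the scaled forward-backward recursions as equations; as a companion, here is a minimal NumPy sketch for a discrete HMM with K states. The names pi, A, and B (initial distribution, transition matrix, and per-observation emission likelihoods) are illustrative choices, not notation taken from the slides.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Scaled forward-backward for a discrete HMM.

    pi : (K,)   initial state distribution p(z_1)
    A  : (K, K) transition matrix, A[j, k] = p(z_n = k | z_{n-1} = j)
    B  : (N, K) emission likelihoods, B[n, k] = p(x_n | z_n = k)

    Returns the smoothed posteriors gamma[n, k] = p(z_n = k | x_1..x_N),
    the scaling factors c[n] = p(x_n | x_1..x_{n-1}), and log p(X).
    """
    N, K = B.shape
    alpha_hat = np.zeros((N, K))   # p(z_n | x_1..x_n)
    beta_hat = np.zeros((N, K))    # rescaled backward messages
    c = np.zeros(N)                # scaling factors

    # Forward pass, rescaled at every step to avoid numerical underflow.
    a = pi * B[0]
    c[0] = a.sum()
    alpha_hat[0] = a / c[0]
    for n in range(1, N):
        a = B[n] * (alpha_hat[n - 1] @ A)
        c[n] = a.sum()
        alpha_hat[n] = a / c[n]

    # Backward pass, rescaled by the same factors.
    beta_hat[-1] = 1.0
    for n in range(N - 2, -1, -1):
        beta_hat[n] = (A @ (B[n + 1] * beta_hat[n + 1])) / c[n + 1]

    gamma = alpha_hat * beta_hat          # p(z_n | X)
    log_likelihood = np.sum(np.log(c))    # log p(X) = sum_n log c_n
    return gamma, c, log_likelihood
```

From the same quantities, the pairwise posteriors needed in the EM (Baum-Welch) M step can be formed as alpha_hat[n-1][:, None] * A * B[n] * beta_hat[n] / c[n].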

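In the same spirit, a minimal sketch of the Viterbi algorithm (the max-sum recursion, run in log space for numerical stability), reusing the illustrative pi / A / B conventions from the sketch above.

```python
import numpy as np

def viterbi(pi, A, B):
    """Most probable state sequence for a discrete HMM (max-sum in log space).

    pi : (K,)   initial state distribution
    A  : (K, K) transition matrix
    B  : (N, K) emission likelihoods for each observation
    """
    N, K = B.shape
    log_A = np.log(A)
    omega = np.log(pi) + np.log(B[0])      # max-sum messages at step 1
    backptr = np.zeros((N, K), dtype=int)  # argmax bookkeeping for backtracking

    for n in range(1, N):
        scores = omega[:, None] + log_A            # scores[j, k]: best via state j
        backptr[n] = np.argmax(scores, axis=0)
        omega = np.log(B[n]) + np.max(scores, axis=0)

    # Backtrack from the best final state.
    path = np.zeros(N, dtype=int)
    path[-1] = int(np.argmax(omega))
    for n in range(N - 1, 0, -1):
        path[n - 1] = backptr[n, path[n]]
    return path, float(np.max(omega))      # states and log of the max joint probability
```

Taking the per-step argmax of the smoothed marginals gamma from the forward-backward sketch answers the slide's question about the most probable individual states; that answer can differ from, and need not even be a valid path under, the jointly most probable sequence returned here.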

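Finally, a sketch of importance sampling with self-normalized weights for the case discussed in the Importance Sampling slides, where the target is known only up to its normalizing constant. The function names and the Gaussian target/proposal in the usage example are made up purely for illustration.

```python
import numpy as np

def importance_sampling(f, p_tilde, q_sample, q_pdf, num_samples=10000, rng=None):
    """Estimate E_p[f] when p(z) = p_tilde(z) / Z_p and Z_p is unknown.

    f        : function whose expectation under p we want
    p_tilde  : unnormalized target density
    q_sample : draws samples from the proposal q
    q_pdf    : evaluates the proposal density q
    """
    rng = np.random.default_rng() if rng is None else rng
    z = q_sample(num_samples, rng)
    r = p_tilde(z) / q_pdf(z)          # unnormalized importance weights
    w = r / r.sum()                    # self-normalized weights
    return np.sum(w * f(z))

# Illustrative example: unnormalized standard Gaussian target and a wider
# Gaussian proposal; estimate E[z^2], whose true value is 1.
p_tilde = lambda z: np.exp(-0.5 * z**2)
q_sample = lambda n, rng: rng.normal(0.0, 2.0, size=n)
q_pdf = lambda z: np.exp(-0.5 * (z / 2.0)**2) / (2.0 * np.sqrt(2 * np.pi))
print(importance_sampling(lambda z: z**2, p_tilde, q_sample, q_pdf))
```

On the slide's discussion question: the weights behave well when q is close to p and has heavier tails; a poorly matched proposal leaves a few samples dominating the sum of the weights.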