Stanford CS 262 - Hidden Markov Models

Slide titles:
Hidden Markov Models
Definition of a hidden Markov model
The three main questions on HMMs
Today
Problem 1: Decoding
Decoding
Decoding – main idea
The Viterbi Algorithm
Slide 9
Viterbi Algorithm – a practical detail
Example
Problem 2: Evaluation
Generating a sequence by the model
A couple of questions
Evaluation
The Forward Algorithm
The Forward Algorithm – derivation
Slide 18
Relation between Forward and Viterbi
Motivation for the Backward Algorithm
The Backward Algorithm – derivation
The Backward Algorithm
Computational Complexity
Posterior Decoding
Slide 25
Maximum Weight Trace
A modeling Example
Example: CpG Islands
Slide 29
A model of CpG Islands – (1) Architecture

Hidden Markov Models
Lecture 5, Tuesday April 15, 2003

Definition of a hidden Markov model

Definition: A hidden Markov model (HMM)
• Alphabet Σ = { b_1, b_2, …, b_M }
• Set of states Q = { 1, …, K }
• Transition probabilities between any two states
  a_ij = transition probability from state i to state j
  a_i1 + … + a_iK = 1, for all states i = 1…K
• Start probabilities a_0i
  a_01 + … + a_0K = 1
• Emission probabilities within each state
  e_k(b) = P( x_i = b | π_i = k )
  e_k(b_1) + … + e_k(b_M) = 1, for all states k = 1…K

The three main questions on HMMs

1. Evaluation
   GIVEN a HMM M and a sequence x,
   FIND Prob[ x | M ]
2. Decoding
   GIVEN a HMM M and a sequence x,
   FIND the sequence π of states that maximizes P[ x, π | M ]
3. Learning
   GIVEN a HMM M with unspecified transition/emission probabilities, and a sequence x,
   FIND parameters θ = (e_i(·), a_ij) that maximize P[ x | θ ]

Today
• Decoding
• Evaluation

Problem 1: Decoding

Find the best parse of a sequence.

Decoding

GIVEN x = x_1 x_2 … x_N,
we want to find π = π_1, …, π_N such that P[ x, π ] is maximized:
  π* = argmax_π P[ x, π ]
We can use dynamic programming!
Let V_k(i) = max_{π_1,…,π_{i-1}} P[ x_1…x_{i-1}, π_1, …, π_{i-1}, x_i, π_i = k ]
           = probability of the most likely sequence of states ending at state π_i = k
[figure: trellis of the K states at each position x_1, x_2, x_3, …, x_N]

Decoding – main idea

Given that, for all states k and a fixed position i,
  V_k(i) = max_{π_1,…,π_{i-1}} P[ x_1…x_{i-1}, π_1, …, π_{i-1}, x_i, π_i = k ],
what is V_l(i+1)?
From the definition,
V_l(i+1) = max_{π_1,…,π_i} P[ x_1…x_i, π_1, …, π_i, x_{i+1}, π_{i+1} = l ]
         = max_{π_1,…,π_i} P( x_{i+1}, π_{i+1} = l | x_1…x_i, π_1, …, π_i ) P[ x_1…x_i, π_1, …, π_i ]
         = max_{π_1,…,π_i} P( x_{i+1}, π_{i+1} = l | π_i ) P[ x_1…x_{i-1}, π_1, …, π_{i-1}, x_i, π_i ]
         = max_k P( x_{i+1}, π_{i+1} = l | π_i = k ) max_{π_1,…,π_{i-1}} P[ x_1…x_{i-1}, π_1, …, π_{i-1}, x_i, π_i = k ]
         = e_l(x_{i+1}) max_k a_kl V_k(i)

The Viterbi Algorithm

Input: x = x_1 … x_N
Initialization:
  V_0(0) = 1   (0 is the imaginary first position)
  V_k(0) = 0, for all k > 0
Iteration:
  V_j(i) = e_j(x_i) × max_k a_kj V_k(i-1)
  Ptr_j(i) = argmax_k a_kj V_k(i-1)
Termination:
  P(x, π*) = max_k V_k(N)
Traceback:
  π*_N = argmax_k V_k(N)
  π*_{i-1} = Ptr_{π_i}(i)

The Viterbi Algorithm

Similar to "aligning" a set of states to a sequence.
Time:  O(K²N)
Space: O(KN)
[figure: K×N dynamic-programming matrix over x_1 x_2 x_3 … x_N, one row per state, entry V_j(i)]

Viterbi Algorithm – a practical detail

Underflows are a significant problem:
  P[ x_1, …, x_i, π_1, …, π_i ] = a_{0 π_1} a_{π_1 π_2} … a_{π_{i-1} π_i} e_{π_1}(x_1) … e_{π_i}(x_i)
These numbers become extremely small – underflow.
Solution: take the logs of all values:
  V_l(i) = log e_l(x_i) + max_k [ V_k(i-1) + log a_kl ]
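As a concrete illustration of the recurrence and the log-space fix above, here is a minimal Viterbi sketch in Python. It is not part of the slides: the function name `viterbi` and the dictionary-based model representation (`start_p`, `trans_p`, `emit_p`) are choices made here, and the dishonest-casino parameters in the usage example follow the lecture's numbers (0.95 probability of staying in the same state, fair die uniform on 1–6, loaded die emitting 6 with probability 1/2 and 1–5 with probability 1/10 each) with an assumed uniform start.

```python
import math


def viterbi(x, states, start_p, trans_p, emit_p):
    """Log-space Viterbi: returns the most likely state path pi* and log P(x, pi*).

    x       : list of observed symbols x_1..x_N
    states  : list of hidden state names
    start_p : dict  k -> a_0k
    trans_p : dict  (k, l) -> a_kl
    emit_p  : dict  (k, b) -> e_k(b)
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    N = len(x)
    V = [{} for _ in range(N)]    # V[i][k] holds log V_k(i+1)
    Ptr = [{} for _ in range(N)]  # traceback pointers

    # Initialization (position 1): log V_k(1) = log a_0k + log e_k(x_1)
    for k in states:
        V[0][k] = log(start_p[k]) + log(emit_p[(k, x[0])])
        Ptr[0][k] = None

    # Iteration: V_l(i) = log e_l(x_i) + max_k [ V_k(i-1) + log a_kl ]
    for i in range(1, N):
        for l in states:
            best_k = max(states, key=lambda k: V[i - 1][k] + log(trans_p[(k, l)]))
            V[i][l] = log(emit_p[(l, x[i])]) + V[i - 1][best_k] + log(trans_p[(best_k, l)])
            Ptr[i][l] = best_k

    # Termination and traceback: pi*_N = argmax_k V_k(N), pi*_{i-1} = Ptr_{pi*_i}(i)
    last = max(states, key=lambda k: V[N - 1][k])
    path = [last]
    for i in range(N - 1, 0, -1):
        path.append(Ptr[i][path[-1]])
    path.reverse()
    return path, V[N - 1][last]


# Usage: the dishonest-casino example (start probabilities assumed uniform).
states = ["F", "L"]
start_p = {"F": 0.5, "L": 0.5}
trans_p = {("F", "F"): 0.95, ("F", "L"): 0.05,
           ("L", "L"): 0.95, ("L", "F"): 0.05}
emit_p = {("F", b): 1 / 6 for b in "123456"}
emit_p.update({("L", b): 1 / 10 for b in "12345"})
emit_p[("L", "6")] = 0.5

path, log_joint = viterbi(list("123456123456626364656"), states, start_p, trans_p, emit_p)
print("".join(path), log_joint)
```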
Example

Let x be a sequence with a portion of ~1/6 6s, followed by a portion of ~1/2 6s:
  x = 123456123456…12345 6626364656…1626364656
Then it is not hard to show that the optimal parse is (exercise):
  FFF…………………F LLL…………………L
Six characters "123456"
  parsed as F, contribute 0.95^6 × (1/6)^6 = 1.6×10^-5
  parsed as L, contribute 0.95^6 × (1/2)^1 × (1/10)^5 = 0.4×10^-5
"162636"
  parsed as F, contribute 0.95^6 × (1/6)^6 = 1.6×10^-5
  parsed as L, contribute 0.95^6 × (1/2)^3 × (1/10)^3 = 9.0×10^-5

Problem 2: Evaluation

Find the likelihood that a sequence is generated by the model.

Generating a sequence by the model

Given a HMM, we can generate a sequence of length n as follows:
1. Start at state π_1 according to probability a_{0 π_1}
2. Emit letter x_1 according to probability e_{π_1}(x_1)
3. Go to state π_2 according to probability a_{π_1 π_2}
4. … until emitting x_n
[figure: trellis of the K states at positions x_1, x_2, x_3, …, x_n, with the start probability a_02 and emission e_2(x_1) marked]

A couple of questions

Given a sequence x,
• What is the probability that x was generated by the model?
• Given a position i, what is the most likely state that emitted x_i?

Example: the dishonest casino
Say x = 12341623162616364616234161221341
Most likely path: π = FF……F
However: marked letters are more likely to be L than unmarked letters.

Evaluation

We will develop algorithms that allow us to compute:
  P(x)             Probability of x given the model
  P(x_i…x_j)       Probability of a substring of x given the model
  P(π_i = k | x)   Probability that the i-th state is k, given x
A more refined measure of which states x may be in.

The Forward Algorithm

We want to calculate
  P(x) = probability of x, given the HMM.
Sum over all possible ways of generating x:
  P(x) = Σ_π P(x, π) = Σ_π P(x | π) P(π)
To avoid summing over an exponential number of paths π, define
  f_k(i) = P(x_1…x_i, π_i = k)   (the forward probability)

The Forward Algorithm – derivation

Define the forward probability:
f_l(i) = P(x_1…x_i, π_i = l)
       = Σ_{π_1…π_{i-1}} P(x_1…x_{i-1}, π_1, …, π_{i-1}, π_i = l) e_l(x_i)
       = Σ_k Σ_{π_1…π_{i-2}} P(x_1…x_{i-1}, π_1, …, π_{i-2}, π_{i-1} = k) a_kl e_l(x_i)
       = e_l(x_i) Σ_k f_k(i-1) a_kl

The Forward Algorithm

We can compute f_k(i) for all k, i using dynamic programming!
Initialization:
  f_0(0) = 1
  f_k(0) = 0, for all k > 0
Iteration:
  f_l(i) = e_l(x_i) Σ_k f_k(i-1) a_kl
Termination:
  P(x) = Σ_k f_k(N) a_k0
where a_k0 is the probability that the terminating state is k (usually a_k0 = a_0k).

Relation between Forward and Viterbi

VITERBI
Initialization:
  V_0(0) = 1
  V_k(0) = 0, for all k > 0
Iteration:
  V_j(i) = e_j(x_i) max_k V_k(i-1) a_kj
Termination:
  P(x, π*) = max_k V_k(N)

FORWARD
Initialization:
  f_0(0) = 1
  f_k(0) = 0, for all k > 0
Iteration:
  f_l(i) = e_l(x_i) Σ_k f_k(i-1) a_kl
Termination:
  P(x) = Σ_k f_k(N) a_k0

Motivation for the Backward Algorithm

We want to compute
  P(π_i = k | x),
the probability distribution on the i-th position, given x.
We start by computing
P(π_i = k, x) = P(x_1…x_i, π_i = k, x_{i+1}…x_N)
              = P(x_1…x_i, π_i = k) P(x_{i+1}…x_N | x_1…x_i, π_i = k)
              = P(x_1…x_i, π_i = k) P(x_{i+1}…x_N | π_i = k)
(the first factor is the forward probability f_k(i); the second is the backward probability b_k(i))

The Backward Algorithm – derivation

Define the backward probability:
b_k(i) = P(x_{i+1}…x_N | π_i = k)
       = Σ_l e_l(x_{i+1}) a_kl b_l(i+1)
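A matching sketch of the forward recursion, again not from the slides, using the same dictionary-based representation as the Viterbi sketch above; `forward` and `end_p` are names introduced here. The `end_p` argument plays the role of a_k0 in the termination step and, following the slide's remark that a_k0 is usually a_0k, defaults to the start probabilities. For long sequences the same underflow problem discussed for Viterbi applies, so in practice the sums would be scaled or computed in log space.

```python
def forward(x, states, start_p, trans_p, emit_p, end_p=None):
    """Forward algorithm: returns (f, P(x)), where f[i][k] = P(x_1..x_{i+1}, pi_{i+1} = k).

    end_p[k] plays the role of a_k0 in the termination step; the slides note it is
    usually equal to a_0k, so it defaults to start_p.
    """
    if end_p is None:
        end_p = start_p
    N = len(x)
    f = [{} for _ in range(N)]

    # Initialization (position 1): f_k(1) = a_0k * e_k(x_1)
    for k in states:
        f[0][k] = start_p[k] * emit_p[(k, x[0])]

    # Iteration: f_l(i) = e_l(x_i) * sum_k f_k(i-1) * a_kl
    for i in range(1, N):
        for l in states:
            f[i][l] = emit_p[(l, x[i])] * sum(f[i - 1][k] * trans_p[(k, l)]
                                              for k in states)

    # Termination: P(x) = sum_k f_k(N) * a_k0
    px = sum(f[N - 1][k] * end_p[k] for k in states)
    return f, px
```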


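Completing the picture, a sketch of the backward recursion and of the posterior P(π_i = k | x) = f_k(i) b_k(i) / P(x), which follows from the factorization on the "Motivation for the Backward Algorithm" slide after dividing by P(x). This is not from the slides either: it reuses `forward` from the sketch above and the same assumed model representation, and initializing b_k(N) to the end probability a_k0 is a choice made here to stay consistent with the forward termination.

```python
def backward(x, states, trans_p, emit_p, end_p):
    """Backward algorithm: b[i][k] = P(x_{i+2}..x_N, end | pi_{i+1} = k)."""
    N = len(x)
    b = [{} for _ in range(N)]

    # Initialization at the last position: b_k(N) = a_k0, the same end probability
    # used in the forward termination, so that f_k(i) * b_k(i) = P(x, pi_i = k).
    for k in states:
        b[N - 1][k] = end_p[k]

    # Iteration: b_k(i) = sum_l e_l(x_{i+1}) * a_kl * b_l(i+1)
    for i in range(N - 2, -1, -1):
        for k in states:
            b[i][k] = sum(emit_p[(l, x[i + 1])] * trans_p[(k, l)] * b[i + 1][l]
                          for l in states)
    return b


def posterior(x, states, start_p, trans_p, emit_p, end_p=None):
    """Posterior decoding table: P(pi_i = k | x) = f_k(i) * b_k(i) / P(x)."""
    if end_p is None:
        end_p = start_p
    f, px = forward(x, states, start_p, trans_p, emit_p, end_p)
    b = backward(x, states, trans_p, emit_p, end_p)
    return [{k: f[i][k] * b[i][k] / px for k in states} for i in range(len(x))]
```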