Stanford CS 262 - Hidden Markov Models

Contents:
• Review of Last Lecture
• Time Warping
• Definition of a hidden Markov model
• The three main questions on HMMs
• Today
• Problem 1: Decoding
• Decoding – main idea
• The Viterbi Algorithm
• Viterbi Algorithm – a practical detail
• Example
• Problem 2: Evaluation
• Generating a sequence by the model
• A couple of questions
• Evaluation
• The Forward Algorithm
• The Forward Algorithm – derivation
• Relation between Forward and Viterbi
• Motivation for the Backward Algorithm
• The Backward Algorithm – derivation
• The Backward Algorithm
• Computational Complexity
• Posterior Decoding
• A modeling Example
• Example: CpG Islands
• A model of CpG Islands – (1) Architecture

Hidden Markov Models
Lecture 5, Tuesday April 15, 2003

Review of Last Lecture
(Lecture 2, Thursday April 3, 2003)

Time Warping

Definition: σ(u), τ(v) are connected by an approximate continuous time warping (u0, v0) if u0, v0 are strictly increasing functions on [0, T], and σ(u0(t)) ≈ τ(v0(t)) for 0 ≤ t ≤ T.

[Figure: the warping functions u0(t) and v0(t) plotted over t in [0, T].]

Time Warping

[Figure: grid of points (u, v), with u = 0, 1, 2, …, M and v = 0, 1, 2, …, N, through which a warping path steps.]

Define possible steps: Δ(u, v) is the possible difference of u and v between steps h−1 and h:

  Δ(u, v) ∈ { (1, 0), (1, 1), (0, 1) }

Definition of a hidden Markov model

Definition: A hidden Markov model (HMM) consists of:
• Alphabet Σ = { b1, b2, …, bM }
• Set of states Q = { 1, …, K }
• Transition probabilities between any two states:
    aij = transition probability from state i to state j
    ai1 + … + aiK = 1, for all states i = 1…K
• Start probabilities a0i:
    a01 + … + a0K = 1
• Emission probabilities within each state:
    ek(b) = P( xi = b | πi = k )
    ek(b1) + … + ek(bM) = 1, for all states k = 1…K

[Figure: state diagram over states 1, 2, …, K.]

The three main questions on HMMs

1. Evaluation
   GIVEN an HMM M and a sequence x,
   FIND P[ x | M ]

2. Decoding
   GIVEN an HMM M and a sequence x,
   FIND the sequence π of states that maximizes P[ x, π | M ]

3. Learning
   GIVEN an HMM M with unspecified transition/emission probabilities, and a sequence x,
   FIND parameters θ = (ei(·), aij) that maximize P[ x | θ ]
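
The worked examples later in these notes use a two-state "dishonest casino" HMM (a Fair-die state F and a Loaded-die state L). As a concrete illustration of the definition above, the sketch below packages such a model's parameters as plain Python dictionaries. The 0.95/0.05 stay/switch probabilities and the emissions (1/6 for every face under F; 1/2 for a six and 1/10 for every other face under L) are the values used in the example below; the uniform start probabilities are an assumption, since a0i is not given for this model in the visible text.

```python
# Minimal parameter container for a dishonest-casino HMM: states F (fair die)
# and L (loaded die). Start probabilities are ASSUMED uniform.

states = ["F", "L"]                        # Q = {1, ..., K}, here K = 2
alphabet = ["1", "2", "3", "4", "5", "6"]  # Sigma = {b1, ..., bM}

a0 = {"F": 0.5, "L": 0.5}                  # start probabilities a_0i (assumption)

a = {                                      # transition probabilities a_ij
    "F": {"F": 0.95, "L": 0.05},
    "L": {"F": 0.05, "L": 0.95},
}

e = {                                      # emission probabilities e_k(b)
    "F": {b: 1.0 / 6 for b in alphabet},
    "L": {b: (0.5 if b == "6" else 0.1) for b in alphabet},
}

# Sanity checks: the start vector and each row of a and e must sum to 1.
assert abs(sum(a0.values()) - 1) < 1e-12
for k in states:
    assert abs(sum(a[k].values()) - 1) < 1e-12
    assert abs(sum(e[k].values()) - 1) < 1e-12
```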

Today
• Decoding
• Evaluation

Problem 1: Decoding

Find the best parse of a sequence.

Decoding

GIVEN x = x1 x2 …… xN,
we want to find π = π1, ……, πN such that P[ x, π ] is maximized:

  π* = argmaxπ P[ x, π ]

We can use dynamic programming! Let

  Vk(i) = max{π1,…,πi−1} P[ x1…xi−1, π1, …, πi−1, xi, πi = k ]
        = probability of the most likely sequence of states ending at state πi = k

[Figure: trellis of states 1…K at each of the positions x1, x2, x3, …, xN, with one state path highlighted.]

Decoding – main idea

Given that, for all states k and for a fixed position i,

  Vk(i) = max{π1,…,πi−1} P[ x1…xi−1, π1, …, πi−1, xi, πi = k ],

what is Vl(i+1)? From the definition,

  Vl(i+1) = max{π1,…,πi} P[ x1…xi, π1, …, πi, xi+1, πi+1 = l ]
          = max{π1,…,πi} P( xi+1, πi+1 = l | x1…xi, π1, …, πi ) P[ x1…xi, π1, …, πi ]
          = max{π1,…,πi} P( xi+1, πi+1 = l | πi ) P[ x1…xi−1, π1, …, πi−1, xi, πi ]
          = maxk P( xi+1, πi+1 = l | πi = k ) max{π1,…,πi−1} P[ x1…xi−1, π1, …, πi−1, xi, πi = k ]
          = el(xi+1) × maxk akl Vk(i)

The Viterbi Algorithm

Input: x = x1 …… xN

Initialization:
  V0(0) = 1   (0 is the imaginary first position)
  Vk(0) = 0, for all k > 0

Iteration:
  Vj(i) = ej(xi) × maxk akj Vk(i−1)
  Ptrj(i) = argmaxk akj Vk(i−1)

Termination:
  P(x, π*) = maxk Vk(N)

Traceback:
  πN* = argmaxk Vk(N)
  πi−1* = Ptrπi*(i)

The Viterbi Algorithm

Similar to "aligning" a set of states to a sequence.

Time:  O(K^2 N)
Space: O(K N)

[Figure: the K × N dynamic-programming matrix Vj(i), states 1…K by positions x1 x2 x3 … xN.]

Viterbi Algorithm – a practical detail

Underflows are a significant problem:

  P[ x1, …, xi, π1, …, πi ] = a0π1 aπ1π2 …… aπi−1πi × eπ1(x1) …… eπi(xi)

These numbers become extremely small – underflow.

Solution: take the logs of all values:

  Vl(i) = log el(xi) + maxk [ Vk(i−1) + log akl ]

Example

Let x be a sequence with a portion of ~1/6 6's, followed by a portion of ~1/2 6's:

  x = 123456123456…12345 6626364656…1626364656

Then it is not hard to show that the optimal parse is (exercise):

  FFF…………………F LLL………………………L

6 nucleotides "123456":
  parsed as F, contribute  0.95^6 × (1/6)^6            = 1.6 × 10^-5
  parsed as L, contribute  0.95^6 × (1/2)^1 × (1/10)^5 = 0.4 × 10^-5

"162636":
  parsed as F, contribute  0.95^6 × (1/6)^6            = 1.6 × 10^-5
  parsed as L, contribute  0.95^6 × (1/2)^3 × (1/10)^3 = 9.0 × 10^-5
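
As a concrete rendering of the recurrence and the log-space fix above, here is a minimal Python sketch of Viterbi (not from the lecture itself). It assumes the dictionary-style parameters `states`, `a0`, `a`, `e` from the earlier casino sketch, folds the imaginary position 0 into the first column via the start probabilities, and assumes all parameters are nonzero so that log() is defined.

```python
import math

def viterbi(x, states, a0, a, e):
    """Log-space Viterbi decoding.

    Returns (log P(x, pi*), pi*), where pi* is the most likely state path.
    Recurrence: V_l(i) = log e_l(x_i) + max_k [ V_k(i-1) + log a_kl ].
    """
    n = len(x)
    V = [dict() for _ in range(n)]    # V[i][k]: best log-probability of a path ending in state k at position i
    ptr = [dict() for _ in range(n)]  # ptr[i][k]: argmax predecessor state

    # Initialization: fold the start probabilities a_0k into the first column.
    for k in states:
        V[0][k] = math.log(a0[k]) + math.log(e[k][x[0]])

    # Iteration
    for i in range(1, n):
        for l in states:
            best_k = max(states, key=lambda k: V[i - 1][k] + math.log(a[k][l]))
            ptr[i][l] = best_k
            V[i][l] = math.log(e[l][x[i]]) + V[i - 1][best_k] + math.log(a[best_k][l])

    # Termination: best final state, then trace back through the pointers.
    last = max(states, key=lambda k: V[n - 1][k])
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(ptr[i][path[-1]])
    path.reverse()
    return V[n - 1][last], path

# Example (using the casino parameters sketched earlier):
# logp, path = viterbi("123456123456162636465616", states, a0, a, e)
```

On a sequence like the one in the example above, the returned path should switch from F to L roughly where the sixes become frequent, matching the F…F L…L parse.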

Problem 2: Evaluation

Find the likelihood that a sequence is generated by the model.

Generating a sequence by the model

Given an HMM, we can generate a sequence of length n as follows:
1. Start at state π1 according to probability a0π1
2. Emit letter x1 according to probability eπ1(x1)
3. Go to state π2 according to probability aπ1π2
4. … until emitting xn

[Figure: trellis of states 1…K over positions x1 x2 x3 … xn, with one generated path highlighted, e.g. the start transition a02 followed by the emission e2(x1).]

A couple of questions

Given a sequence x:
• What is the probability that x was generated by the model?
• Given a position i, what is the most likely state that emitted xi?

Example: the dishonest casino. Say

  x = 12341623162616364616234161221341

Most likely path: π = FF……F.
However, the marked letters (the 6's) are more likely to be L than the unmarked letters.

Evaluation

We will develop algorithms that allow us to compute:

  P(x)            Probability of x given the model
  P(xi…xj)        Probability of a substring of x given the model
  P(πi = k | x)   Probability that the i-th state is k, given x

The last is a more refined measure of which states x may be in.

The Forward Algorithm

We want to calculate P(x) = probability of x, given the HMM. Sum over all possible ways of generating x:

  P(x) = Σπ P(x, π) = Σπ P(x | π) P(π)

To avoid summing over an exponential number of paths π, define

  fk(i) = P(x1…xi, πi = k)    (the forward probability)

The Forward Algorithm – derivation

Define the forward probability:

  fl(i) = P(x1…xi, πi = l)
        = Σπ1…πi−1 P(x1…xi−1, π1, …, πi−1, πi = l) el(xi)
        = Σk Σπ1…πi−2 P(x1…xi−1, π1, …, πi−2, πi−1 = k) akl el(xi)
        = el(xi) Σk fk(i−1) akl

The Forward Algorithm

We can compute fk(i) …
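
The derivation above already pins down the dynamic program; below is a minimal Python sketch under the same assumptions as the Viterbi sketch (dictionary-style parameters, start probabilities folded into the first column, and, since no explicit end state appears in the visible text, termination taken as P(x) = Σk fk(N)).

```python
def forward(x, states, a0, a, e):
    """Forward algorithm.

    Returns (P(x), f) where f[i][k] = P(x1..x_{i+1}, pi_{i+1} = k).
    Recurrence: f_l(i) = e_l(x_i) * sum_k f_k(i-1) * a_kl.
    """
    n = len(x)
    f = [dict() for _ in range(n)]

    # Initialization: fold the start probabilities a_0k into the first column.
    for k in states:
        f[0][k] = a0[k] * e[k][x[0]]

    # Iteration
    for i in range(1, n):
        for l in states:
            f[i][l] = e[l][x[i]] * sum(f[i - 1][k] * a[k][l] for k in states)

    # Termination: with no modeled end state, P(x) = sum_k f_k(N)  (assumption).
    px = sum(f[n - 1][k] for k in states)
    return px, f

# Example (using the casino parameters sketched earlier):
# px, f = forward("123456123456", states, a0, a, e)
```

The probabilities here are kept in linear space for clarity; for long sequences one would rescale each column or work in log space, exactly as in the Viterbi underflow discussion above.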

