Stanford CS 262 - Hidden Markov Models

Contents
• Hidden Markov Models
• Generating a sequence by the model
• Evaluation
• The Forward Algorithm
• Motivation for the Backward Algorithm
• The Backward Algorithm – derivation
• The Backward Algorithm
• Computational Complexity
• Posterior Decoding
• Viterbi, Forward, Backward
• Variants of HMMs
• Higher-order HMMs
• Similar Algorithms to 1st Order
• Modeling the Duration of States
• Solution 1: Chain several states
• Solution 2: Negative binomial distribution
• Example: genes in prokaryotes
• Solution 3: Duration modeling
• Viterbi with duration modeling
• Proteins, Pair HMMs, and Alignment
• A state model for alignment
• Let's score the transitions
• Alignment with affine gaps – state version

Hidden Markov Models
[Figure: HMM trellis with hidden states 1…K at every position, emitting observations x_1, x_2, x_3, …]

Generating a sequence by the model
Given an HMM, we can generate a sequence of length n as follows:
1. Start at state π_1 according to prob a_{0,π_1}
2. Emit letter x_1 according to prob e_{π_1}(x_1)
3. Go to state π_2 according to prob a_{π_1,π_2}
4. … until emitting x_n
[Figure: trellis showing one generated path, with the start transition a_02 and the emission e_2(x_1) highlighted.]

Evaluation
We will develop algorithms that allow us to compute:
• P(x) – the probability of x given the model
• P(x_i…x_j) – the probability of a substring of x given the model
• P(π_i = k | x) – the "posterior" probability that the i-th state is k, given x; a more refined measure of which states x may be in

The Forward Algorithm
f_k(i) = P(x_1…x_i, π_i = k)   (the forward probability)
Initialization:
  f_0(0) = 1
  f_k(0) = 0, for all k > 0
Iteration:
  f_k(i) = e_k(x_i) Σ_l f_l(i-1) a_lk
Termination:
  P(x) = Σ_k f_k(N)

Motivation for the Backward Algorithm
We want to compute P(π_i = k | x), the probability distribution of the i-th state, given x.
We start by computing
  P(π_i = k, x) = P(x_1…x_i, π_i = k, x_{i+1}…x_N)
                = P(x_1…x_i, π_i = k) P(x_{i+1}…x_N | x_1…x_i, π_i = k)
                = P(x_1…x_i, π_i = k) P(x_{i+1}…x_N | π_i = k)
The first factor is the forward probability f_k(i); the second is the backward probability b_k(i).
Then, P(π_i = k | x) = P(π_i = k, x) / P(x).

The Backward Algorithm – derivation
Define the backward probability ("starting from the i-th state = k, generate the rest of x"):
  b_k(i) = P(x_{i+1}…x_N | π_i = k)
         = Σ_{π_{i+1}…π_N} P(x_{i+1}, x_{i+2}, …, x_N, π_{i+1}, …, π_N | π_i = k)
         = Σ_l Σ_{π_{i+2}…π_N} P(x_{i+1}, x_{i+2}, …, x_N, π_{i+1} = l, π_{i+2}, …, π_N | π_i = k)
         = Σ_l e_l(x_{i+1}) a_kl Σ_{π_{i+2}…π_N} P(x_{i+2}, …, x_N, π_{i+2}, …, π_N | π_{i+1} = l)
         = Σ_l e_l(x_{i+1}) a_kl b_l(i+1)

The Backward Algorithm
We can compute b_k(i) for all k and i using dynamic programming.
Initialization:
  b_k(N) = 1, for all k
Iteration:
  b_k(i) = Σ_l e_l(x_{i+1}) a_kl b_l(i+1)
Termination:
  P(x) = Σ_l a_0l e_l(x_1) b_l(1)

Computational Complexity
What are the running time and space required for Forward and Backward?
  Time:  O(K²N)
  Space: O(KN)
Useful implementation techniques to avoid underflow:
• Viterbi: sums of logs
• Forward/Backward: rescaling every few positions by multiplying by a constant

Posterior Decoding
We can now calculate
  P(π_i = k | x) = f_k(i) b_k(i) / P(x)
since
  P(π_i = k | x) = P(π_i = k, x) / P(x)
                 = P(x_1, …, x_i, π_i = k, x_{i+1}, …, x_N) / P(x)
                 = P(x_1, …, x_i, π_i = k) P(x_{i+1}, …, x_N | π_i = k) / P(x)
                 = f_k(i) b_k(i) / P(x)
Then we can ask: what is the most likely state at position i of sequence x?
Define π̂ by posterior decoding:  π̂_i = argmax_k P(π_i = k | x)

Posterior Decoding (continued)
• For each state k, posterior decoding gives a curve of the likelihood of that state at each position of x. This is sometimes more informative than the Viterbi path π*.
• Posterior decoding may, however, give an invalid sequence of states (one of probability 0). Why?
• Formally,
  P(π_i = k | x) = Σ_π P(π | x) 1(π_i = k) = Σ_{π : π_i = k} P(π | x)
  where 1(ψ) = 1 if ψ is true, and 0 otherwise.
[Figure: plot of P(π_i = l | x) for a state l across positions x_1 x_2 x_3 … x_N.]
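The recurrences above translate almost line for line into code. The following is a minimal Python/NumPy sketch (not from the slides) of Forward, Backward, and posterior decoding for a discrete-emission HMM; it uses per-position rescaling, one common way to realize the rescaling trick mentioned under Computational Complexity. The array layout (0-based positions, start[k] standing in for a_0k, trans[k, l] for a_kl, emit[k, s] for e_k(s)), the function name, and the two-state demo model are illustrative assumptions.

```python
import numpy as np

def forward_backward(x, start, trans, emit):
    """Scaled Forward/Backward for a discrete-emission HMM.

    x     : observation sequence as a list of symbol indices (0-based)
    start : start[k]   = a_0k, probability of beginning in state k
    trans : trans[k,l] = a_kl, transition probability k -> l
    emit  : emit[k,s]  = e_k(s), probability that state k emits symbol s

    Returns (posterior, log_px), where posterior[i, k] = P(pi_i = k | x)
    and log_px = log P(x).
    """
    N, K = len(x), trans.shape[0]
    f = np.zeros((N, K))      # rescaled forward values
    b = np.zeros((N, K))      # rescaled backward values
    scale = np.zeros(N)       # c_i, the per-position scaling constants

    # Forward: f_k(1) = a_0k e_k(x_1); f_k(i) = e_k(x_i) sum_l f_l(i-1) a_lk
    f[0] = start * emit[:, x[0]]
    scale[0] = f[0].sum()
    f[0] /= scale[0]
    for i in range(1, N):
        f[i] = emit[:, x[i]] * (f[i - 1] @ trans)
        scale[i] = f[i].sum()
        f[i] /= scale[i]

    # Backward: b_k(N) = 1; b_k(i) = sum_l e_l(x_{i+1}) a_kl b_l(i+1),
    # rescaled with the same constants so that f * b stays consistent.
    b[N - 1] = 1.0
    for i in range(N - 2, -1, -1):
        b[i] = trans @ (emit[:, x[i + 1]] * b[i + 1])
        b[i] /= scale[i + 1]

    # With this scaling, f[i, k] * b[i, k] = P(pi_i = k | x) directly,
    # and log P(x) is the sum of the log scaling constants.
    return f * b, np.log(scale).sum()

if __name__ == "__main__":
    # Hypothetical 2-state, 2-symbol model (not from the slides).
    start = np.array([0.5, 0.5])
    trans = np.array([[0.9, 0.1],
                      [0.1, 0.9]])
    emit = np.array([[0.5, 0.5],
                     [0.2, 0.8]])
    x = [1, 1, 0, 1, 1, 1, 0, 0]
    post, log_px = forward_backward(x, start, trans, emit)
    print("log P(x) =", log_px)
    print("posterior decoding path:", post.argmax(axis=1))
```

The printed posterior decoding path is argmax_k P(π_i = k | x) at each position, which, as noted above, need not be a valid path through the model.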
Viterbi, Forward, Backward

VITERBI
  Initialization:  V_0(0) = 1;  V_k(0) = 0, for all k > 0
  Iteration:       V_l(i) = e_l(x_i) max_k V_k(i-1) a_kl
  Termination:     P(x, π*) = max_k V_k(N)

FORWARD
  Initialization:  f_0(0) = 1;  f_k(0) = 0, for all k > 0
  Iteration:       f_l(i) = e_l(x_i) Σ_k f_k(i-1) a_kl
  Termination:     P(x) = Σ_k f_k(N)

BACKWARD
  Initialization:  b_k(N) = 1, for all k
  Iteration:       b_l(i) = Σ_k e_k(x_{i+1}) a_lk b_k(i+1)
  Termination:     P(x) = Σ_k a_0k e_k(x_1) b_k(1)

Variants of HMMs

Higher-order HMMs
• How do we model "memory" longer than one time point?
• First order:  P(π_{i+1} = l | π_i = k)              → a_kl
• Second order: P(π_{i+1} = l | π_i = k, π_{i-1} = j) → a_jkl
• …
• A second-order HMM with K states is equivalent to a first-order HMM with K² states.
[Figure: a second-order model over states H and T, with transitions a_HT(prev = H), a_HT(prev = T), a_TH(prev = H), a_TH(prev = T), expanded into an equivalent first-order model with states HH, HT, TH, TT and transitions a_HHT, a_HTT, a_HTH, a_THH, a_THT, a_TTH.]

Similar Algorithms to 1st Order
For P(π_{i+1} = l | π_i = k, π_{i-1} = j), the Viterbi recursion runs over pairs of states:
  V_lk(i) = max_j { V_kj(i-1) + … }
Time? Space?

Modeling the Duration of States
[Figure: two states X and Y; X has self-transition probability p and moves to Y with probability 1-p; Y has self-transition probability q and moves to X with probability 1-q.]
Length distribution of region X:
  E[l_X] = 1/(1-p)
• Geometric distribution, with mean 1/(1-p)
This is a significant disadvantage of HMMs.
Several solutions exist for modeling different length distributions.
Example: exon lengths in genes. [Figure.]

Solution 1: Chain several states
[Figure: several copies of X chained in a row before the transition to Y, with the last copy keeping the self-loop probability p.]
Disadvantage: still very inflexible
  l_X = C + geometric with mean 1/(1-p)

Solution 2: Negative binomial distribution
[Figure: a chain X(1) → X(2) → … → X(n), each copy with self-loop probability p and probability 1-p of advancing; X(n) exits to Y with probability 1-p.]
Duration in X: m turns, where
• during the first m-1 turns, exactly n-1 arrows to the next state are followed, and
• during the m-th turn, an arrow to the next state is followed.
  P(l_X = m) = C(m-1, n-1) (1-p)^{(n-1)+1} p^{(m-1)-(n-1)} = C(m-1, n-1) (1-p)^n p^{m-n}

Example: genes in prokaryotes
• EasyGene: a prokaryotic gene finder (Larsen TS, Krogh A)
• Uses a negative binomial with n = 3

Solution 3: Duration modeling
Upon entering a state:
1. Choose a duration d according to a probability distribution
2. Generate d letters according to the emission probabilities
3. Take a transition to the next state according to the transition probabilities
[Figure: state F with duration distribution P_F, emitting x_i…x_{i+d-1} for a chosen duration d < D_F.]
Disadvantage: increase in the complexity of Viterbi:
  Time: O(D)   Space: O(1)
where D = maximum duration of a state.
(Warning: Rabiner's tutorial claims O(D²) and O(D) increases.)

Viterbi with duration modeling
Recall the original iteration:
  V_l(i) = max_k V_k(i-1) a_kl e_l(x_i)
New iteration:
  V_l(i) = max_k max_{d=1…D_l} V_k(i-d) · P_l(d) · a_kl · Π_{j=i-d+1…i} e_l(x_j)
[Figure: states F and L with duration distributions P_F and P_L, each emitting a run of d < D letters (x_i…x_{i+d-1} and x_j…x_{j+d-1}); transitions and emissions labeled.]
Precompute cumulative values (of the emission products).

Proteins, Pair HMMs, and Alignment

A state model for alignment
  -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
  TAG-CTATCAC--GACCGC-GGTCGATTTGCCCGACC
  IMMJMMMMMMMJJMMMMMMJMMMMMMMIIMMMMMIIIM
States: M (+1, +1), I (+1, 0), J (0, +1)
Alignments correspond 1-to-1 with sequences of states.
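As a small, concrete illustration of that last point, the sketch below (not from the slides) converts a gapped alignment into an M/I/J state string and back. It assumes the reading that I (+1, 0) advances the first sequence x only and J (0, +1) advances the second sequence y only; the function names and the toy alignment in the demo are hypothetical.

```python
def alignment_to_states(row_x: str, row_y: str) -> str:
    """Map one aligned pair of rows to its M/I/J state string.

    Assumed convention: M advances both sequences, I (+1, 0) advances
    x only (gap in y), J (0, +1) advances y only (gap in x).
    """
    if len(row_x) != len(row_y):
        raise ValueError("aligned rows must have equal length")
    states = []
    for cx, cy in zip(row_x, row_y):
        if cx == "-" and cy == "-":
            raise ValueError("a column cannot have gaps in both rows")
        if cx != "-" and cy != "-":
            states.append("M")
        elif cy == "-":
            states.append("I")   # x advances, y has a gap
        else:
            states.append("J")   # y advances, x has a gap
    return "".join(states)


def states_to_alignment(x: str, y: str, states: str) -> tuple:
    """Rebuild the gapped rows from the ungapped sequences and the states."""
    i = j = 0
    row_x, row_y = [], []
    for s in states:
        if s == "M":
            row_x.append(x[i]); row_y.append(y[j]); i += 1; j += 1
        elif s == "I":
            row_x.append(x[i]); row_y.append("-"); i += 1
        else:  # "J"
            row_x.append("-"); row_y.append(y[j]); j += 1
    return "".join(row_x), "".join(row_y)


if __name__ == "__main__":
    # Toy alignment (illustrative only):
    rx, ry = "AC-GT", "A-TGT"
    s = alignment_to_states(rx, ry)
    print(s)                                        # MIJMM
    print(states_to_alignment("ACGT", "ATGT", s))   # ('AC-GT', 'A-TGT')
```

Mapping in one direction and then the other recovers the original rows, which is the sense in which alignments and state sequences are in one-to-one correspondence.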

