Hidden Markov Models

Outline:
• Generating a sequence by the model
• Evaluation
• The Forward Algorithm
• Motivation for the Backward Algorithm
• The Backward Algorithm – derivation
• The Backward Algorithm
• Computational Complexity
• Posterior Decoding
• Viterbi, Forward, Backward
• Variants of HMMs
• Higher-order HMMs
• Similar Algorithms to 1st Order
• Modeling the Duration of States
• Solution 1: Chain several states
• Solution 2: Negative binomial distribution
• Example: genes in prokaryotes
• Solution 3: Duration modeling
• Viterbi with duration modeling
• Proteins, Pair HMMs, and Alignment
• A state model for alignment
• Let's score the transitions
• Alignment with affine gaps – state version

[Diagram: HMM trellis; a column of states 1…K at each position, with emitted symbols x_1, x_2, x_3, …]

Generating a sequence by the model

Given an HMM, we can generate a sequence of length n as follows:
1. Start at state π_1 according to probability a_{0,π1}
2. Emit letter x_1 according to probability e_{π1}(x_1)
3. Go to state π_2 according to probability a_{π1,π2}
4. … until emitting x_n

[Diagram: the trellis again, annotated with a_{0,2} and e_2(x_1) for a path that begins in state 2]

Evaluation

We will develop algorithms that allow us to compute:

  P(x)            the probability of x given the model
  P(x_i…x_j)      the probability of a substring of x given the model
  P(π_i = k | x)  the "posterior" probability that the i-th state is k, given x;
                  a more refined measure of which states x may be in

The Forward Algorithm

f_k(i) = P(x_1…x_i, π_i = k)   (the forward probability)

Initialization:
  f_0(0) = 1
  f_k(0) = 0, for all k > 0
Iteration:
  f_k(i) = e_k(x_i) Σ_l f_l(i-1) a_lk
Termination:
  P(x) = Σ_k f_k(N)

Motivation for the Backward Algorithm

We want to compute P(π_i = k | x), the probability distribution on the i-th position, given x.

We start by computing

  P(π_i = k, x) = P(x_1…x_i, π_i = k, x_{i+1}…x_N)
                = P(x_1…x_i, π_i = k) P(x_{i+1}…x_N | x_1…x_i, π_i = k)
                = P(x_1…x_i, π_i = k) P(x_{i+1}…x_N | π_i = k)

(the conditioning simplifies because, given π_i, the rest of the sequence is independent of the earlier observations). Then

  P(π_i = k | x) = P(π_i = k, x) / P(x)

The first factor is the forward probability, f_k(i); the second is the backward probability, b_k(i).

The Backward Algorithm – derivation

Define the backward probability:

  b_k(i) = P(x_{i+1}…x_N | π_i = k)   "starting from the i-th state = k, generate the rest of x"
         = Σ_{π_{i+1}…π_N} P(x_{i+1}, x_{i+2}, …, x_N, π_{i+1}, …, π_N | π_i = k)
         = Σ_l Σ_{π_{i+2}…π_N} P(x_{i+1}, x_{i+2}, …, x_N, π_{i+1} = l, π_{i+2}, …, π_N | π_i = k)
         = Σ_l e_l(x_{i+1}) a_kl Σ_{π_{i+2}…π_N} P(x_{i+2}, …, x_N, π_{i+2}, …, π_N | π_{i+1} = l)
         = Σ_l e_l(x_{i+1}) a_kl b_l(i+1)

The Backward Algorithm

We can compute b_k(i) for all k and i using dynamic programming.

Initialization:
  b_k(N) = 1, for all k
Iteration:
  b_k(i) = Σ_l e_l(x_{i+1}) a_kl b_l(i+1)
Termination:
  P(x) = Σ_l a_0l e_l(x_1) b_l(1)

Computational Complexity

What are the running time and space required for Forward and Backward?
  Time: O(K²N)
  Space: O(KN)

Useful implementation techniques to avoid underflows:
• Viterbi: sums of logs
• Forward/Backward: rescaling every few positions, multiplying by a constant

Posterior Decoding

We can now calculate

  P(π_i = k | x) = f_k(i) b_k(i) / P(x)

Then we can ask: what is the most likely state at position i of sequence x? Define π̂ by posterior decoding:

  π̂_i = argmax_k P(π_i = k | x)

Derivation:
  P(π_i = k | x) = P(π_i = k, x) / P(x)
                 = P(x_1, …, x_i, π_i = k, x_{i+1}, …, x_N) / P(x)
                 = P(x_1, …, x_i, π_i = k) P(x_{i+1}, …, x_N | π_i = k) / P(x)
                 = f_k(i) b_k(i) / P(x)

Posterior Decoding

• For each state, posterior decoding gives a curve of the likelihood of that state at every position. That is sometimes more informative than the Viterbi path π*.
• Posterior decoding may give an invalid sequence of states (of probability 0). Why? Because π̂_i is chosen independently at each position, adjacent states in π̂ may be linked by transitions of probability 0.

Posterior Decoding

• P(π_i = k | x) = Σ_π P(π | x) 1(π_i = k) = Σ_{π : π_i = k} P(π | x)

where 1(A) = 1 if A is true, and 0 otherwise.

[Figure: posterior probability curves P(π_i = l | x) for each state l, plotted along positions x_1 x_2 x_3 … x_N]
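The Forward, Backward, and posterior calculations above translate directly into code. The sketch below (Python/NumPy, a minimal illustration rather than a definitive implementation) assumes a0[k] = a_0k, A[k, l] = a_kl, E[k, s] = e_k(s), and x given as a sequence of symbol indices; the per-position rescaling constants c implement the underflow trick noted under Computational Complexity, and are an implementation choice, not notation from the slides.

```python
import numpy as np

def forward(x, a0, A, E):
    """Scaled forward matrix f, scaling constants c, and log P(x)."""
    N, K = len(x), A.shape[0]
    f, c = np.zeros((N, K)), np.zeros(N)
    # Initialization: f_k(1) = a_0k e_k(x_1), then rescale to sum to 1.
    f[0] = a0 * E[:, x[0]]
    c[0] = f[0].sum()
    f[0] /= c[0]
    # Iteration: f_k(i) = e_k(x_i) * sum_l f_l(i-1) a_lk, rescaled.
    for i in range(1, N):
        f[i] = E[:, x[i]] * (f[i - 1] @ A)
        c[i] = f[i].sum()
        f[i] /= c[i]
    # Termination: P(x) = prod_i c_i, so log P(x) = sum_i log c_i.
    return f, c, np.log(c).sum()

def backward(x, A, E, c):
    """Backward matrix b, rescaled with the same constants c."""
    N, K = len(x), A.shape[0]
    b = np.ones((N, K))                 # initialization: b_k(N) = 1
    # Iteration: b_k(i) = sum_l e_l(x_{i+1}) a_kl b_l(i+1), rescaled.
    for i in range(N - 2, -1, -1):
        b[i] = (A @ (E[:, x[i + 1]] * b[i + 1])) / c[i + 1]
    return b

def posterior_decode(x, a0, A, E):
    """pi_hat_i = argmax_k P(pi_i = k | x) = argmax_k f_k(i) b_k(i) / P(x)."""
    f, c, _ = forward(x, a0, A, E)
    b = backward(x, A, E, c)
    post = f * b      # with this rescaling, each row already sums to 1
    return post.argmax(axis=1), post
```

A convenient side effect of this particular rescaling scheme is that f * b yields P(π_i = k | x) directly, with no explicit division by P(x).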
Viterbi, Forward, Backward

VITERBI
Initialization:
  V_0(0) = 1
  V_k(0) = 0, for all k > 0
Iteration:
  V_l(i) = e_l(x_i) max_k V_k(i-1) a_kl
Termination:
  P(x, π*) = max_k V_k(N)

FORWARD
Initialization:
  f_0(0) = 1
  f_k(0) = 0, for all k > 0
Iteration:
  f_l(i) = e_l(x_i) Σ_k f_k(i-1) a_kl
Termination:
  P(x) = Σ_k f_k(N)

BACKWARD
Initialization:
  b_k(N) = 1, for all k
Iteration:
  b_l(i) = Σ_k e_k(x_{i+1}) a_lk b_k(i+1)
Termination:
  P(x) = Σ_k a_0k e_k(x_1) b_k(1)

Variants of HMMs

Higher-order HMMs

• How do we model "memory" larger than one time point?
  First order:  P(π_{i+1} = l | π_i = k) = a_kl
  Second order: P(π_{i+1} = l | π_i = k, π_{i-1} = j) = a_jkl
  …
• A second-order HMM with K states is equivalent to a first-order HMM with K² states.

[Diagram: a second-order HMM on states H and T, with transitions a_HT(prev = H), a_HT(prev = T), a_TH(prev = H), a_TH(prev = T), redrawn as a first-order HMM on the pair states HH, HT, TH, TT with transitions a_HHT, a_HTH, a_HTT, a_THH, a_THT, a_TTH]

Similar Algorithms to 1st Order

• For P(π_{i+1} = l | π_i = k, π_{i-1} = j), index the Viterbi matrix by the last two states:
  V_lk(i) = max_j { V_kj(i-1) + … }
  Time? Space?

Modeling the Duration of States

[Diagram: two states X and Y; X has self-transition probability p and moves to Y with probability 1-p, Y has self-transition probability q and moves to X with probability 1-q]

The length of region X follows a geometric distribution, with mean E[l_X] = 1/(1-p).

This is a significant disadvantage of HMMs. Several solutions exist for modeling different length distributions.

Example: exon lengths in genes

[Figure: the empirical distribution of exon lengths in genes]

Solution 1: Chain several states

[Diagram: X → X → X → Y; the final X has self-transition probability p, and Y has self-transition probability q]

Disadvantage: still very inflexible.
  l_X = C + geometric with mean 1/(1-p)

Solution 2: Negative binomial distribution

[Diagram: chain X(1) → X(2) → … → X(n) → Y; each X(i) has self-transition probability p and moves forward with probability 1-p]

Duration in X: m turns, where
• during the first m-1 turns, exactly n-1 arrows to the next state are followed, and
• during the m-th turn, an arrow to the next state is followed.

  P(l_X = m) = C(m-1, n-1) (1-p)^((n-1)+1) p^((m-1)-(n-1)) = C(m-1, n-1) (1-p)^n p^(m-n)

where C(m-1, n-1) is the binomial coefficient counting which of the first m-1 turns used the n-1 forward arrows.

Example: genes in prokaryotes

• EasyGene: a prokaryotic gene-finder (Larsen TS, Krogh A)
• Uses a negative binomial with n = 3

Solution 3: Duration modeling

Upon entering a state:
1. Choose duration d, according to a probability distribution
2. Generate d letters according to the emission probabilities
3. Take a transition to the next state according to the transition probabilities

[Diagram: a state F with duration distribution P_f, emitting x_i…x_{i+d-1} for a chosen duration d < D_f]

Disadvantage: an increase in the complexity of Viterbi:
  Time: O(D) increase
  Space: O(1) increase
where D = maximum duration of a state.

Warning: Rabiner's tutorial claims O(D²) and O(D) increases.

Viterbi with duration modeling

Recall the original iteration:
  V_l(i) = max_k V_k(i-1) a_kl e_l(x_i)

New iteration:
  V_l(i) = max_k max_{d=1…D_l} V_k(i-d) P_l(d) a_kl Π_{j=i-d+1…i} e_l(x_j)

[Diagram: states F and L with duration distributions P_f and P_l, showing a transition between them and a block of d emitted letters in each]

Precompute cumulative values (e.g., running sums of log emission probabilities), so that each emission product Π_j e_l(x_j) is available in constant time.

Proteins, Pair HMMs, and Alignment

A state model for alignment

  -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
  TAG-CTATCAC--GACCGC-GGTCGATTTGCCCGACC
  IMMJMMMMMMMJJMMMMMMJMMMMMMMIIMMMMMIII

  M: (+1, +1)    I: (+1, 0)    J: (0, +1)

Alignments correspond 1-to-1 with sequences of states M, I, J.
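To make the state model concrete, here is a minimal sketch (in Python) of the three-state alignment dynamic program that the following slides build toward ("Let's score the transitions", "Alignment with affine gaps – state version"). The match/mismatch scores and affine gap costs below are illustrative placeholders, not values from the deck, and direct I-to-J transitions are disallowed, a common convention for affine gaps. Reading the slide's example, x plays the role of the bottom sequence, so I advances in x only and J in y only, matching the labels M (+1, +1), I (+1, 0), J (0, +1); the traceback returns the state string, illustrating the 1-to-1 correspondence between alignments and state sequences.

```python
NEG = float("-inf")

def align(x, y, match=1.0, mismatch=-1.0, gap_open=-3.0, gap_extend=-1.0):
    n, m = len(x), len(y)
    # score[s][i][j]: best score of an alignment of x[:i] and y[:j] that
    # ends in state s; back[s][i][j]: previous state, for the traceback.
    score = {s: [[NEG] * (m + 1) for _ in range(n + 1)] for s in "MIJ"}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in "MIJ"}
    score["M"][0][0] = 0.0                       # empty alignment
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:                  # M consumes x[i-1] and y[j-1]
                sub = match if x[i - 1] == y[j - 1] else mismatch
                prev = max("MIJ", key=lambda s: score[s][i - 1][j - 1])
                score["M"][i][j] = sub + score[prev][i - 1][j - 1]
                back["M"][i][j] = prev
            if i > 0:                            # I consumes x[i-1] (gap in y)
                cand = {"M": score["M"][i - 1][j] + gap_open,
                        "I": score["I"][i - 1][j] + gap_extend}
                prev = max(cand, key=cand.get)
                score["I"][i][j], back["I"][i][j] = cand[prev], prev
            if j > 0:                            # J consumes y[j-1] (gap in x)
                cand = {"M": score["M"][i][j - 1] + gap_open,
                        "J": score["J"][i][j - 1] + gap_extend}
                prev = max(cand, key=cand.get)
                score["J"][i][j], back["J"][i][j] = cand[prev], prev
    # Traceback from the best final state; the path IS the alignment.
    state = max("MIJ", key=lambda s: score[s][n][m])
    best = score[state][n][m]
    i, j, states = n, m, []
    while i > 0 or j > 0:
        states.append(state)
        di, dj = {"M": (1, 1), "I": (1, 0), "J": (0, 1)}[state]
        state = back[state][i][j]
        i, j = i - di, j - dj
    return "".join(reversed(states)), best
```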
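Continuing the sketch, a quick run on the two sequences from the slide's example (gaps removed); because the scores above are placeholders, the recovered path need not reproduce the slide's path exactly:

```python
x = "TAGCTATCACGACCGCGGTCGATTTGCCCGACC"  # bottom sequence of the slide
y = "AGGCTATCACCTGACCTCCAGGCCGATGCCC"    # top sequence of the slide
states, best = align(x, y)
print(states)  # a string over M, I, J, analogous to "IMMJMM..." on the slide
print(best)    # total score under the placeholder parameters
```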