Computational Genomics and Molecular Biology, Fall 2009 1HMM Lecture Notes Tuesday, October 29thDannie Durand and Rose Hoberman1 Notation1. N states (S1..SN)2. M symbols in alphabet, Σ3. parameters, λ:1. initial distribution of states π(i)2. transition probabilities aij= P (qt= Si|qt−1= Sj). Note thatPNi=1aij= 1, ∀j3. emission probabilities ei(a) probability state i emits a4. Sequence of symbols: O = O1, O2, ..., OT5. Sequence of states: Q = q1, q2, ..., qT2 Using HMM’s for recogition2.1 Questions to ask with an HMM1. Given a sequence O, what is the true path? Otherwise stated, we wish to assign labels to anunlabeled sequence.Example: Identify the cytosolic, transmembrane, and extracellular regions in the sequence.In this case, we wish to assign the labels E, M, or C to the unlabeled data.2. What is the probability that a given sequence O, was generated by the HMM?Example: Is the sequence a transmembrane protein?3. What is the probability of being in state i when Otis emitted?Example: Is a given residue localized to the membrane?4. Use the HMM to simulate sequences.Example: Generate sequences with properties similar to real transmembrane sequences.In the previous lecture, we started discussing algorithms for answering these questions.1. We assume that most likely path, Q∗, is a good estimation of the true path, where Q∗isdefined byargmaxQP (Q|O) = argmaxQP (Q, O)P (O)= argmaxQP (Q, O).On Tuesday, we discussed how to find Q∗using the Viterbi algorithm. This pro cess is called“decoding” because we decode the seq uence of symbols to determine the hidden sequenceComputational Genomics and Molecular Biology, Fall 2009 2of states. In speech recognition, recorded speech is “decoded” into words or phonemes todetermine the meaning of the utterance.2. The total probability of O is the sum of the probability of O over all possible paths throughthe HMM:P (O) =XQP (O|Q)P (Q) =XQP (O, Q)To solve this we use the Forward algorithm, which interatively calculates the probabilityof being in state Siafter generating the sequence up to observation Ot. We designate thisquantity:αt(i) = P (O1, O2, O3, ...Ot, qt= Si)Algorithm: ForwardInitialization:α1(i) = πiei(O1)Iteration:αt(i) =NXj=1αt−1(j) ∗ aji∗ ei(Ot)The probability of observing the entire sequence is given by the sum over all possible finalstates:P (O) =NXi=1αT(i)3. We wish to determine P (qt= Si|Ot), the probability of being in state Siwhen Otis emitted.This is equivalent to the emitting O1. . . Ot−1over any path, entering Si, emitting Ot, andthen emitting Ot+1. . . OTover any path orP (qt= Si|Ot) = P (O1, O2, O3, ...Ot, qt= Si)P (Ot+1, Ot+2, ...OT|qt= Si)Note that the first term is just αt(i). We calculate the second term with the Backwardalgorithm, described in the next section. The Backward algorithm can also be used insteadof the Forward algorithm to calculate P (O) and it is used in the Baum-Welch algorithm forestimating the parameters of the model.Computational Genomics and Molecular Biology, Fall 2009 32.2 Backward algorithmAbove we calculated forward in time: α1(·), α2(, ·), ..., αT(, ·). It is also possible to calculate theprobability of emmitting O given a particular model by working backward from the end of thesequence. Let βt(i) be the probability of generating the last T − t + 1 observations given that Ot−1was emitted from state i:βt(i) = P (Ot, Ot+1, Ot+2, ...OT|qt−1= Si)Algorithm: BackwardInitialization:βT(i) =NXj=1aij∗ ej(OT)Iteration:βt(i) =NXj=1aij∗ ej(Ot) ∗ βt+1(j)To determine P (qt= Si|Ot), we work backwards from OTto Ot+1, to obtainP (qt= Si|Ot) = αt(i) · βt+1(i)To calculate the probability of the entire sequence, we start with T and go all the way to 2,calculating βt(i).P (O) =NXj=1πjej(O1)β2(j)Determining the most probable state for every symbol Otis an alternative to Viterbi decoding.This could give better results in some cases, such as when sub optimal paths are almost as probableas the most probable path.Either the Forward or the Backward algorithm can be used to determine the probability of asequence, O. Both are needed in order to learn parameters from unlabeled data using the BaumWelch
View Full Document