Computational Genomics and Molecular Biology, Fall 2006
HMM Lecture Notes, Tuesday, October 31
Rose Hoberman and Dannie Durand

1 Overview

In the past two weeks, we have discussed algorithms for recognizing instances of a pattern, given an HMM for the pattern of interest. We now return to the questions of modeling and discovery.

PSSMs can capture the statistical properties of a sequence family, but they have the following limitations:

1. They cannot capture positional dependencies.
2. They do not handle variable-length patterns well.
3. They cannot recognize indels in the query sequence.

In contrast, HMMs handle gaps and pairwise dependencies well.

2 Notation

1. $N$ states, $S_1, \ldots, S_N$
2. $M$ symbols in the alphabet, $\Sigma$
3. parameters, $\lambda$:
   (a) initial distribution of states, $\pi(i)$
   (b) transition probabilities, $a_{ij} = P(q_t = S_j \mid q_{t-1} = S_i)$. Note that $\sum_{j=1}^{N} a_{ij} = 1, \forall i$
   (c) emission probabilities, $e_i(a)$, the probability that state $i$ emits symbol $a$
4. sequence of symbols: $O = O_1, O_2, \ldots, O_T$
5. sequence of states: $Q = q_1, q_2, \ldots, q_T$

3 Baum-Welch

In general, if we have labeled data (that is, we know the state sequence), we can obtain the parameters using Maximum Likelihood Estimation. The Baum-Welch algorithm is used to estimate the model parameters when the state path is unknown. Given sequences $O^1, O^2, \ldots$, we wish to determine $\lambda = \{a_{ij}, e_i(\cdot), \pi_i\}$. We generally want to choose parameters that maximize the likelihood of our data.

However, finding a global maximum is intractable. We would have to enumerate all parameter sets, $\lambda_k$, and calculate

    $\mathrm{Score}(\lambda_k) = \sum_d P(O^d \mid \lambda_k) = \sum_d \sum_Q P(O^d, Q \mid \lambda_k)$

for each $\lambda_k$. Instead, people settle for heuristics that are guaranteed to find at least a local maximum. Since these are heuristics, evaluation is usually done empirically by withholding some of the training data for testing, but we will not discuss this further.

The algorithm (BW) used for selecting the parameter values belongs to a family of algorithms called Expectation Maximization (EM) algorithms. They all work by guessing initial parameter values, then estimating the likelihood of the data under the current parameters. These likelihoods can then be used to re-estimate the parameters, iterating until a local maximum is reached.

Setup: The alphabet and the number of states, $N$, are fixed. We are given the observed sequences (denoted $O^d = O^d_1, O^d_2, \ldots$).

The intuition behind the algorithm is as follows:

1. Choose some initial values for $\lambda = (\pi, a_{ij}, e_i(\cdot))$.
2. Determine "probable paths" $Q^d = q^d_1, q^d_2, \ldots$
3. Count the expected number of transitions, $A_{ij}$, from state $i$ to state $j$, given the current estimate of $\lambda$.
4. Count the expected number of times, $E_i(\sigma)$, that character $\sigma$ is emitted from state $i$.
5. Re-estimate $\lambda = (\pi, a_{ij}, e_i(\cdot))$ from $A_{ij}$ and $E_i(\sigma)$.
6. If not converged, go to step 2.

For a given sequence, $O^d$, the probability of transiting from state $i$ to state $j$ at time $t$ is

    $P(q^d_t = i, q^d_{t+1} = j \mid O^d, \lambda) = \frac{P(q^d_t = i, q^d_{t+1} = j, O^d)}{P(O^d)} = \frac{\alpha_t(i)\, a_{ij}\, e_j(O^d_{t+1})\, \beta_{t+1}(j)}{P(O^d)}$

The term $\alpha_t(i)$ is the probability that the model has emitted symbols $O^d_1 \ldots O^d_t$ and is in state $S_i$ at time $t$. This probability can be obtained using the Forward algorithm. Similarly, the Backward algorithm yields $\beta_{t+1}(j)$, the probability of emitting the rest of the sequence if we are in state $j$ at time $t + 1$. The remaining two terms, $a_{ij}$ and $e_j(O^d_{t+1})$, give the probability of making the transition from $i$ to $j$ and emitting the $(t+1)$st character.
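Baum-Welch treats Forward and Backward as subroutines. Since the notes do not give code for them, here is a minimal Python sketch for reference; it is an illustration under assumed conventions (states indexed 0..N-1, parameters as nested lists, unscaled probabilities), not the course's own implementation. A practical version would rescale or work in log space to avoid underflow on long sequences.

    # Illustrative sketch of the Forward and Backward algorithms.
    # obs is a sequence of symbol indices; pi[i] is the initial distribution;
    # a[i][j] = P(next state = j | current state = i)  (row-stochastic);
    # e[i][s] = P(state i emits symbol s).
    # NOTE: unscaled probabilities; real code rescales or uses log space.

    def forward(obs, pi, a, e):
        """alpha[t][i] = P(O_1..O_t, q_t = S_i); returns (alpha, P(O))."""
        N, T = len(pi), len(obs)
        alpha = [[0.0] * N for _ in range(T)]
        for i in range(N):
            alpha[0][i] = pi[i] * e[i][obs[0]]
        for t in range(1, T):
            for j in range(N):
                alpha[t][j] = e[j][obs[t]] * sum(
                    alpha[t - 1][i] * a[i][j] for i in range(N))
        return alpha, sum(alpha[T - 1])

    def backward(obs, a, e):
        """beta[t][i] = P(O_{t+1}..O_T | q_t = S_i)."""
        N, T = len(a), len(obs)
        beta = [[0.0] * N for _ in range(T)]
        for i in range(N):
            beta[T - 1][i] = 1.0
        for t in range(T - 2, -1, -1):
            for i in range(N):
                beta[t][i] = sum(a[i][j] * e[j][obs[t + 1]] * beta[t + 1][j]
                                 for j in range(N))
        return beta

Given alpha, beta, and P_O = P(O^d) from forward, the transition posterior above is alpha[t][i] * a[i][j] * e[j][obs[t+1]] * beta[t+1][j] / P_O; equations (1) and (2) below simply accumulate these posteriors over $t$ and over sequences.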
From this we can estimate

    $A_{ij} = \sum_d \frac{1}{P(O^d)} \sum_t \alpha(t, i)\, a_{ij}\, e_j(O^d_{t+1})\, \beta(t+1, j)$    (1)

The probability $P(O^d)$ can be computed under the current parameter values using the Forward algorithm. Similarly,

    $E_i(\sigma) = \sum_d \frac{1}{P(O^d)} \sum_{\{t \mid O^d_t = \sigma\}} \alpha(t, i)\, \beta(t, i).$    (2)

From $A_{ij}$ and $E_i(\sigma)$ we re-estimate the parameters.

Stated formally:

Algorithm: Baum-Welch
Input:
    A set of observed sequences, $O^1, O^2, \ldots$
Initialization:
    Select arbitrary model parameters, $\lambda' = (a'_{ij}, e'_i(\cdot))$.
    $score' = \sum_d P(O^d \mid \lambda')$.
Repeat {
    $\lambda = \lambda'$, $score = score'$
    For each sequence, $O^d$, {
        /* Calculate "probable paths" $Q^d = q^d_1, q^d_2, \ldots$ */
        Calculate $\alpha(t, i)$ for $O^d$ using the Forward algorithm.
        Calculate $\beta(t, i)$ for $O^d$ using the Backward algorithm.
        Calculate the contribution of $O^d$ to $A$ using (1).
        Calculate the contribution of $O^d$ to $E$ using (2).
    }
    $a'_{ij} = A_{ij} / \sum_l A_{il}$
    $e'_i(\sigma) = E_i(\sigma) / \sum_\tau E_i(\tau)$
    $score' = \sum_d P(O^d \mid a'_{ij}, e'_i(\cdot))$.
}
Until (the change in score is less than some predefined threshold.)

The estimation of "probable paths" $Q^d = q^d_1, q^d_2, \ldots$ in the inner loop is done efficiently using dynamic programming, via the Forward and Backward algorithms; the details were not covered in class. You are not responsible for the details of how the Forward and Backward algorithms are used in Baum-Welch, but you should understand the basic idea of the algorithm and are responsible for knowing when Baum-Welch should be applied.

Convergence: It can be proven that if the current estimate is replaced by these new estimates, then the likelihood of the data will not decrease (i.e., it will increase unless we are already at a local maximum or critical point). See Durbin, Section 11.6, for a discussion of avoiding local maxima and other typical pitfalls with this algorithm.

4 Topology

• Characteristics of a topology: the number of nodes, the alphabet, and which edges to include. We could just choose a fully connected graph, but this has too many parameters to estimate.

• Instead, we can exploit domain knowledge: choose a topology that limits the number of states and edges while still being expressive enough to represent the relationships we believe to exist.

• The choice of topology can impose a probability distribution on the length of the sequences that the HMM recognizes. For example, a simple self-loop with probability $p$ results in an exponentially decaying (geometric) distribution, $P(l \text{ residues}) = (1 - p)\, p^{l-1}$. There are topologies that assume other length distributions (see Durbin, Section 3.4, for more on this subject).

A basic topology:

Suppose we wish to construct an HMM for the WEIRD motif, based on the following alignment, which has no gaps and no positional dependencies:

WEIRD
WEIRD
WEIRE
WEIQH

We can recognize the WEIRD motif using an HMM with this topology:

[Figure: a linear chain of states, $M_0 \rightarrow M_1 \rightarrow M_2 \rightarrow M_3 \rightarrow M_4 \rightarrow M_5 \rightarrow M_6$]

where the transition probabilities are $a_{i,j} = 1$ if $j = i + 1$ and zero otherwise. The emission probabilities are $e_j(\alpha) = F[\alpha, j]$, where $F[\alpha, j]$ is the same frequency matrix that we derived for the PSSM example, using pseudocounts. The Start and End states ($M_0$ and $M_6$) are silent. The above model is our alternate hypothesis, $H_A$.
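To make the motif model concrete, here is a small Python sketch that builds the emission frequencies from the alignment. It is an illustration, not the notes' own code: the notes reuse the frequency matrix $F[\alpha, j]$ from the PSSM example, whose exact pseudocount scheme is not reproduced here, so this sketch assumes simple add-one (Laplace) pseudocounts over the 20 amino acids.

    # Sketch: emission probabilities for the WEIRD motif HMM (illustrative).
    # Assumption: add-one (Laplace) pseudocounts over the 20 amino acids; the
    # PSSM example in the notes may have used a different pseudocount scheme.
    # Transition probabilities are trivial here: a[i][i+1] = 1 along the chain.

    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
    ALIGNMENT = ["WEIRD", "WEIRD", "WEIRE", "WEIQH"]  # gapless alignment above

    def emission_matrix(alignment, pseudocount=1.0):
        """F[j][aa] = probability that match state M_{j+1} emits residue aa."""
        length = len(alignment[0])
        F = []
        for j in range(length):
            column = [seq[j] for seq in alignment]
            counts = {aa: pseudocount + column.count(aa) for aa in AMINO_ACIDS}
            total = sum(counts.values())
            F.append({aa: c / total for aa, c in counts.items()})
        return F

    F = emission_matrix(ALIGNMENT)
    # Example: in the last column (state M_5) the residue D occurs twice among
    # the four sequences, so F[4]['D'] = (1 + 2) / (20 + 4) = 0.125 under these
    # assumptions.

Scoring a query against $H_A$ then amounts to running the Forward algorithm sketched earlier with these emissions and the deterministic chain transitions.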

