DOC PREVIEW
CMU BSC 03711 - Homework

This preview shows page 1-2-3 out of 9 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

03-511/711 Computational Genomics and Molecular Biology, Fall 2010 1Problem Set 2b Due October 13th, 20111. Define an HMM H with three states {A, B, C} and alphabet {0, 1, 2} and the following tran-sition and emission probabilities:A B C 0 1 2A 0.2 0.8 0.0 0.8 0.2 0.0B 0.0 0.8 0.2 0.0 0.6 0.4C 0.4 0.0 0.6 0.2 0.0 0.8(a) Draw the state diagram of this HMM and show the transition probabilities.(b) Assuming that initial state is A, give all of the possible state paths for the sequenceO = 0, 1, 2, 0.03-511/711 Computational Genomics and Molecular Biology, Fall 2010 2(c) For each possible state path, Qi, calculate the probability of emitting O from the statesin that path. What is the total probability of the sequence, P (O)?(d) What is the most probable path, Q∗? What is the probability of emitting O via themost probable path?(e) The Forward algorithm calculates P (O), the probability that the model will emit a givensequence, O, over all possible paths. One might consider approximating this probabilityby calculating P (O |Q∗) using the Viterbi algorithm. For this particular HMM, wouldP (O|Q∗) be a goo d approximation of P (O)? Explain your reasoning.03-511/711 Computational Genomics and Molecular Biology, Fall 2010 32. Consider the following HMM that emits sequences, drawn from a letter alphabet, Σ = {H, L},that participate in coiled coil structures. All paths begin in the initial state, Si, andend in the termination state, St. Note that thes e are not silent states. The transitionand emission probabilities for this HMM are:Transition probabilities Emission probabilitiesSiA B C D E F G StH LSi0.5 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0.5A 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.9 0.1B 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.2 0.8C 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.2 0.8D 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.9 0.1E 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.2 0.8F 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.2 0.8G 0.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0.2 0.8St0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.5 0.5Coiled coils are protein secondary structures formed by two intertwined alpha helices. Se-quences that participate in coiled coils have a characteristic seven-fold repeat. The sevenpositions in this patter are typically denoted by the letters A-G. The residues in the A andD p os itions are at the interface of the two helices and, hence, are hydrophobic. The residuesin the B, C, E, F and G positions are exposed and typically hydrophilic.(a) Draw the state diagram for this HMM. Label the transitions with their probabilities.03-511/711 Computational Genomics and Molecular Biology, Fall 2010 4(b) What is the length of the shortest sequence that this HMM can generate? What is thethe length of the longest longest sequence that this HMM can recognize?(c) Coiled coils sequences typically have two or more consecutive copies of the seven-foldrepeat. How does the probability of coiled coil sequences generated by this HMM varywith copy numb e r? Are sequences with two copies of the s even-fold repeat more likelythan sequences with just one copy?(d) Coiled coils sometimes contain an “offset,” i.e., insertions of a few amino acids betweencopies of the seven-fold repeat. Can this HMM emit coiled coil sequences with an offset?Explain your answer. If not, how would you modify the HMM to emit sequences withan offset between the copies?03-511/711 Computational Genomics and Molecular Biology, Fall 2010 53. There are three stop codons: TAG, TAA, and TGA. In the extremophile bacterium Deinococ-cus radiodurans, the relative frequency of these stop codons is 5%, 12%, and 83%. Constructan HMM that emits these stop codons, and only these codons, using the smallest number ofstates possible. (Hint: the m inimum number of states re quired is less than nine.)Give the topology of model and the initial, transition, and emission probabilities. Your modelshould emit the three alternate stop codons in the correct frequencies. There is more thanone solution; just give one.03-511/711 Computational Genomics and Molecular Biology, Fall 2010 64. Consider the following three-state HMM for transmembrane proteins:The parameters for this model arei E M Cπi0 0 1ei(H) 0.2 0.9 0.3ei(L) 0.8 0.1 0.7(a) Complete the partial calculation on the next page by by filling in the blank boxes in thematrix shown.(b) What algorithm is being used to generate this matrix?(c) What is, Q∗(O), the most probable path given O = LHHL?(d) Give the most probable state for each time step, t = 0, 1, 2, 3, 4(e) For how many time steps is the state on the most probable path also the most probablestate. For which times, if any, are the most probable state and the s tate on the mostprobable path not the same?π( C) e( C) π( M) e( M) π( E) e( E)L t=0 1 0.7 0.7000 0 0.1 0.0000 0 0.8 0.0000j δ(t,j) a(j,C) e( C) δ(t,j) a(j,M) e( M) δ(t,j) a(j,E) e( E)H t=1 C 0.7000 0.85 0.3 0.1785 0.7000 0.15 0.9 0.0945 0.7000 0 0.2 0.0000M 0 0.2 0.3 0.0000 0 0.6 0.9 0.0000 0 0.2 0.2 0.0000E 0 0 0.3 0.0000 0 0.25 0.9 0.0000 0 0.75 0.2 0.00000.1785 0.0945 0.0000H t=2 C 0.1785 0.85 0.3 0.0455 0.1785 0.15 0.9 0.0241 0.1785 0 0.2 0.0000M 0.0945 0.2 0.3 0.0057 0.0945 0.6 0.9 0.0510 0.0945 0.2 0.2 0.0038E 0.0000 0 0.3 0.0000 0.0000 0.25 0.9 0.0000 0.0000 0.75 0.2 0.00000.0455 0.0510 0.0038L t=3 C 0.85 0.7 0.15 0.1 0 0.8M 0.2 0.7 0.6 0.1 0.2 0.8E 0 0.7 0.25 0.1 0.75 0.8Final probability:CM E03-511/711 Computational Genomics and Molecular Biology, Fall 2010 75. Consider the same TM-HMM as in the previous question:with model parameters:i E M Cπi0 0 1ei(H) 0.2 0.9 0.3ei(L) 0.8 0.1 0.7(a) Calculate P (O|TM-HMM), the total probability of the sequence O = LHHL with re-spect to the transmembrane model. Use the worksheet on the next page. What isP (LHHL|TM-HMM)?(b) What algorithm did you use to generate this matrix?π( C) e( C) π( M) e( M) π( E) e( E)L t=0j α(t-1,j) a(j,C) e( C) α(t-1,j) a(j,M) e( M) α(t-1,j) a(j,E) e( E)H t=1 CMEα(t,j) α(t,j) α(t,j)H t=2 CMEL t=3 CMEFinal probability:CM


View Full Document

CMU BSC 03711 - Homework

Documents in this Course
lecture

lecture

8 pages

Lecture

Lecture

3 pages

Homework

Homework

10 pages

Lecture

Lecture

17 pages

Delsuc05

Delsuc05

15 pages

hmwk1

hmwk1

2 pages

lecture

lecture

6 pages

Lecture

Lecture

10 pages

barnacle4

barnacle4

15 pages

review

review

10 pages

Homework

Homework

10 pages

Midterm

Midterm

12 pages

lecture

lecture

11 pages

lecture

lecture

32 pages

Lecture

Lecture

7 pages

Lecture

Lecture

17 pages

Lecture

Lecture

12 pages

Lecture

Lecture

21 pages

Lecture

Lecture

11 pages

Lecture

Lecture

28 pages

Homework

Homework

13 pages

Logistics

Logistics

11 pages

lecture

lecture

11 pages

Lecture

Lecture

8 pages

Lecture

Lecture

9 pages

lecture

lecture

8 pages

Problem

Problem

6 pages

Homework

Homework

10 pages

Lecture

Lecture

9 pages

Problem

Problem

7 pages

hmwk4

hmwk4

7 pages

Problem

Problem

6 pages

lecture

lecture

16 pages

Problem

Problem

8 pages

Problem

Problem

6 pages

Problem

Problem

13 pages

lecture

lecture

9 pages

Problem

Problem

11 pages

Notes

Notes

7 pages

Lecture

Lecture

7 pages

Lecture

Lecture

10 pages

Lecture

Lecture

9 pages

Homework

Homework

15 pages

Lecture

Lecture

16 pages

Problem

Problem

15 pages

Load more
Download Homework
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Homework and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Homework 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?