CMU BSC 03711 - Homework

Fall 2011 Computational Genomics and Molecular Biology

Problem Set 3a

Collaboration is allowed on this homework. You must hand in homeworks individually and list the names of the people you worked with. Homework must be submitted by 5pm in MI646 or electronically to [email protected]
Tuesday, Nov 22nd at 5pm

1. Suppose you use a profile HMM to obtain a global alignment of these five amino acid sequences:

   SSTG, CSVSL, SNRQT, SCLGN, ASSTH

   You construct an HMM and infer the parameters of the model from the data. You then use the model to label the data and obtain an alignment. The paths of the five sequences through the profile HMM are

   m0 m1 m2 m3 m4 d5 m6
   m0 i0 m1 d2 m3 m4 m5 m6
   m0 m1 d2 m3 m4 m5 i5 m6
   m0 m1 m2 m3 m4 m5 m6
   m0 i0 m1 d2 m3 m4 m5 m6

   States m0 and m6 are the start and end states, respectively, and do not emit characters.

   (a) How many match states (not including m0 and m6) should the profile HMM have for these sequences? Explain your reasoning.
   (b) What algorithm was used to learn the weights of the profile HMM from the input sequences?
   (c) What algorithm was used to determine the path through the HMM for each protein?
   (d) Give the alignment of the sequences that these paths determine.
   (e) Based on your alignment, would you perform “model surgery” on this profile HMM? If so, which states would you add and/or delete and why?

2. (a) Verify that the rows of the PAM 1 transition matrix sum to one.
   (b) Verify that $\sum_i p_i \, P_1[i,i] = 0.99$.

3. In this problem, you will construct a BLOSUM60 substitution matrix from the following aligned block:

   1: DSDQQD
   2: DSSQQD
   3: SSQQDD
   4: DDQQDD

   (a) Determine the percent identity between all possible pairs of sequences.
   (b) Cluster the sequences such that each sequence in the cluster is at least 60% identical to some other sequence in the cluster.
   (c) Calculate the observed frequencies ($a_{xy}$) for the clustered sequences, using the BLOSUM method for adjusting for cluster size.
   (d) Calculate the expected frequencies ($a_{xy}$) for the clustered sequences, using the BLOSUM method for adjusting for cluster size.
   (e) Use these frequencies to obtain the log odds matrix, as defined in the BLOSUM framework.

4. Substitution matrices:
   (a) Both the PAM and the BLOSUM substitution matrix families are parametrized by evolutionary divergence. Which represents a greater degree of divergence, BLOSUM80 or BLOSUM62? Why?
   (b) Which represents a greater degree of divergence, BLOSUM62 or PAM40? Why?
   (c) What is the interpretation of a positive value in $S_x[i,j]$, the PAM x log odds scoring matrix for a given pair of amino acids i, j?
   (d) What is the interpretation of a negative value in $S_x[i,j]$?

5. Substitution matrices and evolutionary divergence
   (a) Consider the PAM30 and PAM250 matrices (shown on the web site). What is the average value on the diagonal of the PAM 30 matrix (i.e., the average of $S_{30}[i,i]$ over all values of i)?
   (b) What is the average value on the diagonal of the PAM 250 matrix?
   (c) Which average diagonal value is larger? How would you explain this in terms of the evolutionary divergence associated with each of the matrices?
   (d) Which specific diagonal values are larger in PAM250 than in PAM30? That is, for which amino acids, i, is $S_{250}[i,i] > S_{30}[i,i]$? What does that suggest about the functional or structural properties of i?

6. Serine and threonine (S and T) are small, hydrophilic amino acids; asparagine, aspartic acid, glutamic acid, and glutamine (N, D, E, and Q) are large, hydrophilic amino acids; and methionine, isoleucine, leucine and valine (M, I, L, and V) are small, hydrophobic amino acids.
   (a) Based on the entries in the PAM 250 matrix, which of the following substitutions are you more likely to observe in highly diverged sequences? Show the evidence on which you base your answer.
       i. The replacement of a small, hydrophilic amino acid with a small, hydrophobic amino acid.
       ii. The replacement of a small, hydrophilic amino acid with a large, hydrophilic amino acid.
   (b) Which property do you think is more important to protein structure: size or
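
As an illustration of the pairwise comparison asked for in Problem 3(a), the following is a minimal Python sketch. It assumes percent identity means the number of identical columns divided by the block width for an ungapped block; the problem set does not state the definition, so treat that as an assumption. The names `block`, `percent_identity`, and `pairwise` are hypothetical and chosen only for this example.

```python
from itertools import combinations

# Aligned block from Problem 3, keyed by sequence number.
block = {1: "DSDQQD", 2: "DSSQQD", 3: "SSQQDD", 4: "DDQQDD"}


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns in which the two sequences agree.

    Assumes an ungapped alignment, so both strings must have equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Percent identity for every unordered pair of sequences in the block.
pairwise = {
    (i, j): percent_identity(block[i], block[j])
    for i, j in combinations(sorted(block), 2)
}

for (i, j), pid in pairwise.items():
    print(f"{i} vs {j}: {pid:.1f}%")
```

The resulting `pairwise` dictionary could then be compared against the 60% threshold when forming clusters in part (b), although the clustering step itself is not shown here.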

