CMU BSC 03711 - Problem


Fall 2011 Computational Genomics and Molecular Biology

Problem Set 4

Collaboration is allowed on this homework. You must hand in homeworks individually and list the names of the people you worked with. Homework must be submitted by 5pm in MI646 or electronically to [email protected], Friday, December 9th.

1. The Kimura 2-parameter model of sequence evolution distinguishes between transitions (purine-purine and pyrimidine-pyrimidine replacements) and transversions (purine-pyrimidine and pyrimidine-purine replacements). Under the Kimura model, the expected number of sites at which a substitution occurred can be estimated from the number of mismatching sites by

   $$d = -N \left( 0.5 \ln(1 - 2\hat{p}_1 - \hat{p}_2) + 0.25 \ln(1 - 2\hat{p}_2) \right),$$

   where $N$ is the length of the alignment and $\hat{p}_1$ and $\hat{p}_2$ are the number of transitions per site and the number of transversions per site, respectively.

   (a) Suppose you are given two sequences of length 200 that differ by 20 transitions and 4 transversions.
       i. What is the expected number of substitutions that occurred since these sequences diverged from their common ancestor?
       ii. What is the difference between the number of observed mismatches and the substitution distance estimated by the model?
       iii. Use the Jukes-Cantor model instead of the Kimura 2-parameter model to estimate the expected number of substitutions. What is the expected number of substitutions according to the JC model?
       iv. What is the difference between the number of observed mismatches and the substitution distance estimated by the model?

   (b) Suppose you are given two other sequences of length 200. These sequences differ by 50 transitions and 16 transversions.
       i. What is the expected number of substitutions that occurred since these sequences diverged from their common ancestor according to the K2P model?
       ii. What is the difference between the number of observed mismatches and the substitution distance estimated by the model?
       iii. What is the expected number of substitutions that occurred since these sequences diverged from their common ancestor according to the JC model?
       iv. What is the difference between the number of observed mismatches and the substitution distance estimated by the model?

   (c) Based on your results, does it matter which model you use? Is your answer the same for both of the above examples?

2. Consider the following matrix of observed distances between four taxa, A, B, C and D:

            B    C    D
       A    9   18   19
       B        19   20
       C              5

   (a) Does your matrix fit a tree? How do you know?
   (b) Are all sequences in this data set changing at the same rate? How do you know?
   (c) Which of the three unrooted topologies with four leaves is preferred by this distance matrix? (Hint: to find just the preferred topology, without inferring the branch lengths, you do not need to apply an algorithm.)

3. Under the maximum parsimony criterion, we say a column, or site, in a multiple sequence alignment is informative if it favors one tree topology over another. If the parsimony score at a given site in the alignment is the same for all topologies, then the site is uninformative.

   (a) For each site in the following alignment of sequences from four taxa,

            1 2 3 4 5 6 7 8 9
       X.   C C G T A G G A C
       Y.   A C C T G T G T C
       Z.   A G A T G T G C C
       W.   A G T T A G G C C

       state
       i. if it is an informative site
       ii. if so, which of the possible tree topologies for four taxa does it favor?
       iii. if not, what is the parsimony score for this site?
   (b) Show the most parsimonious tree(s).
   (c) What is the maximum parsimony score for this data set?

4. What is the parsimony score of the following tree? Mark mutations on branches and show the inferred set of bases at each internal node.

5. (a) How many rooted tree topologies are there for seven species, A, B, C, D, E, F and G?
   (b) Suppose you know that species A, B, C and D are grouped together in the left subtree and E, F and G are grouped together in the right subtree. Under this constraint, how many alternate rooted tree hypotheses are there for A, B, C, D, E, F and
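
The distance estimate in Problem 1 can be sketched in code. The following is a minimal Python sketch, not part of the original problem set: the function names and the input values in the usage lines are hypothetical, and the Jukes-Cantor correction uses the standard textbook form, which the problem set itself does not state.

```python
import math


def k2p_substitutions(n_sites, transitions, transversions):
    """Expected number of substitutions under the Kimura 2-parameter model.

    Implements d = -N (0.5 ln(1 - 2 p1 - p2) + 0.25 ln(1 - 2 p2)),
    where p1 and p2 are transitions and transversions per site.
    """
    p1 = transitions / n_sites
    p2 = transversions / n_sites
    a = 1 - 2 * p1 - p2
    b = 1 - 2 * p2
    if a <= 0 or b <= 0:
        # The logarithms are undefined: the sequences are too divergent.
        raise ValueError("sequences too divergent for the K2P correction")
    return -n_sites * (0.5 * math.log(a) + 0.25 * math.log(b))


def jc_substitutions(n_sites, mismatches):
    """Expected number of substitutions under the Jukes-Cantor model.

    Assumed standard form (not given in the problem set):
    d = -N * (3/4) * ln(1 - (4/3) p), with p the mismatches per site.
    """
    p = mismatches / n_sites
    arg = 1 - 4 * p / 3
    if arg <= 0:
        raise ValueError("sequences too divergent for the JC correction")
    return -n_sites * 0.75 * math.log(arg)


# Hypothetical inputs, chosen only for illustration.
n, ts, tv = 100, 10, 5
observed = ts + tv
k2p = k2p_substitutions(n, ts, tv)
jc = jc_substitutions(n, observed)
print(f"observed mismatches: {observed}")
print(f"K2P estimate: {k2p:.3f} (excess over observed: {k2p - observed:.3f})")
print(f"JC estimate:  {jc:.3f} (excess over observed: {jc - observed:.3f})")
```

Both functions return a count over the whole alignment rather than a per-site distance, matching the factor of $N$ in the formula above. Each raises an error when the argument of a logarithm is not positive, since the correction is undefined in that case.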

