DOC PREVIEW
CMU BSC 03711 - Homework

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

03-511/711 Computational Genomics and Molecular Biology, Fall 2005 1Problem Set 1 Due Thursday, September 27thCollaboration is allowed on this homework. You must hand in homework assignments individ-ually and list the names of the people you worked with.You may not use a program to do this homework. Turn in your handwritten answers on theattached sheets. Extra alignment templates will be available on the website.1. Pairwise alignment:(a) Compute the local alignment of “TANGENT with “ENTANGLEMENT”, using the fol-lowing scoring system: matches = 1, mismatches = -2, indels = -2. Show your alignmentmatrix with scores and traceback on the attached alignment template.(b) What is the score of the optimal local alignment? How many different optimal alignmentsare there? Give the alignment(s) here.(c) In the space below, show all suboptimal alignments that have a score one less than theoptimal score. How many are there? Do any of them occur more than once in thealignment matrix?03-511/711 Computational Genomics and Molecular Biology, Fall 2005 22. Pairwise alignment:(a) Compute the local alignment of “TANGENT with “ENTANGLEMENT”, this time usingthe following scoring system: matches = 2, mismatches = -1, indels = -1. Show youralignment matrix with scores and traceback on the attached alignment template.(b) What is the score of the optimal local alignment? How many different optimal alignmentsare there? Show them.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 33. Semiglobal alignment:(a) Compute the semiglobal alignment of “TANGENT with “ENTANGLEMENT”, usingthe following scoring system: matches = 3, mismatches = -2, indels = -2, with nopenalty for aligning indels with the leading characters of “ENTANGLEMENT”. (Thatis, indels can be inserted at the beginning of “TANGENT” with no cost.) Assume apenalty for gaps at the end of both sequences.Show your alignment matrix with scores and traceback on the template provided. Toreduce the amount of work you have to do, low scoring areas of the matrix have beenblacked out on the template. You do not need to fill in those cells.(b) What is the score of the optimal semiglobal alignment? How many different optimalalignments are there? Show them.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 44. Comparing semiglobal and local alignments(a) Did you get the same optimal local alignments in Problems 1 and 2? Why or why not?(b) Compare the local alignments from Problems 1 and 2 with the semiglobal alignmentyou obtained in Problem 3. Under what conditions, will local alignment give the sameresults as semi-global alignment for this problem?(c) Although it sometimes possible to obtain the same results with local and global align-ment, they are fundamentally different methods and will not in general give the sameresult. Why?03-511/711 Computational Genomics and Molecular Biology, Fall 2005 55. Scoring functions:(a) What four properties must a scoring function have to be appropriate for local alignment?Consider the following scoring functions, where M designates a match, m designates a mis-match and g designates an indel. For each of the scoring functions, state whether it is suitablefor pairwise local alignment. If it is not suitable, why not?(b) M = 2, m = 2, g = 2.(c) M = 2, m = −2, g = −2.(d) M = 3, m = 0, g = 0.(e) M = 5, m = −5, g = −2.(f) M = −1, m = −2, g = −2.(g) M = 0, m = 1, g = 1.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 66. You are given two sequences U = u1, u2, . . . , umand T = t1, t2, . . . , tn. For each of thefollowing problems, state(i) which algorithm you would use to solve it: global alignment, local alignment or semi-global alignment.(ii) initialization: show the values in the first row and first column in the dynamic program-ming matrix, a[ ].(iii) give the recurrence relation required to fill in the dynamic programming matrix a[i, j], interms of a gap cost, g, and a scoring matrix, S[i, j], which gives the score of matches andmismatches. State the constraints, if any, that must be imposed on the scoring function,S[ ].(iv) an expression in terms of a[ ], m and n for determining the score of the optimal alignment.(a) As a preprocessing step for sequence assembly, it is necessary to identify pairs of DNAsequence fragments that overlap. Let T and U be two sequence fragments. We wish todetermine whether the end of T overlaps with the beginning of U .i. Algorithm:ii. Initialization:t1t2t3. . .u1u2u3. . .iii. The recurrence relation; constraints on S[ ], if any.iv. The score of the optimal alignment.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 7(b) Given the operations insertion, deletion and substitution, the problem is to determinethe minimum number of operations required to transform T into U .i. Algorithm:ii. Initialization:t1t2t3. . .u1u2u3. . .iii. The recurrence relation; constraints on S[ ], if any.iv. The score of the optimal alignment.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 8(c) T is a cDNA sequence and U is a genomic sequence, both taken from the same bacterialgenome. We wish to find the location of the gene that encoded the cDNA.i. Algorithm:ii. Initialization:t1t2t3. . .u1u2u3. . .iii. The recurrence relation; constraints on S[ ], if any.iv. The score of the optimal alignment.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 9(d) T is a cDNA sequence and U is a genomic sequence, both taken from a higher eukaryote.We wish to find the exon boundaries in the ge nomic DNA.i. Algorithm:ii. Initialization:t1t2t3. . .u1u2u3. . .iii. The recurrence relation; constraints on S[ ], if any.iv. The Score of the optimal alignment.03-511/711 Computational Genomics and Molecular Biology, Fall 2005 107. Given the distance function, M = 0, m = 1, g = 1, what is the relationship (<, ≤, =,etc.) between the quantities X and Y in each of the following cases? Give a one sentenceexplanation of your answer.(a) X = the sum-of-pairs (SP) score of a multiple sequence alignment.Y = the tree alignment score of the same multiple sequence alignment.(b) X = the SP score of a multiple sequence alignment obtained by dynamic programming.Y = the SP score of a multiple alignment of the same sequences obtained by progressivealignment.(c) X = the SP score of the optimal multiple alignment of sequences S1. . . Sk.Y =Pki=1Pkj>iD(Si, Sj), where D(Si, Sj) is the optimal pairwise score between seq-uences


View Full Document

CMU BSC 03711 - Homework

Documents in this Course
lecture

lecture

8 pages

Lecture

Lecture

3 pages

Lecture

Lecture

17 pages

Delsuc05

Delsuc05

15 pages

hmwk1

hmwk1

2 pages

lecture

lecture

6 pages

Lecture

Lecture

10 pages

barnacle4

barnacle4

15 pages

review

review

10 pages

Homework

Homework

10 pages

Midterm

Midterm

12 pages

lecture

lecture

11 pages

lecture

lecture

32 pages

Lecture

Lecture

7 pages

Lecture

Lecture

17 pages

Lecture

Lecture

12 pages

Lecture

Lecture

21 pages

Lecture

Lecture

11 pages

Lecture

Lecture

28 pages

Homework

Homework

13 pages

Logistics

Logistics

11 pages

lecture

lecture

11 pages

Lecture

Lecture

8 pages

Lecture

Lecture

9 pages

lecture

lecture

8 pages

Problem

Problem

6 pages

Homework

Homework

10 pages

Lecture

Lecture

9 pages

Problem

Problem

7 pages

hmwk4

hmwk4

7 pages

Problem

Problem

6 pages

lecture

lecture

16 pages

Problem

Problem

8 pages

Problem

Problem

6 pages

Problem

Problem

13 pages

lecture

lecture

9 pages

Problem

Problem

11 pages

Notes

Notes

7 pages

Lecture

Lecture

7 pages

Lecture

Lecture

10 pages

Lecture

Lecture

9 pages

Homework

Homework

15 pages

Lecture

Lecture

16 pages

Problem

Problem

15 pages

Load more
Download Homework
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Homework and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Homework 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?