UCSD CSE 182 - Lecture - D607069

Home> Schools> University of California, San Diego> Computer Science & Engineering (CSE) > CSE 182> Lecture

DOC PREVIEW

UCSD CSE 182 - Lecture

School name University of California, San Diego

Course Cse 182- Biological Databases

Pages 21

This preview shows page 1-2-20-21 out of 21 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 21 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 21 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 21 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 21 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 21 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Fa08! CSE 182!Back to Smith Waterman i!i-1!j!j-1!Fa08! CSE 182!An Example • Align s=TCAT with t=TGCAA • Match Score = 1 • Mismatch score = -1, Indel Score = -1 • C(‘T’,’T’) = ? ; C(‘T’,’A’)=? ; C(‘-’,’A’) =? • Global Score A1?, Score A2? T C A T -!T G C A A!T C A T!T G C A A!A1 A2Fa08! CSE 182!T G C A A!T!C!A!T! 1! 1! -1! -2! -2! -4! 1! 2! 0! -1! -1! -3! -1! 0! 1! 0! 0! -2! -3! -2! -1! 0! 1! -1! -5! -4! -3! -2! -1! 0!Example: Global alignment € S[i, j] = maxS[i −1, j −1] + C(si,tj)S[i −1, j] + C(si,−)S[i, j −1] + C(−,tj)      Fa08! CSE 182!Local Alignment • The original recurrence still works, except when the optimum score S[i,j] is negative • When S[i,j] <0, it means that the optimum local alignment cannot include the point (i,j). • So, we must reset the score to 0. si!tj!i!i-1!j!j-1!Fa08! CSE 182!Local Alignment Trick (Smith-Waterman algorithm) How can we compute the local alignment itself?!€ S[i, j] = max0S[i −1, j −1] + C(si,tj)S[i −1, j] + C(si,−)S[i, j −1] + C(−,tj)    Fa08! CSE 182!T G C A A!T!C!A!T! 1! 1! 0! 0! 0! 0! 1! 2! 0! 0! 0! 0! 0! 0! 1! 0! 0! 0! 0! 0! 0! 0! 0! 0! 0! 0! 0! 0! 1! 0!Ex: Local Alignment € S[i, j] = max0S[i −1, j −1] + C(si,tj)S[i −1, j] + C(si,−)S[i, j −1] + C(−,tj)    Fa08! CSE 182!Gaps appear together!Fa08! CSE 182!Generalizing Gap Cost • It is more likely for gaps to be contiguous • The penalty for a gap of length l should be € go + ge ∗ lFa08! CSE 182!Using affine gap penalties € S[i, j] = maxl0S[i −1, j −1] + C(si,tj)S[i − l, j] + go + ge * lS[i, j − l] + go + ge * l     • What is the time taken for this? • What are the values that l can take? • Can we get rid of the extra Dimension?Fa08! CSE 182!Affine gap penalties • Define D[i,j] : Score of the best alignment, given that the final column is a ‘deletion’ (si is aligned to a gap) • Define I[i,j]: Score of the best alignment, given that the final column is an insertion (tj is aligned to a gap) € S[i, j] = max0S[i −1, j −1] + C(si,tj)D[i, j]I[i, j]     s[i]!Optimum alignment of s[1..i-1], and t[1..j]! -!Optimum alignment of s[1..i], and t[1..j-1]!t[j]!Fa08! CSE 182!O(nm) solution for affine gap costs € D[i, j] = maxD[i −1, j] + geS[i −1, j] + go + ge   € I[i, j] = maxI[i, j −1] + geS[i, j −1] + go + ge  Fa08! CSE 182!Alignment Space? • How much space do we need? • Is the space requirement too much? € S[i, j] = maxS[i −1, j −1] + C(si,tj)S[i −1, j] + C(si,−)S[i, j −1] + C(−,tj)    Fa07! CSE 182!Copyright ©2004 by the National Academy of Sciences Istrail, Sorin et al. (2004) Proc. Natl. Acad. Sci. USA 101, 1916-1921 Fig. 1. Dot-plot representation of sample assembly comparison results Ex: Genomic assembly comparisons (6Mb)Fa08! CSE 182!Alignment (Linear Space) • Score computation € i2= i%2; i1= (i −1)%2;S[i2, j] = maxS[i1, j −1] + C(si,tj)S[i1, j] + C(si,−)S[i2, j −1] + C(−,tj)            For i = 1 to n! For j = 1 to m !€ S[i, j] = maxS[i −1, j −1] + C(si,tj)S[i −1, j] + C(si,−)S[i, j −1] + C(−,tj)    Fa08! CSE 182!Linear Space Alignment • In Linear Space, we can do each row of the D.P. • We need to compute the optimum path from the origin (0,0) to (m,n)Fa07! CSE 182!Linear Space (cont’d) • At i=n/2, we know scores of all the optimal paths ending at that row. • Define F[j] = S[n/2,j] • One of these j is on the true path. Which one?Fa08! CSE 182!Backward alignment • Let Sb[i,j] be the optimal score of aligning s[i+1..n] with t[j+1..m] € Sb[i, j] = maxSb[i + 1, j + 1] + C(si,tj)Sb[i + 1, j] + C(si,−)Sb[i, j + 1] + C(−,tj)     • Boundary cases? • Sb[n,j]? Sb[m,j]?Fa08! CSE 182!Backward alignment • Let Sb[i,j] be the optimal score of aligning s[i+1..n] with t[j+1..m] • Define B[j] = Sb[n/2,j] • One of these j is on the true path. Which one?Fa08! CSE 182!Forward, Backward computation • At the optimal coordinate, j – F[j]+B[j]=S[n,m] • In O(nm) time, and O(m) space, we can compute one of the coordinates on the optimum path.Fa08! CSE 182!Linear Space Alignment • Align(1..n,1..m) – For all 1<=j <= m • Compute F[j]=S(n/2,j) – For all 1<=j <= m • Compute B[j]=Sb(n/2,j) – j* = maxj {F[j]+B[j] } – X = Align(1..n/2,1..j*) – Y = Align(n/2+1..n,j*+1..m) – Return X,j*,YFa08! CSE 182!Linear Space complexity • T(nm) = c.nm + T(nm/2) = O(nm) • Space =

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-20-21 out of 21 pages.

UCSD CSE 182 - Lecture

Sign up for free to view:

Please select your school