DOC PREVIEW
Stanford CS 262 - Sequence Alignment

This preview shows page 1-2-3-21-22-23-43-44-45 out of 45 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 45 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Sequence AlignmentSlide 2The Needleman-Wunsch AlgorithmThe Smith-Waterman algorithmScoring the gaps more accuratelyCompromise: affine gapsNeedleman-Wunsch with affine gapsSlide 8To generalize a little…Bounded Dynamic ProgrammingSlide 11Linear-Space AlignmentSubsequences and SubstringsHirschberg’s algortihmIntroduction: Compute optimal scoreLinear-space alignmentSlide 17Slide 18Slide 19Slide 20Slide 21Slide 22Heuristic Local AlignerersState of biological databasesSlide 25Some useful applications of alignmentsSlide 27Indexing-based local alignmentSlide 29Indexing-based local alignment—ExtensionsIndexing-based local alignment—ExtensionsSlide 32Sensitivity-Speed TradeoffSlide 34Measured improvementNon-consecutive words—PatternsAdvantage of PatternsMultiple patternsVariants of BLASTExampleSlide 41The Four-Russian Algorithm brief overview A (not so useful) speedup of Dynamic Programming [Arlazarov, Dinic, Kronrod, Faradzev 1970]Main ObservationThe Four-Russian AlgorithmSlide 45Sequence AlignmentCS262 Lecture 3, Win06, BatzoglouSequence Alignment-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGACDefinitionGiven two strings x = x1x2...xM, y = y1y2…yN,an alignment is an assignment of gaps to positions0,…, N in x, and 0,…, N in y, so as to line up each letter in one sequence with either a letter, or a gapin the other sequenceAGGCTATCACCTGACCTCCAGGCCGATGCCCTAGCTATCACGACCGCGGTCGATTTGCCCGACCS262 Lecture 3, Win06, BatzoglouThe Needleman-Wunsch Algorithm1. Initialization.F(0, 0) = F(0, j) = F(i, 0) = 02. Main Iteration. a. For each i = 1……M For each j = 1……N F(i-1,j-1) + s(xi, yj) [case 1] F(i, j) = max F(i-1, j) – d [case 2] F(i, j-1) – d [case 3]DIAG, if [case 1] Ptr(i,j) = LEFT, if [case 2]UP,if [case 3]3. Termination. F(M, N) is the optimal score, andfrom Ptr(M, N) can trace back optimal alignmentCS262 Lecture 3, Win06, BatzoglouThe Smith-Waterman algorithmIdea: Ignore badly aligning regionsModifications to Needleman-Wunsch:Initialization: F(0, j) = F(i, 0) = 00Iteration: F(i, j) = max F(i – 1, j) – dF(i, j – 1) – dF(i – 1, j – 1) + s(xi, yj)CS262 Lecture 3, Win06, BatzoglouScoring the gaps more accuratelySimple, linear gap model:Gap of length nincurs penalty ndHowever, gaps usually occur in bunchesConvex gap penalty function:(n):for all n, (n + 1) - (n)  (n) - (n – 1) Algorithm: O(N3) time, O(N2) space(n)(n)CS262 Lecture 3, Win06, BatzoglouCompromise: affine gaps(n) = d + (n – 1)e | | gap gap open extendTo compute optimal alignment,At position i, j, need to “remember” best score if gap is open best score if gap is not openF(i, j): score of alignment x1…xi to y1…yjifif xi aligns to yjG(i, j): score ifif xi aligns to a gap after yjH(i, j): score ifif yj aligns to a gap after xiV(i, j) = best score of alignment x1…xi to y1…yjde(n)CS262 Lecture 3, Win06, BatzoglouNeedleman-Wunsch with affine gapsWhy do we need matrices F, G, H?•xi aligns to yjx1……xi-1 xi xi+1y1……yj-1 yj -2. xi aligns to a gapx1……xi-1 xi xi+1y1……yj …- -Add -dAdd -eG(i+1, j) = V(i, j) – d G(i+1, j) = G(i, j) – e Because, perhapsG(i, j) < V(i, j)(it is best to align xi to yj if we were aligningonly x1…xi to y1…yj and not the rest of x, y),but on the contraryG(i, j) – e > V(i, j) – d (i.e., had we “fixed” our decision that xi alignsto yj, we could regret it at the next step whenaligning x1…xi+1 to y1…yj)CS262 Lecture 3, Win06, BatzoglouNeedleman-Wunsch with affine gapsInitialization: V(i, 0) = d + (i – 1)eV(0, j) = d + (j – 1)eIteration:V(i, j) = max{ F(i, j), G(i, j), H(i, j) }F(i, j) = V(i – 1, j – 1) + s(xi, yj)V(i – 1, j) – d G(i, j) = maxG(i – 1, j) – eV(i, j – 1) – d H(i, j) = maxH(i, j – 1) – eTermination: V(i, j) has the best alignmentTime?Space?CS262 Lecture 3, Win06, BatzoglouTo generalize a little…… think of how you would compute optimal alignment with this gap function….in time O(MN)(n)CS262 Lecture 3, Win06, BatzoglouBounded Dynamic ProgrammingAssume we know that x and y are very similarAssumption: # gaps(x, y) < k(N)xi Then, | implies | i – j | < k(N) yjWe can align x and y more efficiently:Time, Space: O(N  k(N)) << O(N2)CS262 Lecture 3, Win06, BatzoglouBounded Dynamic ProgrammingInitialization:F(i,0), F(0,j) undefined for i, j > kIteration:For i = 1…M For j = max(1, i – k)…min(N, i+k)F(i – 1, j – 1)+ s(xi, yj)F(i, j) = max F(i, j – 1) – d, if j > i – k(N)F(i – 1, j) – d, if j < i + k(N)Termination: sameEasy to extend to the affine gap casex1 ………………………… xMy1 ………………………… yNk(N)CS262 Lecture 3, Win06, BatzoglouLinear-Space AlignmentCS262 Lecture 3, Win06, BatzoglouSubsequences and SubstringsDefinition A string x’ is a substring of a string x,if x = ux’v for some prefix string u and suffix string v(similarly, x’ = xi…xj, for some 1  i  j  |x|) A string x’ is a subsequence of a string xif x’ can be obtained from x by deleting 0 or more letters(x’ = xi1…xik, for some 1  i1  …  ik  |x|)Note: a substring is always a subsequenceExample: x = abracadabray = cadabr; substringz = brcdbr; subseqence, not substringCS262 Lecture 3, Win06, BatzoglouHirschberg’s algortihmGiven a set of strings x, y,…, a common subsequence is a string u that is a subsequence of all strings x, y, …•Longest common subsequenceGiven strings x = x1 x2 … xM, y = y1 y2 … yN,Find longest common subsequence u = u1 … uk•Algorithm:F(i – 1, j)•F(i, j) = max F(i, j – 1)F(i – 1, j – 1) + [1, if xi = yj; 0 otherwise]•Ptr(i, j) = (same as in N-W)•Termination: trace back from Ptr(M, N), and prepend a letter to u whenever •Ptr(i, j) = DIAG and F(i – 1, j – 1) < F(i, j)•Hirschberg’s algorithm solves this in linear spaceCS262 Lecture 3, Win06, BatzoglouF(i,j)Introduction: Compute optimal scoreIt is easy to compute F(M, N) in linear spaceAllocate ( column[1] )Allocate ( column[2] )For i = 1….MIf i > 1, then:Free( column[i – 2] )Allocate( column[ i ] )For j = 1…NF(i, j) = …CS262 Lecture 3, Win06, BatzoglouLinear-space alignmentTo compute both the optimal score and the optimal alignment:Divide & Conquer approach:Notation:xr, yr: reverse of x, yE.g. x = accgg;xr = ggccaFr(i, j): optimal score of aligning xr1…xri


View Full Document

Stanford CS 262 - Sequence Alignment

Documents in this Course
Lecture 8

Lecture 8

38 pages

Lecture 7

Lecture 7

27 pages

Lecture 4

Lecture 4

12 pages

Lecture 1

Lecture 1

11 pages

Biology

Biology

54 pages

Lecture 7

Lecture 7

45 pages

Load more
Download Sequence Alignment
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Sequence Alignment and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Sequence Alignment 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?