DOC PREVIEW
U of I CS 498 - Dynamic Programming

This preview shows page 1-2-3-20-21-40-41-42 out of 42 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 42 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Dynamic Programming(cont’d)CS 498 SSSaurabh SinhaSpliced Alignment• Begins by selecting either all putative exonsbetween potential acceptor and donor sites or byfinding all substrings similar to the target protein(as in the Exon Chaining Problem).• This set is further filtered in a such a way thatattempt to retain all true exons, with some falseones.• Then find the chain of exons such that thesequence similarity to the target proteinsequence is maximizedSplicedAlignmentProblem:Formulation• Input: Genomic sequences G, targetsequence T, and a set of candidateexons (blocks) B.• Output: A chain of exons Γ such thatthe global alignment score between Γ*and T is maximized Γ* - concatenation of all exons from chain ΓThe DAG• Vertices: One vertex for each block in B• Directed edge connecting non-overlapping blocks• Label of vertex = string of block it represents• A path through the DAG spells out the stringobtained by concatenating that particular chain ofblocks• Weight of a path is the score of the optimalalignment between the string it spells out and thetarget sequenceDynamic programming• Genomic sequence G = g1g2…gn• Target sequence T = t1t2…tm• As usual, we want to find the optimalalignment score of the i-prefix of G andthe j-prefix of T• Problem is, there are many i-prefixespossible (since multiple blocks mayinclude position i)Idea• Find the optimal alignment score of thei-prefix of G and the j-prefix of Tassuming that this alignment uses aparticular block B at position i• S(i, j, B)• For every block B that includes iRecurrence If i is not the starting vertex of block B:• S(i, j, B) =max { S(i – 1, j, B) – indel penalty S(i, j – 1, B) – indel penalty S(i – 1, j – 1, B) + δ(gi, tj) } If i is the starting vertex of block B:• S(i, j, B) = max { S(i, j – 1, B) – indel penaltymaxall blocks B’ preceding block B S(end(B’), j, B’) – indel penaltymaxall blocks B’ preceding block B S(end(B’), j – 1, B’) + δ(gi, tj)}RNA secondary structurepredictionRNA• RNA is similar to DNA chemically. It is usually only asingle strand. T(hyamine) is replaced by U(racil)• Some forms of RNA can form secondary structures by“pairing up” with itself. This can have change its properties dramatically.DNA and RNAcan pair witheach other.http://www.cgl.ucsf.edu/home/glasfeld/tutorial/trna/trna.giftRNA linear and 3D view:RNA• There’s more to RNA than mRNA• RNA can adopt interesting non-linearstructures, and catalyze reactions• tRNAs (transfer RNAs) are the“adapters” that implement translationSecondary structure• Several interesting RNAs have a conservedsecondary structure (resulting from base-pairing interactions)• Sometimes, the sequence itself may not beconserved for the function to be retained• It is important to tell what the secondarystructure is going to be, for homologydetectionConserved secondarystructure N-Y A A N-N’ N-N’R N-N’ N-N’ N-N’ N-N’ N-N’/ N Consensus binding site for R17 phage coatprotein. N = A/C/G/U,N’ is a complementarybase pairing to N,Y is C/U, R is A/GSource: DEKMBasics of secondary structure• G-C pairing: three bonds (strong)• A-U pairing: two bonds (weaker)• Base pairs are approximately coplanarBasics of secondary structureBasics of secondary structure• G-C pairing: three bonds (strong)• A-U pairing: two bonds (weaker)• Base pairs are approximately coplanar• Base pairs are stacked onto other basepairs (arranged side by side): “stems”Secondary structure elementsLoop: single stranded subsequences bounded by base pairsloop at the endof a stemstem loopsingle strandedbases withina stem… only on oneside of stem… on bothsides of stemNon-canonical base pairs• G-C and A-U are the canonical base pairs• G-U is also possible, almost as stableNesting• Base pairs almost always occur in anested fashion• If positions i and j are paired, andpositions i’ and j’ are paired, then thesetwo base-pairings are said to be nested if:• i < i’ < j’ < j OR i’ < i < j < j’• Non-nested base pairing: pseudoknotPseudoknot2119 18(9, 18)(2, 11)NOT NESTEDPseudoknot problems• Pseudoknots are not handled by thealgorithms we shall see• Pseudoknots do occur in manyimportant RNAs• But the total number of pseudoknottedbase pairs is typically relatively smallSecondary structure predictionApproach 1.• Find the secondary structure with mostbase pairs.• Nussinov’s algorithm• Recursive: finds best structure for smallsubsequences, and works its wayoutwards to larger subsequencesNussinov’s algorithm: idea• There are only four possible ways ofgetting the best structure forsubsequence (i,j) from the beststructures of the smaller subsequences(1) Add unpaired position i onto beststructure for subsequence (i+1,j)ii+1jNussinov’s algorithm: idea• There are only four possible ways ofgetting the best structure forsubsequence (i,j) from the beststructures of the smaller subsequences(2) Add unpaired position j onto beststructure for subsequence (i,j-1)jj-1iNussinov’s algorithm: idea• There are only four possible ways ofgetting the best structure forsubsequence (i,j) from the beststructures of the smaller subsequences(3) Add (i,j) pair onto best structure forsubsequence (i+1,j-1)ji+1 j-1iNussinov’s algorithm: idea• There are only four possible ways ofgetting the best structure forsubsequence (i,j) from the beststructures of the smaller subsequences(4)Combine two optimal substructures (i,k)and (k+1,j)ik k+1 jNussinov RNA foldingalgorithm• Given a sequence s of length L with symbolss1 … sL. Let δ(i,j) = 1 if si and sj are acomplementary base pair, and 0 otherwise.• We recursively calculate scores g(i,j) whichare the maximal number of base pairs thatcan be formed for subsequence si…sj.• Dynamic programmingRecursion• Starting with all subsequences of length2, to length L• g(i,j) = max of• g(i+1, j)• g(i,j-1)• g(i+1,j-1) + δ(i,j)• maxi < k < j [g(i,k) + g(k+1,j)]• Initialization• g(i,i-1) = 0• g(i,i) = 0O(n2) ? No. O(n3)Traceback• As usual in sequence alignment ?• Optimal sequence alignment is a linearpath in the dynamic programming table• Optimal secondary structure can have“bifurcations”• Traceback uses a pushdown stackTracebackPush (1,L) onto stackRepeat until stack is empty:pop (i,j)if i >= j continueelse if g(i+1,j) = g(i,j) push (i+1,j)else if g(i,j-1) = g(i,j) push (i,j-1)else if


View Full Document

U of I CS 498 - Dynamic Programming

Documents in this Course
Lecture 5

Lecture 5

13 pages

LECTURE

LECTURE

39 pages

Assurance

Assurance

44 pages

LECTURE

LECTURE

36 pages

Pthreads

Pthreads

29 pages

Load more
Download Dynamic Programming
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Dynamic Programming and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Dynamic Programming 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?