DOC PREVIEW
U of I CS 498 - Dynamic Programming

This preview shows page 1-2-3-4-25-26-27-52-53-54-55 out of 55 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 55 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Dynamic Programming:Sequence alignmentCS 498 SSSaurabh SinhaDNA Sequence Comparison: FirstSuccess Story• Finding sequence similarities with genes ofknown function is a common approach toinfer a newly sequenced gene’s function• In 1984 Russell Doolittle and colleaguesfound similarities between cancer-causinggene and normal growth factor (PDGF) gene• A normal growth gene switched on at thewrong time causes cancer !Cystic Fibrosis• Cystic fibrosis (CF) is a chronic and frequently fatal geneticdisease of the body's mucus glands. CF primarily affects therespiratory systems in children.• If a high % of cystic fibrosis (CF) patients have a certainmutation in the gene and the normal patients don’t, then thatcould be an indicator of a mutation that is related to CF• A certain mutation was found in 70% of CF patients,convincing evidence that it is a predominant geneticdiagnostics marker for CFCystic Fibrosis and CFTR Gene :Bring in the Bioinformaticians• Gene similarities between two genes with knownand unknown function alert biologists to somepossibilities• Computing a similarity score between two genestells how likely it is that they have similarfunctions• Dynamic programming is a technique forrevealing similarities between genesMotivating DynamicProgrammingDynamic programming example:Manhattan Tourist ProblemImagine seeking apath (from sourceto sink) to travel(only eastward andsouthward) with themost number ofattractions (*) in theManhattan gridSink******** ***Source*Imagine seeking apath (from sourceto sink) to travel(only eastward andsouthward) with themost number ofattractions (*) in theManhattan gridSink******** ***Source*Dynamic programming example:Manhattan Tourist ProblemManhattan Tourist Problem: FormulationGoal: Find the longest path in a weightedgrid.Input: A weighted grid G with two distinctvertices, one labeled “source” and the otherlabeled “sink”Output: A longest path in G from “source” to“sink”MTP: An Example3 2 40 7 33 301 3 24456465582250 1 2 30123j coordinatei coordinate13sourcesink43 2 4 01 024 331122241995152302034MTP: Greedy Algorithm Is Not Optimal1 2 5 2152 340 0 05303501035512promising start,but leads tobad choices!sourcesink1822MTP: Simple RecursiveProgramMT(n,m) if n=0 or m=0 return MT(n,m) x  MT(n-1,m)+ length of the edge from (n- 1,m) to (n,m) y  MT(n,m-1)+ length of the edge from (n,m-1) to (n,m) return max{x,y}What’s wrong with this approach?Here’s what’s wrong• M(n,m) needs M(n, m-1) and M(n-1, m)• Both of these need M(n-1, m-1)• So M(n-1, m-1) will be computed atleast twice• Dynamic programming: the same ideaas this recursive algorithm, but keep allintermediate results in a table and reuse150 101isource15S1,0 = 5S0,1 = 1• Calculate optimal path score for each vertex in the graph• Each vertex’s score is the maximum of the prior verticesscore plus the weight of the respective edge in betweenMTP: Dynamic ProgrammingjMTP: Dynamic Programming(cont’d)1 2530 1 2012source1 3584S2,0 = 8iS1,1 = 4S0,2 = 33-5jMTP: Dynamic Programming(cont’d)1 2530 1 2 30123isource1 358840581035-59131-5S3,0 = 8S2,1 = 9S1,2 = 13S3,0 = 8jMTP: Dynamic Programming(cont’d)greedy alg. fails!1 2 5-5 1 -5-53053035010-3-50 1 2 30123isource1 3 85884913 8912S3,1 = 9S2,2 = 12S1,3 = 8jMTP: Dynamic Programming(cont’d)1 2 5-5 1 -5-53 30 053035010-3-5-520 1 2 30123isource1 3 85884913 8129159jS3,2 = 9S2,3 = 15MTP: Dynamic Programming(cont’d)1 2 5-5 1 -5-53 30 053035010-3-5-520 1 2 30123isource1 3 85884913 8129159j0116S3,3 = 16(showing all back-traces)Done!MTP: RecurrenceComputing the score for a point (i,j) by therecurrence relation:si, j =maxsi-1, j + weight of the edge between (i-1, j) and (i, j)si, j-1 + weight of the edge between (i, j-1) and (i, j)The running time is n x m for a n by m grid(n = # of rows, m = # of columns)Manhattan Is Not A Perfect GridWhat about diagonals?• The score at point B is given by:sB =maxofsA1 + weight of the edge (A1, B)sA2 + weight of the edge (A2, B)sA3 + weight of the edge (A3, B)BA3A1A2Manhattan Is Not A Perfect Grid (cont’d)Computing the score for point x is given by therecurrence relation:sx =maxofsy + weight of vertex (y, x) where y є Predecessors(x)• Predecessors (x) – set of vertices that have edgesleading to x•The running time for a graph G(V, E)(V is the set of all vertices and E is the set of all edges)is O(E) since each edge is evaluated onceTraveling in the Grid•By the time the vertex x is analyzed, the valuessy for all its predecessors y should be computed– otherwise we are in trouble.•We need to traverse the vertices in some order•For a grid, can traverse vertices row by row,column by column, or diagonal by diagonal•If we had, instead of a (directed) grid, anarbitrary DAG (directed acyclic graph) ?Traversing the Manhattan Grid• 3 different strategies:– a) Column by column– b) Row by row– c) Along diagonalsa) b)c)Topological Ordering• A numbering of vertices of the graph iscalled topological ordering of the DAG ifevery edge of the DAG connects a vertexwith a smaller label to a vertex with alarger label• In other words, if vertices are positioned ona line in an increasing order of labels thenall edges go from left to right.• How to obtain a topological sort (???)Longest Path in DAG Problem• Goal: Find a longest path between twovertices in a weighted DAG• Input: A weighted DAG G with source andsink vertices• Output: A longest path in G from source tosinkLongest Path in DAG: Dynamic Programming• Suppose vertex v has indegree 3 andpredecessors {u1, u2, u3}• Longest path to v from source is:In General:sv = maxu (su + weight of edge from u to v)sv =maxofsu1 + weightofedgefrom u1 tovsu2 + weightofedgefrom u2 tovsu3 + weightofedgefrom u3 tovAlignmentAligning DNA SequencesV = ATCTGATGW = TGCATACn = 8m = 7CATACGTGTAGTCTAV W matchdeletioninsertionmismatchindels4122 matches mismatches insertions deletionsAlignment : 2 x k matrix ( k ≥ m, n )Longest Common Subsequence (LCS) –Alignment without Mismatches• Given two sequences v = v1 v2…vm and w = w1 w2…wn• The LCS of v and w is a sequence of positions inv: 1 < i1 < i2 < … < it < mand a sequence of positions inw: 1 < j1 < j2 < … < jt < nsuch that it -th letter of v equals to jt-letter of w and tis maximalLCS: ExampleA T--C T G A T C--T G C T--A--Celements of velements of w--A12012233435455666778j coords:i


View Full Document

U of I CS 498 - Dynamic Programming

Documents in this Course
Lecture 5

Lecture 5

13 pages

LECTURE

LECTURE

39 pages

Assurance

Assurance

44 pages

LECTURE

LECTURE

36 pages

Pthreads

Pthreads

29 pages

Load more
Download Dynamic Programming
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Dynamic Programming and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Dynamic Programming 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?