DOC PREVIEW
UMD CMSC 423 - RNA folding

This preview shows page 1-2-3-4-5-6 out of 18 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

RNA FoldingCMSC 423: RNA FoldingSlides By: Carl KingsfordDepartment of Computer ScienceUniversity of Maryland, College ParkBased on Section 6.5 of Algorithm Design by Kleinberg & Tardos.RNA FoldingRNA: ACGGGGUUUAAAUUCCCUAUATIn the cell, RNA folds up:•G and C stick together•A and U stick togetherRNA Fold ExampleG CCGUAA UU AC GCGAUAUGGGUAAAA G C CGGCUUAAAGACCGGUCUUUACCCCGGAUAUGCCCCAARNA Folding RulesRNA Folding Rules:1 If two bases are closer than 4 bases apart, they cannot pair.2 Each base is matched to at most one other base.3 The allowable pairs are {U, A} and {G , C }.4 Pairs cannot “cross.”G CCGUAA UU AC GCGAUG CCG UAA UU AC G CG AUNo CrossingsThis is not allowed:i jk lIn other words, if (i, j) and (k, `) are paired, then we must havei < k < ` < j.Paired bases have to be nested.RNA Folding ProblemRNA FoldingGiven: a string r = b1, b2, b3, . . . , bn, with bi∈ {A,C,U,G}Find: the largest set of pairs S = {(i, j)}, wherei, j ∈ {1, 2, . . . , n}, that satisfies the RNA Folding Rules.Goal: match as many pairs of bases as possible.SubproblemsG CCG UUA UU AC G CG AU1 jG CCG UUA UU AC G CG AU1 jj is not paired with anythingj is paired with some t ≤ j -4tOPT(t+1, j-1)OPT(1, t-1)OPT(1, j-1)Subproblems, 2We have a subproblem for every interval (i, j).How many subproblems are there?n2= O(n2)Subproblems, 2We have a subproblem for every interval (i, j).How many subproblems are there?n2= O(n2)RecurrenceIf j − i ≤ 4: OPT (i, j) = 0If j − i > 4:OPT (i, j) = max(OPT (i, j − 1)maxt{1 + OPT (i, t − 1) + OPT (t + 1, j − 1)}In the 2nd case above, we try all possible t to pair with j.That is t runs from i to j − 4.Order to Solve the Problems•In what order should we solve the subproblems?•What problems do we need to solve OPT (i , j)?•OPT (i, t − 1) and OPT (t + 1, j − 1)for every t between i and j.•In what sense are these problems “smaller?”•They involve smaller intervals of the string.Subproblem OrderingWe solve OPT (i , j) in order of increasing value of j − i .Order to Solve the Problems•In what order should we solve the subproblems?•What problems do we need to solve OPT (i , j)?•OPT (i, t − 1) and OPT (t + 1, j − 1)for every t between i and j.•In what sense are these problems “smaller?”•They involve smaller intervals of the string.Subproblem OrderingWe solve OPT (i , j) in order of increasing value of j − i .Order to Solve the Problems•In what order should we solve the subproblems?•What problems do we need to solve OPT (i , j)?•OPT (i, t − 1) and OPT (t + 1, j − 1)for every t between i and j.•In what sense are these problems “smaller?”•They involve smaller intervals of the string.Subproblem OrderingWe solve OPT (i , j) in order of increasing value of j − i .PseudocodeInitialize OPT[i,j] to 0 for 1 ≤ i, j ≤ nFor k = 5, 6,. . ., n-1 // interval lengthFor i = 1,2,. . ., n-k // interval startSet j = i + k // interval end// find the best tbest_t = 0For t = i,. . .,j-1:If rna[t] is complementary to rna[j]:best_t = max(best_t, 1 + OPT[i,t-1]+OPT[t+1,j-1])// Either pair j with t or nothingOPT[i,j] = max(best_t, OPT[i,j-1])EndForEndForReturn OPT[1,n]Running Time1 O(n2) subproblems.2 Now, it takes O(n) time to solve each subproblem.(have to search over all possible choices for t.)3 Total running time is O(n3).Codedef rnafold(rna):n = len(rna)OPT = make_matrix(n, n)Arrows = make_matrix(n, n)for k in xrange(5, n): # interval lengthfor i in xrange(n-k): # interval startj = i + k # interval endbest_t = OPT[i][j-1]arrow = -1for t in xrange(i, j):if is_complement(rna[t], rna[j]):val = 1 + (OPT[i][t-1] if t > i else 0) + OPT[t+1][j-1]if val >= best_t: best_t, arrow = val, tOPT[i][j] = best_tArrows[i][j] = arrowreturn OPT, ArrowsBacktrace Codedef rna_backtrace(Arrows):Pairs = [] # holds the pairs in the optimal solutionStack = [(0, len(Arrows) - 1)] # tracks cells we have to visitwhile len(Stack) > 0:i, j = Stack.pop()if j - i <= 4: continue # if cell is base case, skip it# Arrow = -1 means we didn’t match jif Arrows[i][j] == -1:Stack.append((i, j - 1))else:t = Arrows[i][j]Pairs.append((t, j)) # save that j matched with t# add the two daughter problemsif t > i: Stack.append((i, t - 1))Stack.append((t + 1, j - 1))return PairsNotes & Recap•This is essentially “Nussinov’s algorithm”, which wasproposed for finding RNA structures in 1978.•Same dynamic programming idea: write the answer to the fullproblem in terms of the answer to smaller problems.•Still have an O(n × n) matrix to fill in.•Main differences from sequence alignment:1 We fill in the matrix in a different order: entries (i , j) in orderof increasing j − i.2 We have to try O(n) possible subproblems inside the max.This leads to an O(n3)


View Full Document

UMD CMSC 423 - RNA folding

Documents in this Course
Midterm

Midterm

8 pages

Lecture 7

Lecture 7

15 pages

Load more
Download RNA folding
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view RNA folding and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view RNA folding 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?