DOC PREVIEW
Stanford CS 374 - RNA secondary structure prediction and runtime optimization

This preview shows page 1-2-3-4-5 out of 15 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS374 Fall 2006 Lecture 4, 10/05/06 RNA secondary structure prediction and runtime optimization Lecturer: Greg Goldgof Scribe: Chuan Sheng Foo RNA secondary structure prediction and runtime optimization Based on the following papers: 1. Y. Wexler, C. Zilberstein and M. Ziv-Ukelson. "A Study of Accessible Motifs and RNA Folding Complexity". RECOMB 2006, LNBI 3909: 473-487, 2006. 2. Do CB, Woods DA, Batzoglou S. CONTRAfold: RNA Secondary Structure Prediction without Physics-Based Models. ISMB 2006 Conference Proceedings, Bioinformatics 22:e90–e98, 2006. Additional references: 1. Lecture slides by Greg Goldgof Outline 1. Background  RNA secondary structure  Pseudoknots  Non-coding RNA 2. CONTRAfold: Probabilistic RNA folding  Overview of the algorithm  Details of the algorithm  Performance of CONTRAfold 3. Other RNA folding methods: Physics-based models and Stochastic Context Free Grammars  Physics-based models  Stochastic Context Free Grammars  Advantages of CONTRAfold over these other approaches 4. How RNA folding is done from an algorithmic perspective  The Nussinov folding algorithm  A more elaborate algorithm 5. CandidateFold: RNA folding in O(n2) time 6. Genome-wide “accessible” motif detection  What is an RNA regulatory motif?  What is an “accessible” motif?  How do Wexler et al. detect regulatory motifs?  Results obtained by applying the two-stage process 1. Background Ribonucleic acid (RNA) is a polymer of the nucleotides adenine (A), uracil (U), cytosine (C) and guanine (G) that has been transcribed from DNA. Traditionally, RNA has been seen as a messenger molecule which carries information from the DNA to the translation mechanism which translates it into protein.CS374 Fall 2006 Lecture 4, 10/05/06 RNA secondary structure prediction and runtime optimization Lecturer: Greg Goldgof Scribe: Chuan Sheng Foo RNA secondary structure As opposed to DNA, which is double stranded, RNA is a single stranded molecule. This means that RNA has the ability to fold and bind with itself, taking on a 3D conformation which is its secondary structure. Figure 1 shows a few of the common folds, also known as structural motifs, found in the RNA secondary structure. One such motif is a helix, or stem, which is a long consecutive sequence of hybridized (paired) bases, akin to the dou-ble helical structure of DNA. The hairpin loop is another such motif, and it is a helix fol-lowed by an opening at the end, thus making the structure look like a hairpin; the hairpin loop starts at the end of the helix where the base pairing stops. On the other hand, an in-ternal loop is a motif that forms in the middle of a helix thus breaking it up. The bulge loop is a special case of an internal loop in which only one of the sides is unhybridized (unpaired). Finally, the multi-branch loop is a motif in which there are nested structures within the loop – as shown in Figure 1, the multi-branch loop has two separate branches that forms a helix and a bulge loop respectively. Pseudoknots In addition to the motifs described above, RNA may also form pseudoknots, in which two helices overlap, causing non-nested base pairing as illustrated in Figure 2. The pres-ence of pseudoknots in RNA makes secondary structure prediction computationally in-tractable. Thus, most algorithms ignore them during the computation in order to increase efficiency, since they are also rarely present in nature. In particular, the algorithms dis-cussed in these notes ignore the possibility of pseudoknots. bulge loop helix (stem) hairpin loop multi-branch loop internal loop Figure 1: Common RNA structural motifsCS374 Fall 2006 Lecture 4, 10/05/06 RNA secondary structure prediction and runtime optimization Lecturer: Greg Goldgof Scribe: Chuan Sheng Foo Non-coding RNA Although RNA has been conventionally regarded as having only a messenger function, recent discoveries of many forms of non-coding RNA (ncRNA), RNA that is not eventu-ally translated into proteins, present a challenge to this view. Examples of these include RNA enzymes like ribosomal RNA (rRNA) and transfer RNA (tRNA) which catalyze chemical reactions, RNA interference (RNAi) in which RNA molecules like microRNA (miRNA) and short-interfering RNA (siRNA) mediate gene regulation, and alternative splicing mechanisms carried out by small-nuclear RNA (snRNA). For these ncRNA molecules, their structure is critical to their function. In fact, it has been discovered that the nucleotide sequences of these ncRNAs have not been highly con-served, giving further evidence that the sequence is not of primary importance. However, experimentally determining these RNA structures, for example, through X-ray crystallog-raphy, takes a lot of time and money. Thus, being able to efficiently predict these struc-tures computationally is an important problem in computational biology. 2. CONTRAfold: Probabilistic RNA folding Overview of the algorithm The CONTRAfold algorithm is one such algorithm that is able to predict RNA secondary structure from the sequence. It predicts the most likely secondary structure for an RNA sequence using various features in a structure. For example, the algorithm considers the number of C-G and A-U base pairings (canonical base pairings), which are very frequent in actual RNA structure as they are more energetically favorable than other forms of base pairings, for example, a C-C base pairing. In addition to canonical pairings, CONTRA-fold also considers the number of non-canonical base pairings (G-U pairings), helices, bulge loops and CG/GC base-pair stacking interactions in its assessment of an RNA structure. All these features listed above are thermodynamic parameters, because they represent the energy of the structure; other non-thermodynamic features could also be in-corporated into the algorithm if desired. Figure 2: Overlapping base pairing in an RNA pseudoknotCS374 Fall 2006 Lecture 4, 10/05/06 RNA secondary structure prediction and runtime optimization Lecturer: Greg Goldgof Scribe: Chuan Sheng Foo Details of the algorithm Each feature fi considered by the algorithm is associated with a weight wi, that describes its relative importance amongst all the features being considered. These weights are learnt by the algorithm from a training set of


View Full Document

Stanford CS 374 - RNA secondary structure prediction and runtime optimization

Documents in this Course
Probcons

Probcons

42 pages

ProtoMap

ProtoMap

19 pages

Lecture 3

Lecture 3

16 pages

Load more
Download RNA secondary structure prediction and runtime optimization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view RNA secondary structure prediction and runtime optimization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view RNA secondary structure prediction and runtime optimization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?