Stanford CS 374 - Lecture 3

Searching Genomes for Noncoding RNA
CS374 Fall 2004
Lecture 3, 10/3/06
Lecturer: Leticia Britos
Scribe: Greg Goldgof

Based on the following papers:

1. Zhang S, B Haas, E Eskin, V Bafna. “Searching genomes for noncoding RNA using FastR”, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2(4), October-December 2005.
2. Zhang S, I Borovok, Y Aharonowitz, R Sharan, V Bafna. “A sequence-based filtering method for noncoding RNA identification and its application to searching for riboswitch elements”, Bioinformatics, 22(14): e557-e565, 2006.

Outline

1. Background
2. Noncoding RNA Prediction
3. Searching Genomes for Noncoding RNA Using FastR
4. FastR’s Genome Filtering Method
5. FastR’s RNA Alignment Method
6. Validation of FastR’s Performance
7. Application of FastR for Discovery of Novel Instances of Known Families of Riboswitches

1. Background

DNA is transcribed into messenger RNA (mRNA), which is then translated into protein. The human genome is composed of roughly three billion bases of DNA, yet only about twenty-two thousand genes code for proteins. As a result, protein-coding genes make up only about 1.5% of the genome.

Some genes do not code for proteins but are instead transcribed into RNA molecules involved in cellular regulatory processes. These sequences are sometimes called RNA genes or noncoding RNAs. There is an “expanded universe” of noncoding RNA, whose importance scientists appreciate more every day.
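As a quick sanity check on the figures quoted in the Background section, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation: it assumes the 1.5% figure refers to protein-coding sequence, and the constant and function names are chosen here for illustration.

```python
# Figures quoted in the notes; the names below are illustrative only.
GENOME_BASES = 3_000_000_000   # "roughly three billion bases"
PROTEIN_CODING_GENES = 22_000  # "twenty-two thousand genes"
CODING_PER_MILLE = 15          # 1.5% written as parts per thousand


def coding_bases(genome_bases=GENOME_BASES, per_mille=CODING_PER_MILLE):
    """Number of bases covered by the coding fraction (integer arithmetic)."""
    return genome_bases * per_mille // 1000


def mean_coding_bases_per_gene(genes=PROTEIN_CODING_GENES):
    """Average coding bases per gene implied by the two quoted figures."""
    return coding_bases() / genes


total = coding_bases()
per_gene = mean_coding_bases_per_gene()
print(f"coding bases: {total:,}")
print(f"mean coding bases per gene: {per_gene:.0f}")
```

Under these assumptions the coding portion comes to about 45 million bases, or roughly two thousand bases per gene, which leaves the overwhelming majority of the genome as potential search space for noncoding elements.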
Noncoding RNAs play a wide variety of cellular roles, including the following:

• rRNA – ribosomal RNA (structure/function of ribosomes)
• tRNA – transfer RNA (translation)
• snRNA – small-nuclear RNA (RNA splicing, telomere maintenance)
• snoRNA – small-nucleolar RNA (chemical modification of rRNA)
• miRNA – microRNA (translational regulation)
• gRNA – guide RNA (mRNA editing)
• tmRNA – tRNA/mRNA combination molecule (degradation of defective proteins)
• riboswitches (translational and transcriptional regulation, see figure 0)
• ribozymes (autocatalytic RNA)
• RNAi – RNA interference (gene regulation by double-stranded RNA)

RNA is a hot topic and of fundamental biological importance. Noncoding RNAs are an essential part of transcription, translation, alternative splicing, and gene regulation. Just last week, professors Andrew Fire (at Stanford Medical School) and Craig Mello received the Nobel Prize for their discovery of RNA interference.

Since RNA, unlike DNA, is usually single-stranded, its bases often bind with each other, causing the RNA polymer to fold upon itself into a specific conformation. The description of which bases form bonds with each other is called the secondary structure of an RNA molecule. The secondary structure of a noncoding RNA molecule is frequently more essential to its function than its sequence, and the secondary structure of orthologous noncoding RNA sequences is evolutionarily conserved. As a result, secondary structure is an important part of many tools involved in noncoding RNA discovery.

Figure 0: Riboswitches. A riboswitch is a part of an mRNA molecule that can directly bind a small target molecule, and whose binding of the target affects the gene's activity.
Thus, an mRNA that contains a riboswitch is directly involved in regulating its own activity, depending on the presence or absence of its target molecule. Riboswitches are usually found in the 5’ UTR (Untranslated Region) of genes.

2. Noncoding RNA Prediction

Since genes make up only a small percentage of the genome, gene prediction has become a standard problem in computational biology. Basic gene prediction programs look for signals such as start and stop codons, the three-base sequences that tell the translation machinery where to start and stop. These programs also look for statistically significant sequence conservation across related genomes.

Prediction of RNA genes is more challenging because noncoding RNA signals in the genome are not as strong as the signals for protein-coding genes. In particular, sequence conservation is frequently statistically insignificant: often only a small percentage of residues in an RNA regulatory molecule must be conserved to maintain its function.

Figure 1: Alignment of two tRNA sequences from Drosophila melanogaster

On the other hand, the secondary structure (figure 2) of noncoding RNA molecules is usually highly conserved, providing another tool for finding covariation across genomes. However, structure alone is also frequently inadequate for detecting noncoding RNAs. If we scan only for structure, we will find random sequences that fold in ways that suggest they are functional.

Noncoding RNAs can also be found in a wide range of places. Some, such as rRNA, are whole transcribed units, whereas others, such as riboswitches, lie in the UTRs of genes. They may be in intergenic regions or introns, and while they are rarely in coding regions, they may be on the reverse strand of coding regions.
Thus, this huge search space demands fast algorithms.

Figure 2: 2D and 3D structure. The image on the left is an example of the 2D structure of a tRNA molecule. The two images on the right are depictions of the 3D structure of a tRNA molecule.

3. Searching Genomes for Noncoding RNA Using FastR

If sequence conservation and structural conservation are both inadequate for noncoding RNA finding, then perhaps a combination approach may be best. This is the approach taken by Zhang et al. in their gene predictor FastR, which looks for
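
The point made in Section 2, that two noncoding RNAs can share a secondary structure while sharing little sequence, can be illustrated with a small Python sketch. This is not FastR's algorithm. The dot-bracket structure notation, the toy hairpin sequences, and the function names are all assumptions introduced here for illustration.

```python
# Illustration only: not FastR, just a toy comparison of two signals.
# Pairs allowed in RNA helices: Watson-Crick (A-U, G-C) plus the G-U wobble.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(structure):
    """Return (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def structure_support(sequence, structure):
    """Fraction of annotated pairs whose two bases can actually pair."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    pairs = paired_positions(structure)
    if not pairs:
        return 0.0
    good = sum((sequence[i], sequence[j]) in VALID_PAIRS for i, j in pairs)
    return good / len(pairs)


def sequence_identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Two made-up hairpins: every stem base differs, but each change is
# compensated on the opposite strand, so the stem can still form.
structure = "((((....))))"
seq_a = "GCAUGAAAAUGC"
seq_b = "CGUAGAAAUACG"

print(sequence_identity(seq_a, seq_b))
print(structure_support(seq_a, structure))
print(structure_support(seq_b, structure))
```

In this toy case the two sequences agree only in the four loop positions, yet both fully support the same stem. A sequence-only comparison would see little similarity here, while a structure-only scan would also accept unrelated sequences that happen to fold the same way, which is the motivation the notes give for combining the two signals.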

