Stanford CS 374 - Computational Genomics

This preview shows page 1 out of 4 pages.

Cell, Vol. 109, 137–140, April 19, 2002, Copyright 2002 by Cell Press

Minireview

Computational Genomics of Noncoding RNA Genes

Sean R. Eddy (1)
Howard Hughes Medical Institute
Department of Genetics
Washington University School of Medicine
Saint Louis, Missouri 63110

    Biologists should not deceive themselves with the thought that some new class of biological molecules, of comparable importance to proteins, remains to be discovered. This seems highly unlikely.
    F. Crick (1958)

The number of known noncoding RNA genes is expanding rapidly. Computational analysis of genome sequences, which has been revolutionary for protein gene analysis, should also be able to address questions of the number and diversity of noncoding RNA genes. However, noncoding RNAs present computational genomics with a new set of challenges.

We often hear that we live in the postgenomic world, where "all genes" have been systematically tabulated in databases, but the idea of a complete enumeration is just a convenient fable. Even with genome sequences in hand, our ability to identify genes is largely limited to relatively large, evolutionarily conserved, moderately to highly expressed protein coding genes. We know there are exceptions that fly below our radar (tiny genes, rapidly evolving genes, genes expressed in only a few cells at special times), but we have hoped, with some justification, that they aren't too important or numerous. Perhaps nowhere else are the tools and assumptions of genefinding and genome sequence analysis more fundamentally challenged than in the rapidly developing field of noncoding RNA genes.

Noncoding RNA (ncRNA) genes make transcripts that function directly as RNA, rather than encoding proteins. Transfer RNA and ribosomal RNA are textbook examples. Other structural and regulatory ncRNAs are known, but their number and importance have seemed marginal. Of course, absence of evidence is not evidence of absence. Gene discovery methods are biased. Most assume the Central Dogma, and look for genes that make messenger RNAs and have open reading frames. What if we looked specifically for noncoding RNA genes? As described in the following minireviews, several lines of recent research suggest that there are many noncoding RNA genes that have evaded genetic, biochemical, and molecular detection until now.

The power of complete genomes and computational sequence analysis has revolutionized molecular genetics. The workhorses of sequence comparison, BLAST and FASTA, are as well known as PCR. We are accustomed to browsing, and sometimes even believing, gene predictions made by genefinding programs. Will computational genome analysis tools prove as useful for ncRNAs as they have proved for protein gene analysis? What are the prospects for tabulating and annotating ncRNA genes in genome sequences?

What We're Looking for

Noncoding RNA genes come in more than one flavor (Eddy, 2001; Erdmann et al., 2000; see other minireviews in this issue). This makes it difficult to imagine a single ncRNA genefinding approach to find them all.

The best known ncRNAs have complex three-dimensional RNA structures and play roles as catalytic or structural parts of RNA-protein machines; examples include transfer RNA, ribosomal RNA, and spliceosomal RNAs. Many other ncRNAs, especially many of the recently discovered ones, act in a relatively unsophisticated manner by base pairing to a target RNA, and either regulate gene expression directly (for instance, by sterically occluding a ribosome binding site), or provide RNA targeting specificity for a protein-based regulatory or modification mechanism; examples include the microRNAs (miRNAs) (Ambros, 2001), E. coli translational regulatory RNAs (Wassarman et al., 1999), and small nucleolar RNAs (Eliceiri, 1999). Other noncoding RNAs seem particularly prevalent in dosage compensation, such as Xist RNA in vertebrates or roX RNAs in Drosophila, and in imprinted regions of chromosomes, such as IPW (imprinted in Prader-Willi) and H19 (Kelley and Kuroda, 2000). The "cis-antisense" ncRNAs are transcribed from the opposite strand of protein-coding genes, overlapping one or more coding exons in an antisense arrangement; an example with a genetic phenotype is the human SCA8 ncRNA gene, which is mutated in one form of spinal cerebellar ataxia (Nemes et al., 2000).

Isolation of a new RNA species with no significant ORF is not sufficient evidence of a new ncRNA gene. Many different cellular processes throw off nongenic noncoding RNA species, including RNA processing intermediates, transcription from retroposed repetitive elements, and low-level background genomic transcription. There should be evidence of a function, either by computational means (e.g., sequence or structure conservation) or experimental means (e.g., genetic phenotype). There should also be evidence that the RNA does not code for a small protein. Here, the best evidence is almost certainly comparative genome sequence analysis. Conserved coding regions generally show a very different pattern of mutation (e.g., synonymous codon changes) compared to noncoding RNAs, and this pattern can be obvious even for short ORFs. Several stories in the literature in which ncRNAs have been confused with genes for small proteins, and vice versa, have been rectified by use of comparative sequence analysis (Eddy, 2001).

Genefinding

Several promising approaches give us a tenuous clawhold on the problem of de novo ncRNA gene prediction. None of these approaches is yet as reliable as protein-coding genefinders.

One approach to computational ncRNA genefinding is to predict RNA transcript initiation, termination, and processing, and find all predicted transcripts that do

(1) Correspondence: [email protected]

Table 1. Comparison of Four Screens for ncRNAs in E. coli

Strategy                                  Reference                Candidates  Tested  Expressed  Found/31  Uniquely found/31
Promoter/terminator/seq conservation      Argaman et al. (2001)    24          23      14         14        2
Seq conservation/microarray               Wassarman et al. (2001)  60          60      17         18        2
Sequence composition/structure stability  Carter et al. (2001)     370         -       -          13        -
Comparative secondary structure           Rivas et al. (2001)      275         49      11         22        6

The table shows the number of predicted ncRNA genes, the number tested for expression by Northern blot, and the number found to be expressed. The total number of different expressed ncRNA genes identified by these screens was 31. The final two columns indicate how many RNAs out of this total were identified by each screen, and the number of unique RNAs found by each method.
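
The comparative argument in "What We're Looking for" (conserved coding regions tend to accumulate synonymous codon changes, whereas noncoding RNAs do not follow a codon pattern) can be illustrated with a small sketch. This is only a toy under stated assumptions, not a method described in the article: the function name classify_codon_changes, the ungapped pairwise alignment, and the two short example sequences are all hypothetical, and a real analysis would use aligned genomic sequence and a statistical model rather than raw counts.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(seq_a, seq_b, frame=0):
    """Count identical, synonymous, and nonsynonymous codon pairs.

    seq_a and seq_b are two aligned DNA sequences of equal length with
    no gaps (a simplifying assumption). frame is the offset (0, 1, or 2)
    at which codons are read.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"identical": 0, "synonymous": 0, "nonsynonymous": 0}
    for i in range(frame, len(seq_a) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        # Skip codons containing ambiguous or non-DNA characters.
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if codon_a == codon_b:
            counts["identical"] += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            counts["synonymous"] += 1
        else:
            counts["nonsynonymous"] += 1
    return counts


# Hypothetical toy alignment: every difference falls on a third codon position.
toy_a = "ATGGCTAAAGGT"
toy_b = "ATGGCCAAGGGA"
for frame in range(3):
    print(frame, classify_codon_changes(toy_a, toy_b, frame))
```

In this toy alignment, reading in frame 0 gives three synonymous changes and no nonsynonymous ones, while frame 1 turns two of the same differences into nonsynonymous changes. An excess of synonymous changes in one frame is the kind of signal that suggests a conserved coding region rather than an ncRNA, even for a short ORF.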

