Gene Ontology: tool for the unification of biology

commentary
nature genetics • volume 25 • may 2000 • page 25

The Gene Ontology Consortium*

*Michael Ashburner¹, Catherine A. Ball³, Judith A. Blake⁴, David Botstein³, Heather Butler¹, J. Michael Cherry³, Allan P. Davis⁴, Kara Dolinski³, Selina S. Dwight³, Janan T. Eppig⁴, Midori A. Harris³, David P. Hill⁴, Laurie Issel-Tarver³, Andrew Kasarskis³, Suzanna Lewis², John C. Matese³, Joel E. Richardson⁴, Martin Ringwald⁴, Gerald M. Rubin² & Gavin Sherlock³

¹FlyBase (http://www.flybase.bio.indiana.edu). ²Berkeley Drosophila Genome Project (http://fruitfly.bdgp.berkeley.edu). ³Saccharomyces Genome Database (http://genome-www.stanford.edu). ⁴Mouse Genome Database and Gene Expression Database (http://www.informatics.jax.org). Correspondence should be addressed to J.M.C. (e-mail: [email protected]) and D.B. (e-mail: [email protected]), Department of Genetics, Stanford University School of Medicine, Stanford, California, USA.

The accelerating availability of molecular sequences, particularly the sequences of entire genomes, has transformed both the theory and practice of experimental biology. Where once biochemists characterized proteins by their diverse activities and abundances, and geneticists characterized genes by the phenotypes of their mutations, all biologists now acknowledge that there is likely to be a single limited universe of genes and proteins, many of which are conserved in most or all living cells.
This recognition has fuelled a grand unification of biology; the information about the shared genes and proteins contributes to our understanding of all the diverse organisms that share them. Knowledge of the biological role of such a shared protein in one organism can certainly illuminate, and often provide strong inference of, its role in other organisms.

Progress in the way that biologists describe and conceptualize the shared biological elements has not kept pace with sequencing. For the most part, the current systems of nomenclature for genes and their products remain divergent even when the experts appreciate the underlying similarities. Interoperability of genomic databases is limited by this lack of progress, and it is this major obstacle that the Gene Ontology (GO) Consortium was formed to address.

Functional conservation requires a common language for annotation

Nowhere is the impact of the grand biological unification more evident than in the eukaryotes, where the genomic sequences of three model systems are already available (budding yeast, Saccharomyces cerevisiae, completed in 1996 (ref. 1); the nematode worm Caenorhabditis elegans, completed in 1998 (ref. 2); and the fruitfly Drosophila melanogaster, completed earlier this year³) and two more (the flowering plant Arabidopsis thaliana⁴ and fission yeast Schizosaccharomyces pombe) are imminent. The complete genomic sequence of the human genome is expected in a year or two, and the sequence of the mouse (Mus musculus) will likely follow shortly thereafter.

The first comparison between two complete eukaryotic genomes (budding yeast and worm⁵) revealed that a surprisingly large fraction of the genes in these two organisms displayed evidence of orthology. About 12% of the worm genes (∼18,000) encode proteins whose biological roles could be inferred from their similarity to their putative orthologues in yeast, comprising about 27% of the yeast genes (∼5,700).
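
As a rough worked example of the figures just quoted, the sketch below converts the two percentages into approximate gene counts. It assumes that the parenthesized values (∼18,000 and ∼5,700) are the total gene counts for worm and yeast. The resulting counts are derived here for illustration only and are not stated in the text, and all names in the code are hypothetical.

```python
# Approximate totals and percentages as quoted in the text.
# Assumption: the parenthesized numbers are whole-genome gene counts.
WORM_TOTAL_GENES = 18_000
YEAST_TOTAL_GENES = 5_700
WORM_PERCENT = 12   # % of worm genes with a putative yeast orthologue
YEAST_PERCENT = 27  # % of yeast genes covered by those orthologues


def implied_gene_count(total_genes: int, percent: int) -> int:
    """Return the rounded number of genes implied by a percentage of a total."""
    return round(total_genes * percent / 100)


worm_genes_with_orthologue = implied_gene_count(WORM_TOTAL_GENES, WORM_PERCENT)
yeast_genes_with_orthologue = implied_gene_count(YEAST_TOTAL_GENES, YEAST_PERCENT)

print(f"worm genes with a putative yeast orthologue: ~{worm_genes_with_orthologue}")
print(f"yeast genes with a putative worm orthologue: ~{yeast_genes_with_orthologue}")
```

The two implied counts need not be equal, since orthology relationships between two genomes are not necessarily one-to-one.
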
Most of these proteins have been found to have a role in the ‘core biological processes’ common to all eukaryotic cells, such as DNA replication, transcription and metabolism. A three-way comparison among budding yeast, worm and fruitfly shows that this relationship can be extended; the same subset of yeast genes generally have recognizable homologues in the fly genome⁶. Estimates of sequence and functional conservation between the genes of these model systems and those of mammals are less reliable, as no mammalian genome sequence is yet known in its entirety. Nevertheless, it is clear that a high level of sequence and functional conservation will extend to all eukaryotes, with the likelihood that genes and proteins that carry out the core biological processes will again be probable orthologues. Furthermore, since the late 1980s, many experimental confirmations of functional conservation between mammals and model organisms (commonly yeast) have been published⁷⁻¹².

This astonishingly high degree of sequence and functional conservation presents both opportunities and challenges. The main opportunity lies in the possibility of automated transfer of biological annotations from the experimentally tractable model organisms to the less tractable organisms based on gene and protein sequence similarity. Such information can be used to improve human health or agriculture. The challenge lies in meeting the requirements for a largely or entirely computational system for comparing or transferring annotation among different species. Although robust methods for sequence comparison are at hand¹³⁻¹⁵, many of the other elements for such a system remain to be developed.

Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms.
The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

© 2000 Nature America Inc. • http://genetics.nature.com

A dynamic gene ontology

The GO Consortium is a joint project of three model organism databases: FlyBase¹⁶, Mouse Genome Informatics¹⁷,¹⁸ (MGI) and the Saccharomyces Genome Database¹⁹ (SGD). It is expected that other organism databases will join in the near future. The goal of the Consortium is to produce a structured, precisely defined, common, controlled vocabulary for describing the roles of genes and gene products in any organism. Early considerations of the problems posed by the diversity of activities that characterize the cells of yeast, flies and mice made it clear that extensions of standard indexing methods (for example, keywords) are likely to

