BYU BIO 465 - computational system to select candidate genes for complex human traits


Vol. 23 no. 9 2007, pages 1132–1140
BIOINFORMATICS ORIGINAL PAPER
doi:10.1093/bioinformatics/btm001

Data and text mining

A computational system to select candidate genes for complex human traits

Kyle J. Gaulton (1,2,3,*), Karen L. Mohlke (3) and Todd J. Vision (4)
(1) Curriculum in Genetics and Molecular Biology, (2) Bioinformatics and Computational Biology Training Program, Departments of (3) Genetics and (4) Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27516

Received on October 30, 2006; revised on January 2, 2007; accepted on January 8, 2007
Advance Access publication January 19, 2007
Associate Editor: John Quackenbush

ABSTRACT

Motivation: Identification of the genetic variation underlying complex traits is challenging. The wealth of information publicly available about the biology of complex traits and the function of individual genes permits the development of informatics-assisted methods for the selection of candidate genes for these traits.

Results: We have developed a computational system named CAESAR that ranks all annotated human genes as candidates for a complex trait by using ontologies to semantically map natural language descriptions of the trait with a variety of gene-centric information sources. In a test of its effectiveness, CAESAR successfully selected 7 out of 18 (39%) complex human trait susceptibility genes within the top 2% of ranked candidates genome-wide, a subset that represents roughly 1% of genes in the human genome and provides sufficient enrichment for an association study of several hundred human genes.
This approach can be applied to any well-documented mono- or multi-factorial trait in any organism for which an annotated gene set exists.

Availability: CAESAR scripts and test data can be downloaded from http://visionlab.bio.unc.edu/caesar/

Contact: [email protected]

1 INTRODUCTION

Unlike Mendelian traits, in which a mutation in one gene is causative, or oligogenic traits, where several genes are sufficient but not necessary, complex traits are caused by variation in multiple genetic and environmental factors, none of which are sufficient to cause the trait (Peltonen and McKusick, 2001). The contribution of any given gene to a complex trait is usually modest. In addition, complex traits often encompass a variety of phenotypes and biological mechanisms, making it difficult to determine which genes to study (Newton-Cheh and Hirschhorn, 2005).

As a result, traditional methods of genetic discovery, such as linkage analysis and positional cloning, while widely successful in identifying the genes for Mendelian traits, have had more limited success in identifying genes for complex traits. Candidate gene studies have had encouraging success, yet this approach requires an effective method for deciding a priori which genes have the greatest chance of influencing susceptibility to the trait (Dean, 2003). Recent advances in genotyping technology have provided researchers with the ability to test association in hundreds of genes relatively quickly, and even the entire genome through a genome-wide association study. Genome-wide association studies are promising, yet not always economically feasible or statistically desirable (Thomas, 2006). Therefore, one of the greatest challenges in disease association study design remains the intelligent selection of candidate genes.

To this end, we have developed a computational methodology, named CAESAR (CAndidatE Search And Rank), that uses text and data mining to rank genes according to potential involvement in a complex trait.
CAESAR exploits the knowledge of complex traits in literature by using ontologies to semantically map the trait information to gene and protein-centric information from several different public data sources, including tissue-specific gene expression, conserved protein domains, protein–protein interactions, metabolic pathways and the mutant phenotypes of homologous genes. CAESAR uses four possible methods of integration to combine the results of data searches into a prioritized candidate gene list. In effect, CAESAR mimics the steps a researcher would undertake in selecting candidate genes, albeit faster, potentially more thoroughly, and in a more quantitative manner.

CAESAR represents a novel selection strategy in that it combines text and data mining to associate genetic information with extracted trait knowledge in order to prioritize candidate genes. In contrast to a number of existing approaches (Adie et al., 2006; Turner et al., 2003; van Driel et al., 2003), gene selection is not limited to one or more genomic regions, as all genes annotated in one of our databases are potential candidates. CAESAR is ultimately designed for traits in which the relevant biological processes may not be well understood and potentially hundreds of reasonable candidate genes exist.

The potential benefits to a researcher in adopting a computational approach to gene selection such as CAESAR include the ability to quickly and systematically process several hundred thousand biological annotations, many of which require highly specialized domain expertise to interpret. This benefit will continue to grow in importance as the volume and technical detail of annotation data increases. Relevant gene annotations can easily escape human consideration due to biases that investigators bring to the task of prioritization and that are difficult to overcome even by conscious effort. This is particularly valuable for complex traits, which may be affected by a wider array of biological processes, some of which may not have been directly implicated by previous studies. CAESAR also reports the evidence supporting the prioritization rank of each gene, allowing an investigator to trace the line of reasoning and to exercise his or her own judgment as to its validity. Thus, it can be seen as a very sophisticated aid to manual prioritization.

Though designed to help with the design of an association study involving a few hundred genes, CAESAR can also be used to prioritize a smaller number of candidates within a region of linkage, or to prioritize among polymorphisms annotated with

*To whom correspondence should be addressed.

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
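
The introduction states that CAESAR combines the results of several data searches into one prioritized gene list, but this excerpt does not name its four integration methods. The following Python sketch only illustrates the general idea of merging per-source gene scores into a ranking. The function name `rank_candidates`, the three combination rules (sum, mean, max), the source names, the gene identifiers and all score values are hypothetical and are not taken from CAESAR.

```python
from typing import Dict, List, Tuple

# Hypothetical scores: data source -> {gene -> relevance score}.
# Gene names and values are placeholders, not real CAESAR output.
example_scores: Dict[str, Dict[str, float]] = {
    "expression":   {"GENE_A": 0.75, "GENE_B": 0.25},
    "domains":      {"GENE_A": 0.25, "GENE_C": 1.0},
    "interactions": {"GENE_B": 0.5,  "GENE_C": 0.25},
}


def rank_candidates(
    scores: Dict[str, Dict[str, float]], method: str = "sum"
) -> List[Tuple[str, float]]:
    """Combine per-source scores into one ranked list of (gene, score).

    A gene missing from a source is treated as scoring 0.0 in that source.
    The combination rules here are illustrative assumptions only.
    """
    combiners = {
        "sum": sum,
        "max": max,
        "mean": lambda values: sum(values) / len(values),
    }
    if method not in combiners:
        raise ValueError(f"unknown integration method: {method!r}")

    # Every gene annotated in at least one source is a candidate.
    genes = {gene for per_gene in scores.values() for gene in per_gene}

    combined = {
        gene: combiners[method](
            [per_gene.get(gene, 0.0) for per_gene in scores.values()]
        )
        for gene in genes
    }
    # Highest combined score first; ties are broken alphabetically by gene.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for gene, score in rank_candidates(example_scores, method="sum"):
        print(gene, score)
```

Keeping the per-source scores separate until the final step also makes it possible to report which sources contributed to a gene's rank, in the spirit of the evidence reporting described above.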

