UMD CMSC 828G - A comparison of random sequence reads versus 16S rDNA sequences

5180–5188 Nucleic Acids Research, 2008, Vol. 36, No. 16
Published online 5 August 2008
doi:10.1093/nar/gkn496

A comparison of random sequence reads versus 16S rDNA sequences for estimating the biodiversity of a metagenomic library

Chaysavanh Manichanh1,2,*, Charles E. Chapple3, Lionel Frangeul4, Karine Gloux5, Roderic Guigo2,3 and Joel Dore5

1Digestive System Research Unit, University Hospital Vall d’Hebron, Ciberehd,
2Bioinformatics and Genomics Program, Center for Genomic Regulation, Barcelona,
3Genome Bioinformatics Laboratory, GRIB—IMIM/UPF, E-08003 Barcelona, Spain,
4Genopole, Institut Pasteur, Paris and
5INRA, UR910, F-78352 Jouy-en-Josas, France

Received December 14, 2007; Revised July 11, 2008; Accepted July 17, 2008

ABSTRACT

The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10 010 RSRs from a metagenomic library of microorganisms found in human faecal samples. Then searched them using the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.

INTRODUCTION

We live in a world dominated by microorganisms (1). However, very little is known about the role they play in our environment.
One of the main questions that remains to be answered is how these microorganisms compete and communicate between themselves to get nutrients and produce energy in an ecosystem. To address this question, one has to overcome the limitations associated with the ‘uncultivability’ of at least 99% of the microorganisms in nature (2). The development of culture-independent methods applied to environmental samples was a turning point for the field. In 1985, Pace and colleagues (3) were the first to propose direct analysis of 5S and 16S rRNA gene sequences to describe the microbial diversity in an environmental sample without culturing. The 16S rRNA gene is highly conserved among all microorganisms, is of suitable length (about 1500 bp) for bioinformatic analysis and is an excellent molecule for discerning evolutionary relationships among prokaryotic organisms (4). For all these reasons, this molecule has given rise to a huge public database (RDPII: http://rdp.cme.msu.edu/ containing 481 650 16S rRNAs, 13 February 2008) (5). Finally, defining phylotype (or species) on the basis of 16S rDNA sequences has been and remains the accepted standard for studies of uncultured microorganism diversity (6–10).

These molecular tools have revealed a wider microbial diversity than expected in several ecosystems (11,12). The functions, however, of the different groups of microorganisms are largely unknown. Pace proposed the first cloning of genomic DNA directly from environmental samples using a phage vector (13). Later, this approach, called metagenomics, inspired other groups to penetrate the microbial world from all sources including human faeces, whale falls, soil, marine and other aquatic ecosystems (14–18). Metagenomics, conducted on a massive scale, has provided dramatic insights into the structure and metabolic potential of microbiota (also used for microbial population) (19,20).
Functional screening of metagenomic libraries has led to the assignment of functions to numerous ‘hypothetical proteins’, so far demonstrating the power of functional metagenomics (21). Metagenomics is a newly emerging technology, and has generated more than 100 projects in the GOLD Web site, Genomes OnLine Database (February 2008, http://www.genomesonline.org/gold.cgi), 31 of which have already been completed.

*To whom correspondence should be addressed. Tel: +34 933 160 167; Fax: +34 933 160 019; Email: [email protected]

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

One of the approaches allowing the classification of metagenomic fragments is the sequence-composition-based method. It relies on the analyses of oligonucleotide frequencies that vary significantly among genomes, permitting discrimination of different species (22,23). This approach, which needs a training process in using genomic sequences available in databases, has been the method of choice for some analyses of microbial communities in recent years (24–26) and has been used in different software such as TETRA or PhyloPythia (27,28). However, it encounters limitations not only in the availability of genomic sequences in databases for their learning process, but also in the size of the analysed metagenomic fragments. As discussed by the authors themselves, the sequence-composition-based approach needs complementary methods to analyse short metagenomic fragments (<1 kb) such as single-read end-sequences.

Another approach to study microbial diversity is a large-scale screening for clones or contigs containing a phylogenetic gene marker such as 16S rRNA gene. To that end, clones harbouring 16S can be screened by several methods.
The first consists of the extraction of the recombinant vectors to remove the genome of the organism in which the cloning has been performed, then selection of the 16S rRNA gene by DNA–DNA hybridization on a macroarray (18). The second method involves the massive sequencing of the whole-metagenome and subsequent in silico identification of the 16S rDNA sequences (16). PCR-based 16S rRNA gene sequence-based analysis, an approach that has been widely used in the literature to analyse the diversity of microbial communities could have been applied directly on the environmental samples. However, due to PCR bias and the
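
The sequence-composition-based method discussed earlier in this Introduction relies on oligonucleotide frequencies that differ among genomes. As a rough illustration of that idea only, the Python sketch below computes tetranucleotide frequency profiles and a plain Euclidean distance between two profiles. The function names, the choice of k = 4, the handling of ambiguous bases and the distance measure are all assumptions made for this example; it is not the procedure implemented in TETRA or PhyloPythia, and it is not part of the analysis reported in this article.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"


def kmer_frequencies(seq, k=4):
    """Return the relative frequency of every possible k-mer in seq.

    Hypothetical helper for illustration: windows containing characters
    other than A, C, G or T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    if total == 0:
        # No usable window: return an all-zero profile.
        return {kmer: 0.0 for kmer in all_kmers}
    return {kmer: counts[kmer] / total for kmer in all_kmers}


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two profiles sharing the same k-mer keys."""
    return math.sqrt(
        sum((profile_a[kmer] - profile_b[kmer]) ** 2 for kmer in profile_a)
    )


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    fragment = kmer_frequencies("ACGTACGTTTGACCA")
    reference = kmer_frequencies("ACGTTTGACCAGGTA")
    print(euclidean_distance(fragment, reference))
```

A short fragment yields only a few k-mer windows, so its profile is sparse and noisy, which is consistent with the limitation noted above for fragments shorter than 1 kb.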

