UMD CMSC 828G - Metagenomics: Read Length Matters

APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Mar. 2008, p. 1453–1463 Vol. 74, No. 5
0099-2240/08/$08.00+0 doi:10.1128/AEM.02181-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

Metagenomics: Read Length Matters▿†

K. Eric Wommack,¹ Jaysheel Bhavsar,¹ and Jacques Ravel²*

Delaware Biotechnology Institute, University of Delaware, 15 Innovation Way, Newark, Delaware 19711,¹ and Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, 20 Penn Street, Baltimore, Maryland 21201²

Received 24 September 2007/Accepted 3 January 2008

Obtaining an unbiased view of the phylogenetic composition and functional diversity within a microbial community is one central objective of metagenomic analysis. New technologies, such as 454 pyrosequencing, have dramatically reduced sequencing costs, to a level where metagenomic analysis may become a viable alternative to more-focused assessments of the phylogenetic (e.g., 16S rRNA genes) and functional diversity of microbial communities. To determine whether the short (~100 to 200 bp) sequence reads obtained from pyrosequencing are appropriate for the phylogenetic and functional characterization of microbial communities, the results of BLAST and COG analyses were compared for long (~750 bp) and randomly derived short reads from each of two microbial and one virioplankton metagenome libraries. Overall, BLASTX searches against the GenBank nr database found far fewer homologs within the short-sequence libraries. This was especially pronounced for a Chesapeake Bay virioplankton metagenome library. Increasing the short-read sampling depth or the length of derived short reads (up to 400 bp) did not completely resolve the discrepancy in BLASTX homolog detection. Only in cases where the long-read sequence had a close homolog (low BLAST E-score) did the derived short-read sequence also find a significant homolog. Thus, more-distant homologs of microbial and viral genes are not detected by short-read sequences. Among COG hits, derived short reads sampled at a depth of two short reads per long read missed up to 72% of the COG hits found using long reads. Noting the current limitation in computational approaches for the analysis of short sequences, the use of short-read-length libraries does not appear to be an appropriate tool for the metagenomic characterization of microbial communities.

Sequence polymorphism analysis of discrete genes within environmental samples has revolutionized our view of the diversity and the composition of microbial communities. Since it was first proposed as a universal phylogenetic marker of life on earth (29), sequence analysis of the small-subunit rRNA gene has become the gold standard for the assessment of microbial diversity within environmental samples. Today, a plethora of techniques for the assessment of microbial species richness and evenness are based on sequence polymorphism within this single gene (see reference 9 for a review). More recently, the conceptual approaches developed for the analysis of 16S rRNA gene microbial diversity have been applied to functional genes involved in chemical transformations critical to the carbon (e.g., RuBisCo [6]), nitrogen (e.g., NifH [31]), and sulfur (sulfite reductase [18]) cycles. These analytical approaches have revealed a significant diversity of microorganisms that are capable of mediating the chemical transformations that maintain global biogeochemical nutrient cycles.
Despite the extraordinary view of microbial taxonomic and functional diversity that single-gene approaches provide, these approaches have two significant limitations, namely, the inability to provide a picture of the broader genomic context of a given gene of interest and the requirement of prior sequence information necessary for the design of oligonucleotide PCR primers and probes.

The ideal technique for the assessment of microbial diversity would circumvent the need for selective PCR amplification and provide sequence information of sufficient length to discern connections between the taxonomy and physiology of every microbe within the community. At present, high-throughput sequencing applied to whole-microbial-community DNA, a collective suite of techniques known as microbial metagenomics, is the only approach capable of nearing this holistic view of the taxonomic and functional diversity within extant microbial communities. To date, microbial metagenomic investigations of marine ecosystems have revealed the enormous diversity of potentially photoheterotrophic prokaryotes in the Sargasso Sea (27) and the pervasiveness of cyanophages within the euphotic zone of the pelagic ocean (8). Sequencing of small-insert shotgun libraries of microbial community DNA from low-pH acid mine drainage (AMD) environments (26) and the symbiotic microbial flora of a gutless marine oligochaete (30) have enabled the nearly complete assembly of microbial genomes without the necessity of cultivation. Although significantly smaller in overall scale, shotgun sequencing of viral DNA within environmental samples has revealed that communities of double-stranded DNA viruses are very diverse (3, 5) and contain an extraordinary amount of novel sequence (2).

To date, most metagenomic investigations have adapted whole-genome shotgun sequencing approaches to the cloning and sequencing of microbial community DNA collected from environmental samples. In this approach, small-insert DNA clone libraries are analyzed using Sanger dideoxy chain terminator sequencing (22) to yield DNA sequence libraries consisting of sequence reads ranging from ca. 600 to 900 bp in length with accuracies exceeding 99.97%. The limitations of this approach are the overall cost of sequencing and the potential biases introduced in constructing clone libraries. Recently, a novel sequencing-by-synthesis technology, 454 pyrosequencing, was introduced which dramatically lowers the per-base pair cost of sequencing and circumvents the need for clone library construction (17). Prior metagenome investigations

* Corresponding author. Mailing address: Institute for Genome Sciences, Department of Microbiology and Immunology, University of Maryland School of Medicine, 20 Penn Street, Baltimore, MD 21201. Phone: (410) 706-5674. Fax: (410) 706-1482. E-mail: [email protected].
† Supplemental material for this article may be found at http://aem.asm.org/.
▿ Published ahead of print on 11 January 2008.
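
The comparison summarized in the abstract (short reads randomly derived from long reads, sampled at a depth of two short reads per long read, then scored by how many long-read hits they miss) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The excerpt does not describe the exact derivation procedure, so uniform random start positions are an assumption, and the names derive_short_reads and fraction_missed, the default lengths, and the COG identifiers in the example are hypothetical placeholders. No BLAST or COG search is run here; hits are supplied as plain sets.

```python
import random


def derive_short_reads(long_read, short_len=100, depth=2, rng=None):
    """Cut `depth` random substrings of length `short_len` from one long read.

    Assumption: start positions are drawn uniformly at random; the source
    excerpt only says the short reads were "randomly derived".
    """
    rng = rng or random.Random()
    if len(long_read) <= short_len:
        # The read is already short: return it unchanged, once per sample.
        return [long_read] * depth
    max_start = len(long_read) - short_len
    # randint is inclusive at both ends, so max_start is a valid start.
    starts = [rng.randint(0, max_start) for _ in range(depth)]
    return [long_read[s:s + short_len] for s in starts]


def fraction_missed(long_hits, short_hits):
    """Fraction of the hits found with long reads that short reads missed."""
    long_hits = set(long_hits)
    if not long_hits:
        return 0.0
    return len(long_hits - set(short_hits)) / len(long_hits)


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so the sketch is repeatable
    long_read = "ACGT" * 188         # toy 752-bp read, roughly the ~750 bp case
    shorts = derive_short_reads(long_read, short_len=100, depth=2, rng=rng)
    print([len(s) for s in shorts])  # two 100-bp fragments

    # Hypothetical hit sets: four COGs from long reads, one from short reads.
    long_cogs = {"COG_A", "COG_B", "COG_C", "COG_D"}
    short_cogs = {"COG_A"}
    print(fraction_missed(long_cogs, short_cogs))  # 3 of 4 missed -> 0.75
```

In a real analysis the two hit sets would come from separate searches of the long-read and derived short-read libraries. The abstract's figure of up to 72% of COG hits missed is a value of this kind of ratio.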

