61 2006 genetics marshall

Copyright © 2006 by the Genetics Society of America
DOI: 10.1534/genetics.105.053314

A Bayesian Heterogeneous Analysis of Variance Approach to Inferring Recent Selective Sweeps

John M. Marshall*,1 and Robert E. Weiss†

*Department of Biomathematics, UCLA School of Medicine, Los Angeles, California 90095-1766 and †Department of Biostatistics, UCLA School of Public Health, Los Angeles, California 90095-1772

Manuscript received November 9, 2005
Accepted for publication May 31, 2006

ABSTRACT

The distribution of microsatellite allele sizes in populations aids in understanding the genetic diversity of species and the evolutionary history of recent selective sweeps. We propose a heterogeneous Bayesian analysis of variance model for inferring loci involved in recent selective sweeps by analyzing the distribution of allele sizes at multiple loci in multiple populations. Our model is shown to be consistent with a multilocus test statistic, ln RV, proposed for identifying microsatellite loci involved in recent selective sweeps. Our methodology differs in that it accepts original allele size data rather than summary statistics and allows the incorporation of prior knowledge about allele frequencies using a hierarchical prior distribution consisting of log normal and gamma probability distributions. Interesting features of the model are its ability to simultaneously analyze allele size data for any number of populations and to cope with the presence of any number of selected loci. The utility of the method is illustrated by application to two sets of microsatellite allele size data for a group of West African Anopheles gambiae populations.
The results are consistent with the suppressed-recombination model of speciation, and additional candidate loci on chromosomes 2 (079 and 175) and 3 (088) are discovered that escaped former analysis.

UNDERSTANDING which regions of the genome have been acted on by selection facilitates our understanding of the genetic basis of species-specific differences and allows us to identify genomic regions of functional and medical importance. Over the last few decades, various approaches for identifying genes as targets of selection have been proposed. Some of these approaches require prior knowledge of the location and function of candidate genes, while other methods, such as QTL mapping, require prior knowledge of the phenotypic trait of adaptive relevance and its pattern of heredity (Lange 1997).

Through the availability of completely sequenced genomes and the advent of genomewide scanning, it has become unnecessary to have prior knowledge of a genomic region to infer whether or not it has been the target of selection (Luikart 2003). A number of tests of neutrality have been proposed that are based purely on allelic distributions and levels of variability (Nielsen 2001). These are based on variability at a single locus (Ewens 1972; Tajima 1989), allelic variability at multiple loci (Lewontin and Krakauer 1973; Hudson et al. 1987; Schlötterer 2001), and comparisons of variability or divergence between different classes of mutations within a locus (McDonald and Kreitman 1991; Goldman and Yang 1994).

Tests of neutrality based on a single locus, such as Tajima’s D (Tajima 1989), run into difficulties because it is difficult to distinguish between a reduction of variance in allele size due to selection and a reduction due to a population bottleneck (Simonsen et al. 1995). Such tests run the risk of becoming tests of the equilibrium neutral population model rather than tests of selective neutrality. Tests of neutrality based on multiple loci, such as the HKA test (Hudson et al. 1987) and the ln RV test (Schlötterer 2001), avoid these concerns. This is because, while neutral loci are similarly affected by demography and evolutionary history, the distribution of alleles in selected loci is affected differently from neutral loci and hence displays outlier patterns.

Hunting for selected loci can be done using a variety of natural genetic markers. Two common families of markers used for detecting selective sweeps are microsatellites and SNPs. Most research to date has been conducted using microsatellites, which, while less prolific than SNPs, have the benefit of being multiallelic markers and hence are highly informative (Schlötterer and Wiehe 1999). Microsatellites are tandem repeats of short DNA segments that are typically between 1 and 5 bp in length, and their alleles are defined by the number of DNA segment repeats that are present at a particular locus.

1 Corresponding author: Department of Biomathematics, UCLA School of Medicine, Box 951766, Los Angeles, CA 90095-1766. E-mail: [email protected]
Genetics 173: 2357–2370 (August 2006)

The number of tandem repeats in a microsatellite allele at a specific locus is highly variable due to a number of factors, but primarily due to slippage during DNA replication (Slatkin 1995). Slippage rates vary from locus to locus, and hence locus-specific mutation rates determine the characteristic variance in allele size at a given microsatellite locus in a given population (Schlötterer et al. 1997).

Another process affecting the number of tandem repeats at a given locus is the hitchhiking of a microsatellite allele to a selected gene (Maynard Smith and Haigh 1974). Even though microsatellites are unlikely to be the target of selection themselves, a microsatellite locus closely linked to a beneficial mutation will be selected for along with the beneficial mutation, decreasing the variance in allele size at the microsatellite locus adjacent to the site of the selected gene (Wiehe 1998).
Thus looking for loci in populations with less variance in allele size than expected can be used as a method for identifying chromosomal regions that have been the target of selection. If all loci in a given population show less allele size variance than expected, this implies that a population bottleneck could have occurred.

One method that has recently been proposed for identifying chromosomal regions that have been acted on by selection is the ln RV statistic (Schlötterer 2001). The ln RV statistic is equal to the natural logarithm of the ratio of observed variances in repeat number at an individual microsatellite locus in two populations. Denoting the locus by $j$ and the populations by $i_1$ and $i_2$, the ln RV statistic may be represented mathematically as

$$\ln RV_{i_1 i_2 j} = \log\left(\frac{s^2_{i_1 j}}{s^2_{i_2 j}}\right). \qquad (1)$$

Assuming the stepwise mutation model (Ohta and Kimura 1973), neutrality, and mutation-drift equilibrium, then from standard
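
As a minimal sketch of the ln RV statistic in Equation 1, the following Python computes it for a single locus from raw repeat counts in two populations. The function name `ln_rv`, the choice of the sample variance (n - 1 denominator) as the "observed variance," and the repeat counts themselves are illustrative assumptions and are not taken from the paper.

```python
import math
from statistics import variance


def ln_rv(repeats_pop1, repeats_pop2):
    """Natural log of the ratio of repeat-number variances at one locus.

    Each argument is a list of repeat counts, one per sampled allele.
    """
    # Sample variance (n - 1 denominator); this choice is an assumption.
    v1 = variance(repeats_pop1)
    v2 = variance(repeats_pop2)
    if v1 <= 0 or v2 <= 0:
        raise ValueError("ln RV is undefined when either variance is zero")
    return math.log(v1 / v2)


# Hypothetical repeat counts at one locus in two populations.
pop1 = [10, 11, 12, 13, 14]  # sample variance 2.5
pop2 = [12, 12, 12, 12, 13]  # sample variance 0.2
value = ln_rv(pop1, pop2)    # log(2.5 / 0.2) = log(12.5)
```

Swapping the two populations flips the sign of the result, so a strongly positive or strongly negative value at one locus relative to the other loci points to reduced variance in one of the two populations.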

