CMU CS 10810 - Lecture

Computational Genomics
Population Genetics: Quantitative Trait Locus (QTL) Mapping
Eric Xing
Lecture 4, January 25, 2007
Reading: DTW book, Chap. 13

Phenotypical Traits
• Body measures
• Disease susceptibility and drug response
• Gene expression (microarray)

[Figure: Backcross experiment]
[Figure: F2 intercross experiment]

Trait distributions: a classical view
[Figure] Note the equivalent of dominance in our trait distributions.
[Figure] Another representation of a trait distribution

A second example
[Figure] Note the approximate additivity in our trait distributions here.

QTL mapping
• Data
  • Phenotypes: $y_i$ = trait value for mouse $i$
  • Genotypes: $x_{ij}$ = 1/0 (i.e., A/H) of mouse $i$ at marker $j$ (backcross); three states are needed for an intercross
  • Genetic map: locations of markers
• Goals
  • Identify the (or at least one) genomic region, called a quantitative trait locus (QTL), that contributes to variation in the trait
  • Form confidence intervals for the QTL location
  • Estimate QTL effects

[Figure: QTL mapping (BC)]
[Figure: QTL mapping (F2)]

Models: Recombination
• We assume no chromatid or crossover interference.
  ⇒ Points of exchange (crossovers) along chromosomes are distributed as a Poisson process with rate 1 per unit of genetic distance (Morgan).
  ⇒ The marker genotypes $\{x_{ij}\}$ form a Markov chain along the chromosome for a backcross; what do they form in an F2 intercross?

Models: Genotype → Phenotype
• Let $y$ = phenotype and $g$ = whole-genome genotype.
• Imagine a small number of QTL with genotypes $g_1, \ldots, g_p$ ($2^p$ or $3^p$ distinct genotypes for BC and IC, respectively; why?).
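The recombination model stated above (Poisson crossovers without interference, hence Markov-chain marker genotypes in a backcross) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the marker positions, the sample size, and the function names `haldane_recomb_fraction` and `simulate_backcross_chromosome` are hypothetical and not part of the lecture. It uses the fact that, under a Poisson crossover process, two loci recombine when an odd number of crossovers falls between them, which gives Haldane's map function $r = (1 - e^{-2d})/2$ for a distance $d$ in Morgans.

```python
import math
import random

def haldane_recomb_fraction(d_cM):
    """Recombination fraction between two loci d_cM centimorgans apart.

    Assumes crossovers form a Poisson process with rate 1 per Morgan, so a
    recombinant arises when the number of crossovers between the loci is odd.
    """
    d = d_cM / 100.0  # convert cM to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def simulate_backcross_chromosome(marker_pos_cM, rng):
    """Simulate 0/1 genotypes at ordered markers for one backcross individual.

    Each genotype depends only on the previous marker (Markov chain): it
    flips with probability equal to the recombination fraction of the gap.
    """
    genotypes = [rng.randint(0, 1)]  # first marker: 0 or 1 with probability 1/2
    for left, right in zip(marker_pos_cM, marker_pos_cM[1:]):
        r = haldane_recomb_fraction(right - left)
        prev = genotypes[-1]
        genotypes.append(1 - prev if rng.random() < r else prev)
    return genotypes

rng = random.Random(0)
markers = [0.0, 10.0, 20.0, 40.0, 80.0]  # hypothetical marker map, in cM
population = [simulate_backcross_chromosome(markers, rng) for _ in range(2000)]

# Compare the observed recombinant fraction between the first two markers
# with the value predicted by the map function.
observed = sum(g[0] != g[1] for g in population) / len(population)
expected = haldane_recomb_fraction(10.0)
print(f"observed {observed:.3f} vs expected {expected:.3f}")
```

The simulated genotypes can serve as input when experimenting with the mapping methods discussed in the rest of the lecture.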
We assume
$E(y \mid g) = \mu(g_1, \ldots, g_p)$, $\mathrm{var}(y \mid g) = \sigma^2(g_1, \ldots, g_p)$.

Models: Genotype → Phenotype
• Homoscedasticity (constant variance): $\sigma^2(g_1, \ldots, g_p) = \sigma^2$ (constant)
• Normality of residual variation: $y \mid g \sim N(\mu_g, \sigma^2)$
• Additivity: $\mu(g_1, \ldots, g_p) = \mu + \sum_{j=1}^{p} \Delta_j g_j$ ($g_j$ = 0/1 for BC)
• Epistasis: any deviation from additivity, for example pairwise interactions: $\mu(g_1, \ldots, g_p) = \mu + \sum_{j=1}^{p} \Delta_j g_j + \sum_{i<j} \omega_{ij} g_i g_j$

Additivity, or non-additivity (BC)
[Figure: additive QTLs with effects $\Delta_1$ and $\Delta_2$, and epistatic QTLs] With additive QTLs, the effect of QTL 1 is the same irrespective of the genotype of QTL 2, and vice versa.

Additivity or non-additivity: F2
[Figure]

The simplest method: ANOVA
• Split subjects into groups according to genotype at a single marker.
• Do a t-test/ANOVA.
• Repeat for each marker.
• LOD score = $\log_{10}$ likelihood ratio, comparing the single-QTL model to the “no QTL anywhere” model.
A t-test/ANOVA tells us whether there is sufficient evidence to say that measurements from one condition (i.e., genotype) differ significantly from those of another.

ANOVA at marker loci
Advantages
• Simple
• Easily incorporates covariates (sex, environment, treatment, ...)
• Easily extended to more complex models
Disadvantages
• Must exclude individuals with missing genotype data
• Imperfect information about QTL location
• Suffers in low-density scans
• Only considers one QTL at a time

Interval mapping (IM)
• Consider any one position in the genome as the location for a putative QTL.
• For a particular mouse, let $z$ = 1/0 if the (unobserved) genotype at the QTL is AB/AA.
• Calculate $\Pr(z = 1 \mid \text{marker data})$ using the markers of an interval bracketing the QTL.
  • Assume no meiotic interference.
  • Need only consider flanking typed markers.
  • May allow for the presence of genotyping errors.
• Given the genotype at the QTL, the phenotype is distributed as $y_i \mid z_i \sim \mathrm{Normal}(\mu_{z_i}, \sigma^2)$.
• Given marker data, the phenotype follows a mixture of normal distributions.

IM: the mixture model
[Figure; panel labels: AA, AB, AB]

IM: estimation and LOD scores
• Use a version of the EM algorithm (an iterative algorithm) to obtain estimates of $\mu_{AA}$, $\mu_{AB}$, and $\sigma$.
• Calculate the LOD score:
  $\mathrm{LOD} = \log_{10} \left\{ \frac{P(\text{data} \mid \hat{\mu}_{AA}, \hat{\mu}_{AB})}{P(\text{data} \mid \text{no QTL})} \right\}$
• Repeat for all other genomic positions (in practice, at 0.5 cM steps along the genome).

[Figure: LOD score curves]

LOD thresholds
• To account for the genome-wide search, compare the observed LOD scores to the distribution of the maximum LOD score, genome-wide, that would be obtained if there were no QTL anywhere.
• LOD threshold = 95th percentile of the distribution of the genome-wide maximum LOD when there are no QTL anywhere.
• Derivations:
  • Analytical calculations (Lander & Botstein, 1989)
  • Simulations
  • Permutation tests (Churchill & Doerge, 1994)

[Figure: Permutation distribution for trait4]

Interval mapping
Advantages
• Takes proper account of missing data
• Can allow for the presence of genotyping errors
• Pretty pictures
• Higher power in low-density scans
• Improved estimate of QTL location
Disadvantages
• Greater computational effort
• Requires specialized software
• More difficult to include covariates?
• Only considers one QTL at a time

Multiple QTL methods
Why consider multiple QTL at once?
• To separate linked QTL. If two QTL are close together on the same chromosome, our one-at-a-time strategy may have problems finding either (e.g., if they work in opposite directions, or interact). Our LOD scores won’t make sense either.
• To permit the investigation of interactions. It may be that interactions greatly strengthen our ability to find QTL, though this is not clear.
• To reduce residual variation. If QTL exist at loci other than the one we are currently considering, they should be in our model. If they are not, they will be in the error, and hence reduce our ability to detect the current one.
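As a concrete illustration of the interval-mapping steps described earlier (conditional QTL genotype probabilities from flanking markers, EM for the normal mixture, and the LOD score), here is a minimal sketch for a single putative QTL position in a backcross. It is not the lecture's implementation: the function names, the simulated map (markers at 0 and 20 cM, QTL at 10 cM), the effect size, and the use of Haldane's map function are all assumptions made for illustration, and genotyping errors and missing markers are ignored.

```python
import math
import random

def recomb_fraction(d_cM):
    """Haldane map function: recombination fraction for a distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cM / 100.0))

def qtl_prob(m_left, m_right, r_left, r_right):
    """Pr(z = 1 | flanking marker genotypes), assuming no interference."""
    def step(a, b, r):  # Pr(genotype b at next locus | genotype a)
        return 1.0 - r if a == b else r
    p1 = step(m_left, 1, r_left) * step(1, m_right, r_right)
    p0 = step(m_left, 0, r_left) * step(0, m_right, r_right)
    return p1 / (p0 + p1)

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def interval_mapping_lod(y, prior, n_iter=200):
    """EM for the two-component normal mixture; returns (LOD, mu_AA, mu_AB, sigma).

    prior[i] = Pr(z_i = 1 | flanking markers) acts as the mixing weight of mouse i.
    """
    n = len(y)
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    loglik_null = sum(math.log(normal_pdf(v, mean, sd)) for v in y)  # "no QTL" model
    mu0, mu1, sigma = mean - 0.5 * sd, mean + 0.5 * sd, sd  # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each mouse is AB at the putative QTL
        w = []
        for v, p in zip(y, prior):
            a = p * normal_pdf(v, mu1, sigma)
            b = (1.0 - p) * normal_pdf(v, mu0, sigma)
            w.append(a / (a + b))
        # M-step: weighted means and pooled standard deviation
        s1 = sum(w)
        mu1 = sum(wi * v for wi, v in zip(w, y)) / s1
        mu0 = sum((1.0 - wi) * v for wi, v in zip(w, y)) / (n - s1)
        sigma = math.sqrt(sum(wi * (v - mu1) ** 2 + (1.0 - wi) * (v - mu0) ** 2
                              for wi, v in zip(w, y)) / n)
    loglik_qtl = sum(math.log(p * normal_pdf(v, mu1, sigma)
                              + (1.0 - p) * normal_pdf(v, mu0, sigma))
                     for v, p in zip(y, prior))
    return (loglik_qtl - loglik_null) / math.log(10.0), mu0, mu1, sigma

# Simulated backcross (hypothetical values): markers at 0 and 20 cM, QTL at 10 cM.
rng = random.Random(1)
r = recomb_fraction(10.0)
y, prior = [], []
for _ in range(400):
    m_left = rng.randint(0, 1)
    z = 1 - m_left if rng.random() < r else m_left    # unobserved QTL genotype
    m_right = 1 - z if rng.random() < r else z
    y.append(10.0 + 2.0 * z + rng.gauss(0.0, 1.0))    # AB mice shifted by 2 units
    prior.append(qtl_prob(m_left, m_right, r, r))

lod, mu_aa, mu_ab, sigma = interval_mapping_lod(y, prior)
print(f"LOD = {lod:.1f}, mu_AA = {mu_aa:.2f}, mu_AB = {mu_ab:.2f}, sigma = {sigma:.2f}")
```

A genome scan would repeat `interval_mapping_lod` at each grid position, recomputing `prior` from the markers flanking that position. A permutation threshold could then be approximated by shuffling `y` relative to the genotypes and recording the maximum LOD of each shuffled scan.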
The problem
• $n$ backcross subjects; $M$ markers in all, with at most a handful expected to be near QTL
• $x_{ij}$ = genotype (0/1) of mouse $i$ at marker $j$
• $y_i$ = phenotype (trait value) of mouse $i$
• Model: $y_i = \mu + \sum_{j=1}^{M} \Delta_j x_{ij} + \varepsilon_i$. Which $\Delta_j \neq 0$?
⇒ Variable selection in linear models (regression)

Finding QTL as model selection
Select class of models
• Additive models
• Additive plus pairwise interactions
• Regression trees
Compare models ($\gamma$)
• $\mathrm{BIC}_\delta(\gamma) = \log \mathrm{RSS}(\gamma) + |\gamma| \, (\delta \log n / n)$, where $|\gamma|$ is the number of terms in model $\gamma$
• Sequential permutation tests
Search model space
• Forward selection (FS)
• Backward elimination (BE)
• FS followed by BE
• MCMC
Assess performance
• Maximize the number of QTL found
• Control the false positive rate

Acknowledgements
Melanie Bahlo, WEHI
Hongyu Zhao, Yale
Karl Broman, Johns Hopkins
Nusrat Rabbee, UCB

References
www.netspace.org/MendelWeb
H. L. K. Whitehouse: Towards an Understanding of the Mechanism of Heredity, 3rd ed., Arnold, 1973.
Kenneth Lange: Mathematical and Statistical Methods for Genetic Analysis, Springer, 1997.
Elizabeth A. Thompson: Statistical Inference from Genetic Data on Pedigrees, CBMS, IMS, 2000.
Jurg Ott: Analysis of Human Genetic Linkage, 3rd ed., Johns Hopkins University Press, 1999.
J. D. Terwilliger & J. Ott: Handbook of Human Genetic Linkage, Johns Hopkins University Press.
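Returning to the model-selection formulation of QTL mapping (additive linear model, forward selection, and the $\mathrm{BIC}_\delta$ criterion), the following is a rough sketch of how such a search might look. Everything specific here is an assumption for illustration: the function name `forward_select_bic`, the value `delta=2.0`, the simulated data, and the simplification that markers are independent (unlinked). The RSS after adding each marker is computed by residualizing against the markers already selected, which is equivalent to ordinary least squares with an intercept.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def forward_select_bic(x_cols, y, delta=2.0, max_terms=10):
    """Greedy forward selection of markers minimizing
    BIC_delta = log RSS + (number of markers) * delta * log(n) / n.

    x_cols: list of marker columns, each a list of n genotypes (0/1).
    Returns (indices of selected markers in order of entry, final BIC_delta).
    """
    n = len(y)
    resid_y = center(y)                                   # intercept-only residuals
    resid_x = {j: center(col) for j, col in enumerate(x_cols)}
    rss = dot(resid_y, resid_y)
    penalty = delta * math.log(n) / n
    best_bic = math.log(rss)                              # model with no markers
    chosen = []
    while resid_x and len(chosen) < max_terms:
        # Reduction in RSS obtained by adding each remaining marker
        gains = {}
        for j, x in resid_x.items():
            xx = dot(x, x)
            if xx > 1e-12:                                # skip (near-)collinear markers
                gains[j] = dot(x, resid_y) ** 2 / xx
        if not gains:
            break
        j_best = max(gains, key=gains.get)
        new_rss = max(rss - gains[j_best], 1e-300)        # guard against log(0)
        new_bic = math.log(new_rss) + (len(chosen) + 1) * penalty
        if new_bic >= best_bic:
            break                                         # no marker improves the criterion
        # Accept the marker and residualize y and the remaining markers against it
        x_best = resid_x.pop(j_best)
        xx = dot(x_best, x_best)
        b = dot(x_best, resid_y) / xx
        resid_y = [a - b * c for a, c in zip(resid_y, x_best)]
        for j in list(resid_x):
            c = dot(resid_x[j], x_best) / xx
            resid_x[j] = [a - c * d for a, d in zip(resid_x[j], x_best)]
        rss, best_bic = new_rss, new_bic
        chosen.append(j_best)
    return chosen, best_bic

# Simulated example (hypothetical): 200 mice, 20 unlinked markers,
# true effects at markers 3 and 11 only.
rng = random.Random(2)
n, M = 200, 20
cols = [[rng.randint(0, 1) for _ in range(n)] for _ in range(M)]
y = [5.0 + 1.5 * cols[3][i] - 1.5 * cols[11][i] + rng.gauss(0.0, 1.0) for i in range(n)]
chosen, bic = forward_select_bic(cols, y, delta=2.0)
print("selected markers:", chosen)
```

Larger values of `delta` penalize each added marker more heavily and so trade detection power for fewer false positives, which is the performance trade-off listed under "Assess performance". Backward elimination could be added by refitting after dropping each selected marker in turn.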

