U of U PHIL 5192 - Models of Sequence Evolution I
Pages 6


Models of Sequence Evolution I

Stochastic error and models of sequence evolution

- Random error:
    - Data too sparse to recover the true tree.
    - Sequences too short to describe evolutionary history.
- Systematic error:
    - Erroneous model assumptions.
    - Data biased toward the wrong tree.
    - Unrealistic model fails to correct the bias.

Systematic error

- Sources of bias: possible departures from simple model assumptions.
    - Multiple substitutions obscure some changes.
    - Unequal base (nucleotide) frequencies.
    - Transition to transversion bias.
    - Across-site rate heterogeneity.

12 possible substitution types

[Diagram: the four nucleotides A, G, C, T with arrows between them]

Each site in the sequence can change in one of 12 ways.

Count of all 12 observed changes summed across all 7844 sites (Springer 1999 sequence alignment of 11 mammals). The slide labels the two axes "from" and "to".

            T       G       C       A
    T       -     168     846     344
    G     206       -     274     781
    C    1286     276       -     481
    A     639     752     652       -

We simplify to 6 ways for nucleotides to differ between taxa

[Diagram: A, G, C, T with reversible arrows]

The case of 12 changes is nearly mathematically intractable, so we simplify to 6 reversible changes. This is a big model simplification, but it usually works fine.

So six changes, averaged across the diagonal:

            T       G       C       A
    T       -     187    1066   491.5
    G     187       -     275   766.5
    C    1066     275       -   566.5
    A   491.5   766.5   566.5       -

Jukes Cantor 69: 1 substitution type

[Diagram: A, G, C, T with the same rate \alpha on all six arrows]

Corrects for multiple substitutions erased by time.

Jukes Cantor: one parameter

$$
\begin{array}{c|cccc}
  & T & G & C & A \\ \hline
T & 1-3\alpha & \alpha & \alpha & \alpha \\
G & \alpha & 1-3\alpha & \alpha & \alpha \\
C & \alpha & \alpha & 1-3\alpha & \alpha \\
A & \alpha & \alpha & \alpha & 1-3\alpha
\end{array}
$$

The single parameter \alpha describes the probability of change.

Jukes Cantor also assumes equal probabilities of change:

            T       G       C       A
    T    0.58    0.14    0.14    0.14
    G    0.14    0.58    0.14    0.14
    C    0.14    0.14    0.58    0.14
    A    0.14    0.14    0.14    0.58

Realistic assumption? In this case there is a serious departure from the Jukes Cantor model:

            T       G       C       A
    T       -    0.05    0.27    0.13
    G    0.05       -    0.07    0.20
    C    0.27    0.07       -    0.14
    A    0.13    0.20    0.14       -

Kimura80 (K2P) probability matrix

$$
\begin{array}{c|cccc}
  & T & G & C & A \\ \hline
T & 1-\alpha-2\beta & \beta & \alpha & \beta \\
G & \beta & 1-\alpha-2\beta & \beta & \alpha \\
C & \alpha & \beta & 1-\alpha-2\beta & \beta \\
A & \beta & \alpha & \beta & 1-\alpha-2\beta
\end{array}
$$

Kimura80: 2 substitution types

[Diagram: purines (A, G) and pyrimidines (C, T); transitions, within a class, occur at rate \alpha; transversions, between classes, occur at rate \beta]

HKY85 substitution matrix

$$
\begin{array}{c|cccc}
  & T & G & C & A \\ \hline
T & -(\alpha\pi_C+\beta\pi_R) & \beta\pi_G & \alpha\pi_C & \beta\pi_A \\
G & \beta\pi_T & -(\alpha\pi_A+\beta\pi_Y) & \beta\pi_C & \alpha\pi_A \\
C & \alpha\pi_T & \beta\pi_G & -(\alpha\pi_T+\beta\pi_R) & \beta\pi_A \\
A & \beta\pi_T & \alpha\pi_G & \beta\pi_C & -(\alpha\pi_G+\beta\pi_Y)
\end{array}
$$

Here \pi_R = \pi_A + \pi_G (purines) and \pi_Y = \pi_C + \pi_T (pyrimidines). Each diagonal entry is minus the sum of the other entries in its row.

The HKY85 model adds unequal base (nucleotide) frequencies: \pi_A \neq \pi_C \neq \pi_G \neq \pi_T.

GTR substitution matrix adds 4 more substitution types

$$
\begin{array}{c|cccc}
  & T & G & C & A \\ \hline
T & -(\eta\pi_G+\varepsilon\pi_C+\gamma\pi_A) & \eta\pi_G & \varepsilon\pi_C & \gamma\pi_A \\
G & \eta\pi_T & -(\eta\pi_T+\delta\pi_C+\beta\pi_A) & \delta\pi_C & \beta\pi_A \\
C & \varepsilon\pi_T & \delta\pi_G & -(\varepsilon\pi_T+\delta\pi_G+\alpha\pi_A) & \alpha\pi_A \\
A & \gamma\pi_T & \beta\pi_G & \alpha\pi_C & -(\gamma\pi_T+\beta\pi_G+\alpha\pi_C)
\end{array}
$$

In this matrix each of the six nucleotide pairs has its own rate: \alpha (A<->C), \beta (A<->G), \gamma (A<->T), \delta (C<->G), \varepsilon (C<->T), \eta (G<->T). The letters \gamma, \delta, \varepsilon and \eta stand in for the four additional rate symbols.

GTR: 6 substitution types and unequal base frequencies

[Diagram: purines (A, G) and pyrimidines (C, T) with a separate rate on each of the six arrows; transitions and transversions labeled]

Models nested within the general time reversible (GTR) model

- Equal base frequencies
    - JC69: 1 substitution type (ST)
    - K80: 2 STs (transitions and transversions)
- Add unequal base frequencies
    - F81: 1 ST
    - HKY/F84: 2 STs
    - GTR: 6 STs (A<->G, A<->T, . . .)

Increasingly complex models

[Figure: panels labeled Jukes Cantor (JC), Kimura (K80 or K2P), Hasegawa Kishino Yang (HKY), and (observed)]

So different trees for different models

- How do we know if more complex models are better?
    - What are our criteria for considering one tree or one method better than another?
- Parsimony at least gives us objective criteria for distinguishing among trees.
    - We settle on the tree(s) that most parsimoniously describe evolutionary history.
    - But we can study a whole range of alternatives.
- Distance methods provide no alternative trees.

Maximum likelihood methods

- Like parsimony, allows evaluation of alternative close trees.
- Like distance, allows modeling of different processes describing evolutionary history.
- But has many other advantages over the parsimony and distance methods.
- Includes Bayesian methods, which are based on likelihood models.

Recall the Felsenstein zone: long branch attraction

[Figure: true tree and inferred tree for taxa A, B, C, D. Four-taxon tree with branches of lengths p (long) and q (short); the Felsenstein zone is marked]

The space of four tree shapes

[Figure: grid of four-taxon tree shapes; axes are "Branch length d, e" and "Branch length a, b & c"]

Long branch attraction

[Figure: the same grid of tree shapes with the Felsenstein zone marked]

Compare methods of tree inference

- Efficient methods: a shorter sequence is needed to converge to the true tree.
- Consistent methods: the tree converges to the right tree as the sequence length increases.

UPGMA (Jukes-Cantor model)

- NJ and UPGMA: a longer sequence is needed to converge.
- For many tree types the probability of converging to the wrong tree increases as the sequence length increases.

NJ

- NJ is no more efficient than UPGMA.
- Neighbor joining is more consistent than UPGMA. The problem area of the tree space is smaller, so it is more likely to get the right tree.

Parsimony

- Somewhat more efficient than NJ or UPGMA.
- Similar in consistency to NJ.

UPGMA & NJ with Kimura model

Weighted parsimony

- Most efficient of all the methods presented here.
- Highly consistent except in the Felsenstein zone.

[Figure: panels labeled Weighted Parsimony; NJ, Kimura model; Maximum likelihood, Kimura (Corneli addition)]

We will see that maximum likelihood performs the best in almost every case.

- Compared to the others, ML methods are:
    - Highly consistent
    - Most efficient
    - And, importantly, most robust to deviations from model assumptions.
- This is true for all statistical methods, not just phylogenetics.

Next week

- Maximum likelihood methods.
    - More on models
    - Criteria for deciding among models.
    - Criteria for deciding among
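As a numerical companion to the substitution-count tables and the Jukes Cantor slides above, here is a minimal Python sketch. It averages the 12 directed counts into the 6 reversible ones, totals transitions and transversions, and evaluates the standard closed-form Jukes Cantor (JC69) expressions. The function names and the dictionary layout are illustrative assumptions, not part of the lecture. The JC69 formulas are the usual textbook closed forms, which the slides do not spell out.

```python
import math

# Observed counts of the 12 substitution types, copied from the count table
# (Springer 1999 alignment). Keys are (row, column) nucleotide pairs.
COUNTS = {
    ("T", "G"): 168, ("T", "C"): 846, ("T", "A"): 344,
    ("G", "T"): 206, ("G", "C"): 274, ("G", "A"): 781,
    ("C", "T"): 1286, ("C", "G"): 276, ("C", "A"): 481,
    ("A", "T"): 639, ("A", "G"): 752, ("A", "C"): 652,
}

# Transitions stay within purines (A, G) or within pyrimidines (C, T).
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def symmetrize(counts):
    """Average the two directed counts of each pair (12 types -> 6)."""
    averaged = {}
    for (x, y), n in counts.items():
        pair = frozenset((x, y))
        averaged[pair] = averaged.get(pair, 0.0) + n / 2
    return averaged


def transition_transversion_totals(averaged):
    """Return (transition total, transversion total) of the averaged counts."""
    ts = sum(n for pair, n in averaged.items() if pair in TRANSITIONS)
    tv = sum(n for pair, n in averaged.items() if pair not in TRANSITIONS)
    return ts, tv


def jc69_distance(p):
    """JC69-corrected distance from the observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 distance is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc69_prob_matrix(d):
    """4x4 JC69 change probabilities for d expected substitutions per site."""
    decay = math.exp(-4 * d / 3)
    same = 0.25 + 0.75 * decay   # probability a site shows the same base
    diff = 0.25 - 0.25 * decay   # probability of each specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


if __name__ == "__main__":
    averaged = symmetrize(COUNTS)
    print(sorted((tuple(sorted(k)), v) for k, v in averaged.items()))
    print(transition_transversion_totals(averaged))
    print(jc69_prob_matrix(jc69_distance(0.42))[0])
```

Under these formulas, an observed proportion of 0.42 differing sites gives 0.58 on the diagonal and 0.14 off it, the values in the equal-probability Jukes Cantor table. In the averaged counts the transition total (C<->T plus A<->G) exceeds the transversion total, even though there are twice as many transversion types. That is the transition to transversion bias listed among the sources of systematic error.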

