Molecular Evolution
Bret Larget
Departments of Botany and of Statistics
University of Wisconsin–Madison
September 15, 2011

Outline: Features; Models; Probabilistic framework; Continuous-time Markov Chains

Features of Molecular Evolution
1. Possible multiple changes on edges
2. Transition/transversion bias
3. Non-uniform base composition
4. Rate variation across sites
5. Dependence among sites
6. Codon position
7. Protein structure

A Famous Quote About Models
Essentially, all models are wrong, but some are useful.
George Box

Probability Models
- A probabilistic framework provides a platform for formal statistical inference.
- Examining goodness of fit can lead to model refinement and a better understanding of the actual biological process.
- Model refinement is a continuing area of research.
- Most common models of molecular evolution treat sites as independent.
- These common models just need to describe the substitutions among four bases at a single site over time.

The Markov Property
- Use the notation $X(t)$ to represent the base at time $t$.
- Formal statement:
  \[ P\{X(s+t) = j \mid X(s) = i,\ X(u) = x(u) \text{ for } u < s\} = P\{X(s+t) = j \mid X(s) = i\} \]
- Informal understanding: given the present, the past is independent of the future.
- If the expression does not depend on the time $s$, the Markov process is called homogeneous.

Rate Matrix
- Positive off-diagonal rates of transition
- Negative total on the diagonal
- Row sums are zero
Example:
\[
Q = \{q_{ij}\} =
\begin{pmatrix}
-1.1 & 0.3 & 0.6 & 0.2 \\
0.2 & -1.1 & 0.3 & 0.6 \\
0.4 & 0.3 & -0.9 & 0.2 \\
0.2 & 0.9 & 0.3 & -1.4
\end{pmatrix}
\]

Alarm Clock Description
- If the current state is $i$, the time to the next event is exponentially distributed with rate $-q_{ii}$, defined to be $q_i$.
- Given a transition occurs from
  state $i$, the probability that the transition is to state $j$ is proportional to $q_{ij}$, namely $q_{ij} / \sum_{k \neq i} q_{ik}$.

Transition Probabilities
- For a continuous-time Markov chain, the transition matrix whose $ij$ element is the probability of being in state $j$ at time $t$, given that the process begins in state $i$ at time 0, is $P(t) = e^{Qt}$.
- A probability transition matrix has non-negative values and each row sums to one.
- Each row contains the probabilities from a probability distribution on the possible states of the Markov process.

Examples
\[
P(0.1) =
\begin{pmatrix}
0.897 & 0.029 & 0.055 & 0.019 \\
0.019 & 0.899 & 0.029 & 0.053 \\
0.037 & 0.029 & 0.916 & 0.019 \\
0.019 & 0.080 & 0.029 & 0.872
\end{pmatrix}
\qquad
P(0.5) =
\begin{pmatrix}
0.605 & 0.118 & 0.199 & 0.079 \\
0.079 & 0.629 & 0.118 & 0.174 \\
0.132 & 0.118 & 0.671 & 0.079 \\
0.079 & 0.261 & 0.118 & 0.542
\end{pmatrix}
\]
\[
P(1) =
\begin{pmatrix}
0.407 & 0.190 & 0.276 & 0.126 \\
0.126 & 0.464 & 0.190 & 0.219 \\
0.184 & 0.190 & 0.500 & 0.126 \\
0.126 & 0.329 & 0.190 & 0.355
\end{pmatrix}
\qquad
P(10) =
\begin{pmatrix}
0.200 & 0.300 & 0.300 & 0.200 \\
0.200 & 0.300 & 0.300 & 0.200 \\
0.200 & 0.300 & 0.300 & 0.200 \\
0.200 & 0.300 & 0.300 & 0.200
\end{pmatrix}
\]

The Stationary Distribution
- Well-behaved continuous-time Markov chains have a stationary distribution, often designated $\pi$ (not the constant close to 3.14 related to circles).
- When the time $t$ is large enough, the probability $P_{ij}(t)$ will be close to $\pi_j$ for each $i$.
  (See $P(10)$ from earlier.)
- The stationary distribution can be thought of as a long-run average: over a long time, the proportion of time the process spends in state $i$ converges to $\pi_i$.

Parameterization
- The matrix $Q = \{q_{ij}\}$ is typically parameterized as $q_{ij} = r_{ij}\pi_j/\mu$ for $i \neq j$, which guarantees that $\pi$ will be the stationary distribution when $r_{ij} = r_{ji}$.

Scaling
- The expected number of substitutions per unit time is the average rate of substitution, which is a weighted average of the rates $q_i$ for each state, weighted by the stationary distribution:
  \[ \mu = \sum_i \pi_i q_i \]
- If the matrix $Q$ is reparameterized so that all elements are divided by $\mu$, then time is measured in expected substitutions: one unit of time corresponds to one expected substitution.

Time-reversibility
- The matrix $Q$ is the matrix for a time-reversible Markov chain when $\pi_i q_{ij} = \pi_j q_{ji}$ for all $i$ and $j$. That is, the overall rate of substitutions from $i$ to $j$ equals the overall rate of substitutions from $j$ to $i$ for every pair of states $i$ and $j$.
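
The quantities defined on these slides can be checked numerically for the example rate matrix. The sketch below is one possible way to do so, not the method used to produce the slides: it approximates $P(t) = e^{Qt}$ with a truncated Taylor series plus repeated squaring, reads an approximate $\pi$ off a row of $P(t)$ for large $t$, and then evaluates $\mu$ and the time-reversibility condition. The helper names (`matmul`, `transition_matrix`, `jump_probabilities`), the choice $t = 100$, and the tolerance are assumptions made for illustration.

```python
# Rate matrix copied from the "Rate Matrix" example slide.
Q = [
    [-1.1, 0.3, 0.6, 0.2],
    [0.2, -1.1, 0.3, 0.6],
    [0.4, 0.3, -0.9, 0.2],
    [0.2, 0.9, 0.3, -1.4],
]
N = len(Q)


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t): Taylor series for exp(Q t / 2**squarings),
    then square the result `squarings` times."""
    step = t / 2 ** squarings
    a = [[q[i][j] * step for j in range(N)] for i in range(N)]
    total = [[float(i == j) for j in range(N)] for i in range(N)]  # identity
    term = [row[:] for row in total]
    for k in range(1, terms + 1):
        # term now holds (Q * step)^k / k!
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(N)] for i in range(N)]
    for _ in range(squarings):
        total = matmul(total, total)
    return total


def jump_probabilities(q, i):
    """Alarm-clock view: probability the next state is j, given a jump from i."""
    leaving_rate = -q[i][i]  # q_i, equal to the sum of off-diagonal rates in row i
    return [0.0 if j == i else q[i][j] / leaving_rate for j in range(N)]


# Every row of P(t) approaches the stationary distribution for large t;
# t = 100 is an arbitrary "large enough" choice for this Q (an assumption).
pi = transition_matrix(Q, 100.0)[0]

# Average substitution rate: mu = sum_i pi_i * q_i with q_i = -q_ii.
mu = -sum(pi[i] * Q[i][i] for i in range(N))

# Time-reversibility: pi_i q_ij == pi_j q_ji for every pair of states.
reversible = all(abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) < 1e-9
                 for i in range(N) for j in range(N))

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 10.0):
        print(f"P({t}):")
        for row in transition_matrix(Q, t):
            print("  " + " ".join(f"{x:.3f}" for x in row))
    print("pi ~", [round(x, 3) for x in pi])
    print("mu ~", round(mu, 3))
    print("jump probabilities from state 0:", jump_probabilities(Q, 0))
    print("time-reversible:", reversible)
```

The pure-Python matrix exponential keeps the sketch dependency-free. With SciPy available, `scipy.linalg.expm` would be the more usual choice for computing $e^{Qt}$.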
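
The Parameterization and Time-reversibility slides state that symmetric $r_{ij}$ makes $\pi$ the stationary distribution, without showing why. A short derivation, using only the definitions on those slides and the zero row sums of $Q$, might run as follows:

```latex
\begin{align*}
\pi_i q_{ij} &= \pi_i \frac{r_{ij}\pi_j}{\mu}
              = \pi_j \frac{r_{ji}\pi_i}{\mu}
              = \pi_j q_{ji}
  \qquad (i \neq j,\ r_{ij} = r_{ji}), \\
\sum_i \pi_i q_{ij} &= \pi_j q_{jj} + \sum_{i \neq j} \pi_i q_{ij}
                     = \pi_j q_{jj} + \pi_j \sum_{i \neq j} q_{ji}
                     = \pi_j \sum_i q_{ji}
                     = 0 .
\end{align*}
```

The first line is the time-reversibility condition. The second line says $\pi Q = 0$, which implies $\pi P(t) = \pi e^{Qt} = \pi$ for every $t$, so $\pi$ is stationary.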