DOC PREVIEW
Stanford CS 262 - Phylogeny Tree Reconstruction

This preview shows page 1-2-3-19-20-38-39-40 out of 40 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Phylogeny Tree ReconstructionInferring PhylogeniesModeling EvolutionSlide 4Slide 5Slide 6Phylogeny and sequence comparisonDistance between two sequencesA simple clustering method for building treeAlgorithm: Average LinkageExampleUltrametric Distances and Molecular ClockUltrametric Distances & Average LinkageWeakness of Average LinkageAdditive DistancesSlide 16Reconstructing Additive Distances Given TSlide 18Slide 19Slide 20Neighbor-JoiningAlgorithm: Neighbor-joiningParsimonyParsimony ScoringSlide 25Slide 26Traceback to find ancestral nucleotidesSlide 28Number of labeled unrooted tree topologiesSlide 30Slide 31Slide 32Slide 33Slide 34Search through tree topologies: Branch and BoundBootstrapping to get the best treesProbabilistic MethodsSlide 38Felsenstein’s Likelihood AlgorithmSlide 40Phylogeny Tree Reconstruction1432514235Inferring PhylogeniesTrees can be inferred by several criteria:Morphology of the organismsSequence comparisonExample:Orc: ACAGTGACGCCCCAAACGTElf: ACAGTGACGCTACAAACGTDwarf: CCTGTGACGTAACAAACGAHobbit: CCTGTGACGTAGCAAACGAHuman: CCTGTGACGTAGCAAACGAModeling EvolutionDuring infinitesimal time t, there is not enough time for two substitutions to happen on the same nucleotideSo we can estimate P(x | y, t), for x, y  {A, C, G, T}Then letP(A|A, t) …… P(A|T, t)S(t) = … …… …P(T|A, t) …… P(T|T, t)xxytModeling EvolutionReasonable assumption: multiplicative (implying a stationary Markov process)S(t+t’) = S(t)S(t’)That is, P(x | y, t+t’) = z P(x | z, t) P(z | y, t’)Jukes-Cantor: constant rate of evolution 1 - 3   For short time , S() = I+R =  1 - 3     1 - 3     1 - 3A CGTModeling EvolutionJukes-Cantor:For longer times,r(t) s(t) s(t) s(t)S(t) = s(t) r(t) s(t) s(t)s(t) s(t) r(t) s(t)s(t) s(t) s(t) r(t)Where we can derive:r(t) = ¼ (1 + 3 e-4t)s(t) = ¼ (1 – e-4t)S(t+) = S(t)S() = S(t)(I + R)Therefore,(S(t+) – S(t))/ = S(t) RAt the limit of   0,S’(t) = S(t) REquivalently,r’ = -3r + 3ss’ = -s + rThose diff. equations lead to:r(t) = ¼ (1 + 3 e-4t)s(t) = ¼ (1 – e-4t)Modeling EvolutionKimura:Transitions: A/G, C/TTransversions: A/T, A/C, G/T, C/GTransitions (rate ) are much more likely than transversions (rate )r(t) s(t) u(t) s(t)S(t) = s(t) r(t) s(t) u(t)u(t) s(t) r(t) s(t)s(t) u(t) s(t) r(t)Where s(t) = ¼ (1 – e-4t)u(t) = ¼ (1 + e-4t – e-2(+)t)r(t) = 1 – 2s(t) – u(t)Phylogeny and sequence comparisonBasic principles:•Degree of sequence difference is proportional to length of independent sequence evolution•Only use positions where alignment is pretty certain – avoid areas with (too many) gapsDistance between two sequencesGiven sequences xi, xj,Define dij = distance between the two sequencesOne possible definition:dij = fraction f of sites u where xi[u]  xj[u]Better model (Jukes-Cantor):f = ¾ (1 – e-4t) ¾ e-4t = ¾ - f  log (e-4t) = log (1 – 4/3 f)dij = t = - ¼ -1 log(1 – 4/3 f)A simple clustering method for building treeUPGMA (unweighted pair group method using arithmetic averages)Or the Average Linkage MethodGiven two disjoint clusters Ci, Cj of sequences, 1dij = ––––––––– {p Ci, q Cj}dpq|Ci|  |Cj|Claim that if Ck = Ci  Cj, then distance to another cluster Cl is:dil |Ci| + djl |Cj| dkl = –––––––––––––– |Ci| + |Cj|Proof Ci,Cl dpq + Cj,Cl dpq dkl = –––––––––––––––– (|Ci| + |Cj|) |Cl| |Ci|/(|Ci||Cl|) Ci,Cl dpq + |Cj|/(|Cj||Cl|) Cj,Cl dpq = –––––––––––––––––––––––––––––––––––– (|Ci| + |Cj|) |Ci| dil + |Cj| djl= ––––––––––––– (|Ci| + |Cj|)Algorithm: Average LinkageInitialization:Assign each xi into its own cluster CiDefine one leaf per sequence, height 0Iteration:Find two clusters Ci, Cj s.t. dij is minLet Ck = Ci  CjDefine node connecting Ci, Cj, & place it at height dij/2Delete Ci, CjTermination:When two clusters i, j remain, place root at height dij/21432514235Examplev w x y zv0 6 8 8 8w0 8 8 8x0 4 4y0 2z0y zxwv1234v w x yzv0 6 8 8w0 8 8x0 4yz0v w xyzv0 6 8w0 8xyz0vw xyzvw0 8xyz0Ultrametric Distances and Molecular ClockDefinition:A distance function d(.,.) is ultrametric if for any three distances dij  dik  dij, it is true that dij  dik = dijThe Molecular Clock:The evolutionary distance between species x and y is 2 the Earth time to reach the nearest common ancestorThat is, the molecular clock has constant rate in all species1 4 2 3 5yearsThe molecular clock results in ultrametric distancesUltrametric Distances & Average LinkageAverage Linkage is guaranteed to reconstruct correctly a binary tree with ultrametric distancesProof: Exercise (extra credit)1 4 2 35Weakness of Average LinkageMolecular clock: all species evolve at the same rate (Earth time)However, certain species (e.g., mouse, rat) evolve much fasterExample where UPGMA messes up:23411432Correct treeAL treeAdditive DistancesGiven a tree, a distance measure is additive if the distance between any pair of leaves is the sum of lengths of edges connecting themGiven a tree T & additive distances dij, can uniquely reconstruct edge lengths:•Find two neighboring leaves i, j, with common parent k•Place parent node k at distance dkm = ½ (dim + djm – dij) from any node m12345678910121113d1,4Additive DistancesFor any four leaves x, y, z, w, consider the three sums d(x, y) + d(z, w)d(x, z) + d(y, w)d(x, w) + d(y, z)One of them is smaller than the other two, which are equald(x, y) + d(z, w) < d(x, z) + d(y, w) = d(x, w) + d(y, z)xyzwReconstructing Additive Distances Given Txyzwv5473346v w x y zv0 10 17 16 16w0 15 14 14x0 9 15y0 14z0TIf we know T and D, but do not know the length of each leaf, we can reconstruct those lengthsDReconstructing Additive Distances Given Txyzwvv w x y zv0 10 17 16 16w0 15 14 14x0 9 15y0 14z0TDReconstructing Additive Distances Given Txyzwvv w x y zv0 10 17 16 16w0 15 14 14x0 9 15y0 14z0TDa x y za0 11 10 10x0 9 15y0 14z0aD1dax = ½ (dvx + dwx – dvw)day = ½ (dvy + dwy – dvw)daz = ½ (dvz + dwz – dvw)Reconstructing Additive Distances Given TxyzwvTa x y za0 11 10 10x0 9 15y0 14z0aD1a b za0 6 10b0 10z0D2bca ca0 3c0D3d(a, c) = 3d(b, c) =


View Full Document

Stanford CS 262 - Phylogeny Tree Reconstruction

Documents in this Course
Lecture 8

Lecture 8

38 pages

Lecture 7

Lecture 7

27 pages

Lecture 4

Lecture 4

12 pages

Lecture 1

Lecture 1

11 pages

Biology

Biology

54 pages

Lecture 7

Lecture 7

45 pages

Load more
Download Phylogeny Tree Reconstruction
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Phylogeny Tree Reconstruction and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Phylogeny Tree Reconstruction 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?