DOC PREVIEW
Stanford CS 262 - Phylogeny Tree Reconstruction

This preview shows page 1-2-3-25-26-27 out of 27 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 27 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Phylogeny Tree ReconstructionFinal ExamNumber of labeled unrooted tree topologiesSlide 4Slide 5Slide 6Slide 7Slide 8Search through tree topologies: Branch and BoundBootstrapping to get the best treesProbabilistic MethodsSlide 12Computing the Likelihood of a TreeFelsenstein’s Likelihood AlgorithmSlide 15Some new sequencing technologiesMolecular Inversion ProbesSlide 18Single Molecule Array for Genotyping—SolexaNanopore SequencingSlide 21Nanopore Sequencing—AssemblyPyrosequencingPyrosequencing on a chipPyrosequencing SignalPyrosequencing—AssemblyPolony SequencingPhylogeny Tree Reconstruction1432514235Final Exam•24-hour, takehome exam•More straight-forward questions than in homeworks•Please email Michael and Serafim by Friday, with your preference of day to take exam•Exam starts Sunday, …, Thursday noon; ends Monday, ..., Friday noonNumber of labeled unrooted tree topologies•How many possibilities are there for leaf 4?123444Number of labeled unrooted tree topologies•How many possibilities are there for leaf 4?For the 4th leaf, there are 3 possibilities1234Number of labeled unrooted tree topologies•How many possibilities are there for leaf 5?For the 5th leaf, there are 5 possibilities12345Number of labeled unrooted tree topologies•How many possibilities are there for leaf 6?For the 6th leaf, there are 7 possibilities12345Number of labeled unrooted tree topologies•How many possibilities are there for leaf n?For the nth leaf, there are 2n – 5 possibilities12345Number of labeled unrooted tree topologies•#unrooted trees for n taxa: (2n-5)*(2n-7)*...*3*1 = (2n-5)! / [2n-3*(n-3)!]•#rooted trees for n taxa: (2n-3)*(2n-5)*(2n-7)*...*3 = (2n-3)! / [2n-2*(n-2)!]12345N = 10#unrooted: 2,027,025#rooted: 34,459,425N = 30#unrooted: 8.7x1036#rooted: 4.95x1038Search through tree topologies: Branch and BoundObservation: adding an edge to an existing tree can only increase the parsimony costEnumerate all unrooted trees with at most n leaves:[i3][i5][i7]……[i2N–5]]where each ik can take values from 0 (no edge) to kAt each point keep C = smallest cost so far for a complete treeStart B&B with tree [1][0][0]……[0]Whenever cost of current tree T is > C, then:T is not optimalAny tree extending T with more edges is not optimal: Increment by 1 the rightmost nonzero counterBootstrapping to get the best treesMain outline of algorithm1. Select random columns from a multiple alignment – one column can then appear several times2. Build a phylogenetic tree based on the random sample from (1)3. Repeat (1), (2) many (say, 1000) times4. Output the tree that is constructed most frequentlyProbabilistic MethodsA more refined measure of evolution along a tree than parsimonyP(x1, x2, xroot | t1, t2) = P(xroot) P(x1 | t1, xroot) P(x2 | t2, xroot)If we use Jukes-Cantor, for example, and x1 = xroot = A, x2 = C, t1 = t2 = 1,= pA¼(1 + 3e-4α) ¼(1 – e-4α) = (¼)3(1 + 3e-4α)(1 – e-4α) x1t2xroott1x2Probabilistic Methods•If we know all internal labels xu, P(x1, x2, …, xN, xN+1, …, x2N-1 | T, t) = P(xroot)jrootP(xj | xparent(j), tj, parent(j))•Usually we don’t know the internal labels, thereforeP(x1, x2, …, xN | T, t) = xN+1 xN+2 …  x2N-1 P(x1, x2, …, x2N-1 | T, t) xroot = x2N-1x1x2xNxuComputing the Likelihood of a Tree•Define P(Lk | a): probability of subtree rooted at xk, given that xk = a•Then, P(Lk | a) = (b P(Li | b) P(b | a, tki))(c P(Lj | c) P(c | a, tki))xkxixjtkitkjFelsenstein’s Likelihood AlgorithmTo calculate P(x1, x2, …, xN | T, t)Initialization:Set k = 2N – 1Recursion: Compute P(Lk | a) for all a  If k is a leaf node:Set P(Lk | a) = 1(a = xk)If k is not a leaf node:1. Compute P(Li | b), P(Lj | b) for all b, for daughter nodes i, j2. Set P(Lk | a) = b,c P(b | a, tki)P(Li | b) P(c | a, tkj) P(Lj | c)Termination:Likelihood at this column = P(x1, x2, …, xN | T, t) = aP(L2N-1 | a)P(a)Probabilistic MethodsGiven M (ungapped) alignment columns of N sequences, •Define likelihood of a tree:L(T, t) = P(Data | T, t) = m=1…M P(x1m, …, xnm, T, t)Maximum Likelihood Reconstruction:•Given data X = (xij), find a topology T and length vector t that maximize likelihood L(T, t)Some new sequencing technologiesMolecular Inversion ProbesMolecular Inversion ProbesSingle Molecule Array for Genotyping—SolexaNanopore Sequencinghttp://www.mcb.harvard.edu/branton/index.htmNanopore Sequencinghttp://www.mcb.harvard.edu/branton/index.htmNanopore Sequencing—Assembly •Resulting reads are likely to look different than Sanger reads:Long (perhaps 10,000bp-1,000,000bp)High error rate (perhaps 10% – 30%)Two colors?•A/ CTG•AT/ CG•AG/ CT•How can we assemble under such conditions?PyrosequencingPyrosequencing on a chipMostafa Ronaghi, Stanford Genome Technologies Center454 Life SciencesPyrosequencing SignalPyrosequencing—Assembly •Resulting reads are likely to look different than Sanger reads:Short (currently 100 to 200 bp) Low error rates, except in homopolymeric runs (AAA…, CCC…, etc)Currently, not known how to do paired reads on a chip?Polony


View Full Document

Stanford CS 262 - Phylogeny Tree Reconstruction

Documents in this Course
Lecture 8

Lecture 8

38 pages

Lecture 7

Lecture 7

27 pages

Lecture 4

Lecture 4

12 pages

Lecture 1

Lecture 1

11 pages

Biology

Biology

54 pages

Lecture 7

Lecture 7

45 pages

Load more
Download Phylogeny Tree Reconstruction
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Phylogeny Tree Reconstruction and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Phylogeny Tree Reconstruction 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?