Berkeley INTEGBI 200A - Molecular Clocks and Tree Dating (r8s, BEAST)

Integrative Biology 200A
University of California, Berkeley
Principles of Phylogenetics
Spring 2010
Updated by Michael Landis

Lab 12: Molecular Clocks and Tree Dating (r8s, BEAST)

Prep:
• Download and install r8s (Mac only), PAML, and BEAST (Mac or PC).
• For the PAUP part of the lab, work with someone who has PAUP installed.
• Note for next time: the LR test parts of this lab could probably be done with APE in R.

Turn in (by email; emailing as a group is fine, just put your names on it):
• The Excel spreadsheet with the LR tests for the molecular clock.
• A graphic of a time-calibrated tree from BEAST (optional, depends on time).

Today we are going to use several different methods of testing the molecular clock and estimating node times. We will use a couple of likelihood ratio tests to test the molecular clock against a totally unconstrained tree and against a tree with a few branches allowed to vary independently. We will also use several rate smoothing methods to infer divergence times. We will not cover several other commonly used methods; in particular, we will not use any relative rate tests to test the molecular clock. This is a very active field, and new methods and new programs are constantly being developed.

Testing for Global Molecular Clock

Under the null hypothesis, the phylogeny is rooted and the branch lengths are constrained such that all of the tips can be drawn on a single time plane. Under the alternative hypothesis, each branch is allowed to vary independently. The alternative hypothesis invokes s - 2 additional parameters, where s is the number of sequences: an unrooted tree has 2s - 3 branch lengths, whereas a clock tree has only s - 1 node ages. The likelihood ratio test statistic is -2 log(L0/L1) = 2(logL1 - logL0), where L0 and L1 are the likelihoods under the null and alternative hypotheses, respectively. The significance of the likelihood ratio test statistic can be approximated using a chi-square distribution (with s - 2 degrees of freedom).
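As a rough illustration of the test just described, here is a minimal Python sketch that uses only the standard library. It computes the statistic as twice the log-likelihood of the alternative minus twice that of the null (so the value is non-negative), takes s - 2 degrees of freedom, and evaluates the chi-square upper tail. The function names (chi2_sf, clock_lrt) and the log-likelihood values are hypothetical and are not taken from the lab data; the 15 sequences match the cephalopod matrix used later in this lab.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Exact for positive integer df, built from the recurrence
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), with y = x / 2.
    """
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    while a < df / 2.0:
        # Work in log space to avoid overflow for large statistics.
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def clock_lrt(lnL0, lnL1, n_seqs):
    """Likelihood ratio test of a global clock (null) vs. free branch lengths.

    lnL0 and lnL1 are log-likelihoods (negative numbers), not -lnL scores.
    """
    stat = 2.0 * (lnL1 - lnL0)
    df = n_seqs - 2
    return stat, df, chi2_sf(stat, df)


# Hypothetical log-likelihoods, for illustration only.
stat, df, p = clock_lrt(lnL0=-5020.0, lnL1=-5000.0, n_seqs=15)
print(f"LR = {stat:.3f}, df = {df}, p = {p:.3g}")
```

If a program reports positive -lnL scores rather than log-likelihoods, negate them before passing them to clock_lrt.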
The following example shows how to perform the likelihood ratio test of the molecular clock using PAUP*.

1. Execute the file Cephalopod.nex (available on the IB 200A website).
2. You do NOT want to deroot your tree if given the option. The molecular clock assumption requires a rooted tree.

This file contains molecular data, and it also contains one tree. For this exercise, we have accepted this tree as our working phylogenetic hypothesis, and we are now going to test whether it obeys a molecular clock. You can look at the tree if you want using “showtrees.” First, we will calculate the likelihood of this tree without enforcing a molecular clock. For speed, we’ll use the Hasegawa, Kishino, and Yano (1985) (HKY85) model of DNA substitution, with among-site rate variation described using a gamma distribution. In PAUP, this model is set with variant=HKY under the likelihood settings (lset).

3. To estimate model parameters for the ts:tv ratio and the gamma distribution shape parameter, use these commands:
lset tratio=estimate variant=HKY shape=estimate clock=no;
lscores;
4. Record the -lnL score, which we’ll call -lnLA. This is the likelihood score for the alternative hypothesis, which allows branches to vary independently.
5. Now, we will change the likelihood settings to enforce a molecular clock:
lset tratio=estimate variant=HKY shape=estimate clock=yes;
6. Recalculate the likelihood score under this null model:
lscores;
7. Record the -lnL score, which we’ll call -lnL0. This is the likelihood score for the null hypothesis, under which characters evolve under a molecular clock (a single rate).

Conduct a likelihood ratio test in Excel to determine if you can reject the null model. As you know, the likelihood ratio test compares a simple model to a more complex one to see if adding the extra parameters offers a significant improvement in fit. This is necessary since adding parameters will never make the likelihood worse and will almost always improve it, at least a little bit.
Since a molecular clock only allows a single rate, it can be considered a simpler version of the unconstrained HKY85 model. In testing a molecular clock, the degrees of freedom are the number of taxa - 2 (Felsenstein 1981).

8. Open an Excel file.
9. The likelihood ratio (LR) statistic can be calculated from the two recorded scores as LR = 2 × [(-lnL0) - (-lnLA)], which equals 2(lnLA - lnL0). (Subtracting log-likelihoods is the same as taking the log of the ratio of the likelihoods.)
10. The degrees of freedom (DF) can be calculated as: DF = number of taxa - 2. The cephalopod matrix has 15 taxa, so there are 13 degrees of freedom.
11. Use the chidist function in Excel to get a p-value: =chidist(LR,DF)

If the p-value is less than 0.05, you can reject the simpler model (in this case, the global molecular clock). In that case the null hypothesis, that the rate of evolution is homogeneous among all branches in the phylogeny, is rejected: rates of substitution vary significantly among branches, and a molecular clock is inappropriate.

Why is the likelihood of the alternative model higher than that of the null model?

Testing for a Local Molecular Clock

In the previous example we tested whether the entire tree fit a clock, as opposed to every branch on the tree having an independent rate. We could also test whether a clade has a different rate from the rest of the tree. We cannot do this in PAUP*, because PAUP* does not allow us to specify different rates on different branches. Instead we will use BASEML, a program from the PAML package of phylogeny programs by Ziheng Yang. This program does ML analysis of DNA sequences and allows us to specify a tree and different distributions of rates on the tree. All these programs can be found at http://abacus.gene.ucl.ac.uk/software/paml.html. This program is entirely controlled by the input files. You will need to download these from the web.

12. Go to the syllabus page of the IB 200A website. Download three files: CephTree.trees, BaseML.ctl, CephSeq.nuc

The first file is CephTree.trees; open it with a text editor.
As you can see, this file contains the same tree, in Newick format, that we used in the previous example. You will also see a ‘$1’ after the clade containing Joubiniteuthis and Moroteuthis. This specifies that all the branches in this clade will have a different rate from the other branches in the tree.

Open the file BaseML.ctl with a text editor. This is the control file for the BASEML program. When BASEML.exe is run, it automatically opens the control file, which must be in the same folder as the program. The first line of the file specifies the

