Berkeley INTEGBI 200A - Phylogenetic tree IV- Data/Hypothesis Exploration and Support Measures

Integrative Biology 200A “PRINCIPLES OF PHYLOGENETICS” Spring 2008
University of California, Berkeley
Kipling Will - 10 Apr

Phylogenetic tree IV - Data/Hypothesis Exploration and Support Measures

I. Overview.

-- The truest tests are empirical tests that examine all critical evidence. For phylogenetic hypotheses (tree, branching pattern, branch lengths, character state distribution), this involves the addition of more characters and taxa. This is not always reasonable or feasible, and when do we have enough anyway? This is an issue of philosophical or statistical confidence.

-- The simplest form of confidence, which is somewhat subjective, is to show character state changes on the cladogram. Groups supported by more, less homoplastic and more complex character state changes are thought to be well supported. When more of our initial statements of homology survive and are compatible, we have increased confidence in the hypothesis. Alas, while this may be suitable for morphological data, it is very difficult to apply to DNA sequence data, given the simplicity of sequence characters.

-- A general or specific “fit” to external data (e.g. biogeographic patterns) also builds confidence. However, it is generally more narrative and subjective, making it hard to evaluate if you are not actively working within the system.

-- Even if we are confident in the result, it is necessary to express some sort of confidence or make a statement of reliability, in order to give others a sense of how well the data fit the hypothesis and to what degree the critical evidence refutes competing hypotheses.

-- Many exploration methods seek some sort of statistical reliability or measure to give a notion of how bold or conservative we should be in regard to conclusions based on the phylogenetic pattern. The fact that nearby sub-optimal solutions exist is not enough to cause us to move from one hypothesis to another.
-- Although there is a general notion that we are identifying well supported clades, exploration methods and support measures are really just as (more?) important for pointing to poorly supported parts of the tree. Poorly supported groups suggest where future efforts need to be applied.

-- Most statistical methods require some assumption of a universe from which the sample is drawn. Generally this is a random sample of the universe of possible independent entities, i.e. they are independent and identically distributed (i.i.d.).

II. Sensitivity and Resampling Analyses: Various heuristic methods explore how robust the hypothesis is likely to be if the underlying assumptions are wrong, or express that robustness as some sort of “support”.

A. Assumption sensitivity analyses:

HOW: Assumptions (= parameters) are varied in multiple analyses and the results compared in some way.

WHY: To look for (in)sensitivity to variation in model assumptions (e.g. whether changing the weights assigned to transitions/transversions changes the topology). This has been used as an optimality criterion for deciding if a group should be accepted or rejected. Groups sensitive to variation are rejected. Also used as a means to select a set of alignment parameters.

WHAT IT TELLS US: Not truly a test of monophyly or support. Monophyly is tested by the corroboration of empirical evidence in light of some set of “valid” assumptions. It doesn’t really test the support offered by the data. It does show which groups remain under a set of “reasonable” parameters and support is drawn from a variety of synapomorphy classes. Almost any topology can be supported under some set of parameters. It can’t distinguish levels or different kinds of support, and a group well supported under “mutation-parsimony” may be lacking under “in/del-parsimony”... So what?

((a,b) [10,10,10] -- ((c,d) [1,1,1] -- (e,f) [30,0,0] ))

B. Bremer Support / Decay analyses or “index” [not really a mathematical index]

HOW: Record the number of extra steps required to lose a clade that is found in the most parsimonious tree (MPT). Any clade not found in the strict consensus of all MPTs has a Bremer support value of 0. A clade present in that consensus but not found in the strict consensus of all trees up to one step longer than the MPTs has a Bremer support value of 1, and so on for 2, 3... extra steps, until for every clade a shortest tree that does not contain it has been found. In reality this includes too many trees, and a Bremer value is an estimate based on heuristic searches of suboptimal tree space.

WHY: To give a measure of decisiveness and indicate ambiguously supported nodes directly from the data.

WHAT IT TELLS US: An estimate of the degree to which the optimal solution is preferred to alternatives. As a heuristic it points to poorly supported groups that may have few synapomorphies or may be supported by conflicting characters. However, it does not discriminate between different types of support and does not have a clear statistical interpretation.

Example: a matrix with 100 characters and MPTs of 200 steps, where each character optimizes for 2 steps. Trees 2 steps longer (202 steps) could come by increasing one character to 4 steps [99*2 + 1*4 = 202] OR by reducing 49 characters to 1 step and increasing 51 to 3 steps [49*1 + 51*3 = 202].

[Technical note: PAUP doesn’t calculate Bremer support directly, so use MacClade to make a command file for PAUP to read; this will generate Bremer numbers based on a set of constraint clade analyses (see MacClade manual). You can also use the program TreeRot with PAUP. Bremer support can be directly calculated by Nona; however, it is VERY dependent on the search parameters and memory limitations. I suggest PAUP for this one. As a rule of thumb, a score of 3 is good and 5 highly “supported”. I have not done this with TNT. If you do, let me know how it goes.]

C. Methodological concordance:

HOW: Multiple methods of phylogenetic analysis are used and the clades found in common are presumed well supported.

WHY: Controversy over methods and assumptions can be avoided by a pluralistic approach that leads to reasonable results. Accurate methods will converge on the “truth”, and a lack of agreement between methods indicates that none are recovering the true tree (Kim 1993).

WHAT IT TELLS US: Which clades are not affected by the assumptions and philosophical underpinnings of the methods that were used for analysis of the data. Since various methods address the problem from very different statistical and philosophical views, the fact that they converge may say something about the data or the methods but may have little to
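
Returning to the Bremer support arithmetic in section B: the following is a minimal sketch in Python, not part of the original notes, that reproduces the 100-character example. It assumes the relevant tree lengths are already known (in practice they come from constrained heuristic searches); the function names `tree_length` and `bremer_support` are hypothetical helpers introduced here for illustration.

```python
def tree_length(steps_per_character):
    """Parsimony tree length: the sum of steps over all characters."""
    return sum(steps_per_character)


def bremer_support(mpt_length, shortest_without_clade):
    """Extra steps needed to lose a clade: length of the shortest tree
    lacking the clade minus the length of the most parsimonious tree."""
    if shortest_without_clade < mpt_length:
        raise ValueError("a tree shorter than the MPT contradicts optimality")
    return shortest_without_clade - mpt_length


# Worked example from the notes: 100 characters, 2 steps each on the MPTs.
mpt = tree_length([2] * 100)

# Two very different character patterns that both give a 202-step tree.
one_character_worse = tree_length([2] * 99 + [4])
many_characters_shifted = tree_length([1] * 49 + [3] * 51)

print(mpt, one_character_worse, many_characters_shifted)  # 200 202 202
print(bremer_support(mpt, one_character_worse))           # 2
print(bremer_support(mpt, many_characters_shifted))       # 2
```

Both suboptimal trees yield the same value of 2, although in one case a single character changes and in the other all 100 do. This is one way to see why the value does not discriminate between different types of support.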

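
The accept/reject rule described under assumption sensitivity analyses (section A) can also be sketched in a few lines of Python. This is an illustration only: it assumes that the bracketed numbers in the example line of section A are support values for each clade under three different parameter sets, which the notes do not state explicitly, and the function name `insensitive_clades` and its `threshold` parameter are hypothetical.

```python
# Support for each clade under three parameter sets, read off the example
# line in section A. Treating the bracketed numbers this way is an assumption.
support_by_clade = {
    ("a", "b"): [10, 10, 10],
    ("c", "d"): [1, 1, 1],
    ("e", "f"): [30, 0, 0],
}


def insensitive_clades(support, threshold=1):
    """Return clades whose support reaches `threshold` under every
    parameter set, i.e. groups that are not sensitive to the variation."""
    return [clade for clade, values in support.items()
            if all(value >= threshold for value in values)]


kept = insensitive_clades(support_by_clade)
print(kept)  # [('a', 'b'), ('c', 'd')]
```

Under this rule (c,d) is accepted with a support of 1 everywhere, while (e,f) is rejected despite a support of 30 under one parameter set, which seems to be the point of the “So what?” in the notes: the criterion cannot distinguish levels or kinds of support.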
