Berkeley INTEGBI 200A - Parsimony tree estimation with TNT


Integrative Biology 200A
University of California, Berkeley
PRINCIPLES OF PHYLOGENETICS
Spring 2010
Revised by Nick Matzke

Lab 4
Parsimony tree estimation with TNT

TNT (Tree analysis using New Technology; Goloboff, Farris and Nixon 2000) [http://www.cladistics.com] is a program available for Windows, MacOS or Linux. It has very efficient tree-searching algorithms for large data sets of 300 to 500 taxa. Parsimony is the only available optimality criterion. It implements many new heuristic search methods, such as the ratchet and sectorial searches. It can also be used for tree manipulation and diagnosis. As it is optimized for large matrices, it is probably not the best program to use for data sets with fewer taxa.

Setup: Download and install TNT (google e.g. “TNT cladistics” to find it on the web).

The Parsimony Ratchet

Most real data matrices have too many taxa (i.e. more than about 25 taxa) to be analyzed by exact methods, so a heuristic search for the most parsimonious trees must be conducted. In many cases the shortest trees (or, more precisely, the trees that we think are the shortest) are easily located. In other instances the shortest trees are difficult to locate. It is not possible to predict, from the matrix, the ease with which the shortest trees will be found (if they ever will be), or to ascertain that one has found the shortest trees. The only criterion that can be used is reproducibility: if numerous searches of the matrix, with different search parameters, produce the same result(s), then one must assume that the shortest trees (or at least the shortest trees that will ever be found) have been located. Effective search parameters must be determined empirically.

In a “conventional” search a Wagner tree (or some other starting tree) is calculated and then a branch swapping algorithm (of some kind) is applied to the tree. Usually multiple starting points are utilized to minimize the possibility of becoming stuck in locally optimal (or sub-optimal) portions of “tree space”.
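
One way to see why exact methods stop being practical somewhere around 25 taxa is to count the candidate trees. The sketch below uses the standard result that n labelled taxa have (2n - 5)!! = 1 x 3 x 5 x ... x (2n - 5) distinct unrooted, fully bifurcating trees. The function name is made up for this illustration and has nothing to do with TNT itself.

```python
def num_unrooted_trees(n_taxa):
    """Count the distinct unrooted, fully bifurcating trees for n_taxa labelled taxa.

    Uses the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5).
    For fewer than 3 taxa there is only one possible tree.
    """
    if n_taxa < 3:
        return 1
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (the leading factor 1 is implicit).
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 25):
        print(n, "taxa:", num_unrooted_trees(n), "unrooted trees")
```

Four taxa give 3 trees, five give 15, and ten already give 2,027,025. At 25 taxa the count is above 10^28, which is why a search can only ever sample tree space rather than visit all of it.
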
A conventional search stops when all the trees retained in memory have been swapped and no shorter trees have been found. Given a finite amount of time, the best way to maximize the exploration of tree space is to limit the number of trees retained during branch swapping. In most cases only the shortest trees found during the first phase are swapped, but in some cases some percentage of the shortest length trees are swapped.

Nixon (1999) proposed a new tree search method called the parsimony ratchet. The ratchet can be viewed as the application of a Markov Chain to tree search. The ratchet procedure starts by searching for the best tree. Then it resamples the data with replacement or jackknifes it and randomly constrains some nodes. It searches tree space again with the newly defined parameters. It returns to the original settings and repeats the whole process multiple times. By reweighting the characters the ratchet produces a more radical search of tree space, which is still constrained by the data.

TNT

TNT can do a number of different heuristic searches in addition to the standard ones included in most phylogeny packages. The more advanced searches are included under new technology searches, and can be used alone or in combination.

Sectorial - Explores rearrangements of local clades while leaving the rest of the tree unmodified. It does this successively for different clades chosen at random.

Ratchet - The same as described above.

Drifting - Like the ratchet, it alternates between normal searches and more liberal ones. Instead of reweighting the characters during the liberal searches, it accepts new trees based on the fit between the new tree and those already in memory.

Tree Fusion - Mixes trees that are already in memory, making new synergistic trees. If the trees come from different searches then the scores can be improved quickly and drastically.

Setup:

1. Using your skills at google and innate intelligence, find the TNT website and download and unzip TNT.
   a. Macs: download either “Mac32” or “Mac64” (no limit to number of taxa) – either should probably work.
   b. Windows: download “Win (no taxon limit, bin only)” – I think. There is also a menu-based Windows version which you can play with if you like.
2. Put the unzipped TNT folder in a place where you will find it.
   a. TNT is one of those programs where it is easiest if you keep the program and the data file inputs and outputs in the same directory.
3. Google to find the TNT Wiki, and keep that page open for future reference.

Some wisdom on using specialist scientific applications:

Often, when you are trying to use some new random program some scientists have written to do some analysis, complications arise. It is good to have in mind what the typical challenges are, and in what order they need to be dealt with:

1. First you have to find/download/install the program. Hopefully the program has an executable or zipfile appropriate for your system, which you can just download and install with minimal trouble. Good programs will have executables available for Windows, Mac 10.4, 10.5, etc.
   • But sometimes, all you will have is raw code which you have to compile yourself. Sometimes, compiling is easy, sometimes hard, sometimes impossible. Note that scientists are not professional software developers, and do not have software design teams, error checkers, etc. Often they don’t even have much formal training in programming!
2. Programs with nice-ish menus, buttons to click, etc., are rare. Users who are beginning to use programs like the menus, etc., since they are easier to figure out initially. However, in the long run, menu-based programs aren’t useful for tasks that are more complex than a once-off analysis. Much serious phylogenetics work requires processing a large collection of trees, or conducting the same analysis while varying a bunch of different options, e.g. to assess how confident you are that your result is independent of the specific choices you made. Command-line programs are (a) much easier for scientists to write, and (b) much easier for users to automate through scripts. Thus we will learn some basic scripting throughout this course. Basically, scripting is just like typing commands into a command line, except that you save all those commands to a text file and have the computer run them for you.
3. But we are
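
The scripting idea in item 2 above (save the commands to a text file, repeating the same analysis while varying one option) can be sketched in a few lines of Python. This is only an illustration. The helper names and file names are made up, and the command strings (`proc`, `mult=replic`, `quit`) are assumptions about TNT syntax that should be checked against the TNT Wiki before use.

```python
import tempfile
from pathlib import Path


def build_commands(data_file, replicate_counts):
    """Return a list of commands: read the data, run the same search once
    per setting in replicate_counts, then quit."""
    commands = [f"proc {data_file};"]         # ASSUMED TNT syntax: read a data file
    for n in replicate_counts:
        commands.append(f"mult=replic {n};")  # ASSUMED TNT syntax: search with n replicates
    commands.append("quit;")                  # ASSUMED TNT syntax: exit the program
    return commands


def write_script(path, commands):
    """Save the commands to a plain-text file, one command per line."""
    text = "\n".join(commands) + "\n"
    Path(path).write_text(text)
    return text


if __name__ == "__main__":
    # Write the script into a throwaway folder and show what it contains.
    with tempfile.TemporaryDirectory() as folder:
        script = Path(folder) / "example_script.run"
        print(write_script(script, build_commands("example.tnt", [10, 100, 1000])))
```

The point of the design is that changing the list `[10, 100, 1000]` regenerates the whole script, which is much less error-prone than retyping each command by hand. How TNT expects to be handed a script file is not covered here; see the TNT Wiki.
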

