DOC PREVIEW
CMU BSC 03711 - Finding the best tree by heuristic search

This preview shows page 1-2-3 out of 8 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Chapter 4 Finding the best tree by heuristic search If we cannot find the best trees by examining all possible trees, we could imagine searching in the space of possible trees. In this chapter we will consider heuristic search techniques, which attempt to find the best trees without looking at all pos-sible trees. They are, of their very nature, a bit ad hoc. They also do not guarantee us to have found all, or even any, of the best trees. The fundamental technique is to take an initial estimate of the tree and make small rearrangements of branches in it, to reach "neighboring" trees. If any of these neighbors are better, we consider them, and continue, attempting more re-arrangements. Finally, we reach a tree that no small rearrangement can improve. Such a tree is at a local optimum in the tree space. However, there is no guarantee that it is a global optimum. Figure 4.1 shows the problem for the case of search-ing in two spatial coordinates. Trees are a rather different case, but tree space is difficult to depict in a diagram like this. In the diagram, we are trying to maximize a quantity -trying to find the high-est point on the surface. In the case of the parsimony criterion, we are actually try-ing to minimize the number of evolutionary changes of state. We can convert that into a maximization problem by simply placing a minus sign before the number of changes of state, so that 272 becomes -272. Or, alternatively, we could subtract it from a large number, so that 272 becomes 10,000 - 272 = 9.728. Maximization of the resulting quantity will minimize the number of changes of state. It is easier to show the diagram as a maximization problem than as a minimization problem, as maxima are more visible than minima. In this diagram, we imagine that we have started with a particular point on the surface and then looked at its four neighbors. One of them is higher, so we move to that point. Then we examine its neighbors. We continue this until we have climbed to the highest point on the "hill." However, as the diagram shows, this 3738 Chapter 4 end up here but global maximum is here If start here Figure 4.1: A surface rising above a two-dimensional plain (or plane). The process of climbing uphill on the surface is illustrated, as well as the failure to find a higher peak by this "greedy" method. strategy is incapable of finding another point, one that is in fact higher, but that is not located on the hill where we started. Strategies of this sort are often called the greedyalgorithmbecause they sieze the first improvement that they see. In this chapter we will examine some of the different kinds of rearrangements that have been proposed. Many others are possible. The techniques are more the result of common sense than of using any mathematical techniques. Later in the chapter we will also discuss some sequential addition strategies used for locating the starting point of the search. In the next chapter we will discuss branch and bound methods, a search technique guaranteed to find all of the most parsimo-nious trees. Although the discussion here will be cast in terms of parsimony, it is important to remember that exactly the same strategies and issues arise with the other criteria for inferring phylogenies, and heuristic search techniques are employed for them in much the same way. Nearest-neighbor interchanges Nearest-neighbor interchanges (NNI) in effect swap two adjacent branches on the tree. A more careful description is that they erase an interior branch on the tree, and the two branches connected to it at each end (so that a total of five branches are erased). This leaves four subtrees disconnected from each other. Four subtreesFinding the best tree by heuristic search 39 Asubtree~ sT u v is rearranged by dissolving the connections to an interior branch ~, fj> :..-----_ .....//'. 4l'\p and reforming them in one of the two possible alternative ways: sT sT u v v Figure 4.2: The process of nearest-neighbor interchange. An interior branch is dissolved and the four subtrees connected to it are isolated. These then can be reconnected in two other ways. can be hooked together into a tree in three possible ways. Figure 4.2 shows the process. One of the three trees is, of course, the original one, so that each nearest-neighbor interchange examines two alternative trees. In an unrooted bifurcating tree with ti tips, there will be n - 3 interior branches, at each of which we can examine two neighboring trees. Thus in all, 2(n - 3) neighbors can be examined for each tree. Thus a tree with 20 tips has 34 neighbors under nearest-neighbor interchange. There is some ambiguity about how greedy we ought to be. If we accept the first neighboring tree that is an improvement, that will not be as good a search40 Chapter 4 AOB E>1<C BOC _ • >1<° A~ A/A E B>1<C I° E A E B./ ~A!Lc C>1<O o'""'E ABO/\ / ~A~B >1< ° C ->'<C E --A>1<B __ A>1<B 0 C CEO E AE C B>1<O\ / - \ CAB , O>1<E OA B E>1<C Figure 4.3: The space of all 15 possible unrooted trees with 5 tips. Neighbors are connected by lines when a nearest-neighbor interchange can convert one into the other. The labels A-E correspond to the species names Alpha through Epsilon in that data set. This symmetric ar-rangement of nodes was discovered by Ben Rudd Schoenberg (per-sonal communication), and we thus denote this graph the Schoenberg graph. method as looking at all 2(77 - 3) neighbors and picking the best one, but it will be quicker. We could also imagine trying multiple trees tied for best and evaluating rearrangements on each of them. The most sophisticated heuristic rearrangement strategies retain a list of all trees tied for best, and rearrange all of them. Figure 4.3 shows what the space of all 15 possible unrooted trees looks like for 5 species, where trees that are adjacent by nearest-neighbor interchange are connected. Figure 4.4 shows the numbers of changes of state that are required for the data in Table 1.1 for each of these trees. Each tree has 4 neighbors. It willFinding the best tree by heuristic search 41 9 10 Figure 4.4: The space of all 15 possible trees, as in Figure 4.3, where the number of changes of state on the data set of Table 1.1 is shown. Nearest-neighbor interchanges search for the most parsimonious tree by moving in this graph. be a useful exercise for the reader to pick a random starting point on this graph, and try various variations on


View Full Document

CMU BSC 03711 - Finding the best tree by heuristic search

Documents in this Course
lecture

lecture

8 pages

Lecture

Lecture

3 pages

Homework

Homework

10 pages

Lecture

Lecture

17 pages

Delsuc05

Delsuc05

15 pages

hmwk1

hmwk1

2 pages

lecture

lecture

6 pages

Lecture

Lecture

10 pages

barnacle4

barnacle4

15 pages

review

review

10 pages

Homework

Homework

10 pages

Midterm

Midterm

12 pages

lecture

lecture

11 pages

lecture

lecture

32 pages

Lecture

Lecture

7 pages

Lecture

Lecture

17 pages

Lecture

Lecture

12 pages

Lecture

Lecture

21 pages

Lecture

Lecture

11 pages

Lecture

Lecture

28 pages

Homework

Homework

13 pages

Logistics

Logistics

11 pages

lecture

lecture

11 pages

Lecture

Lecture

8 pages

Lecture

Lecture

9 pages

lecture

lecture

8 pages

Problem

Problem

6 pages

Homework

Homework

10 pages

Lecture

Lecture

9 pages

Problem

Problem

7 pages

hmwk4

hmwk4

7 pages

Problem

Problem

6 pages

lecture

lecture

16 pages

Problem

Problem

8 pages

Problem

Problem

6 pages

Problem

Problem

13 pages

lecture

lecture

9 pages

Problem

Problem

11 pages

Notes

Notes

7 pages

Lecture

Lecture

7 pages

Lecture

Lecture

10 pages

Lecture

Lecture

9 pages

Homework

Homework

15 pages

Lecture

Lecture

16 pages

Problem

Problem

15 pages

Load more
Download Finding the best tree by heuristic search
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Finding the best tree by heuristic search and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Finding the best tree by heuristic search 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?