Stanford CS 374 - Of Mice and Men—learning from genome reversal finding - D2375802


School name Stanford University

Course CS 374 - Algorithms in Biology

CS 374: Algorithms in Biology
Lecture 11: Of Mice and Men—learning from genome reversal findings
Nov. 2nd, Presented by Dan Woods, Scribed by Yu Bai

Based on the following papers:
1. Pevzner P, Tesler G. "Genome rearrangements in mammalian evolution: lessons from human and mouse genomes" (2003), Genome Research, 13, 37-45.
2. Pevzner P, Tesler G. "Transforming Men into Mice: the Nadeau-Taylor Chromosomal Breakage Model Revisited" (2003), RECOMB'03, April 10-13.

Additional references:
3. V. Bafna and P. A. Pevzner. Genome rearrangements and sorting by reversals.
4. S. Hannenhalli and P. A. Pevzner. Transforming cabbage into turnip (polynomial algorithm for sorting signed permutations by reversals). This paper is referenced in the text of "Transforming Men into Mice" for the definitions of hurdles and fortresses.
5. J. D. Kececioglu and D. Sankoff. Exact and approximate algorithms for sorting by reversals with application to genome rearrangement.

1 Outline

The lecture reviewed the importance of studying genome rearrangements for understanding the existing variety of genomic architectures. In particular, finding rearrangement scenarios has shed light on previously unknown features of mammalian evolution. Following an introduction to the common concepts and terminology used in this field, the key theorem for calculating the reversal distance was derived. A new methodology for presenting reversal information based on this theorem was then presented. Finally, the lecture discussed the application of this method to deriving human genomic sequences from mouse genomic sequences, as well as the new insight it brought into the understanding of chromosomal breakage.

2 Background

2.1 Motivation for studying genome rearrangement

Sequence comparison in molecular biology is at the beginning of a major paradigm shift: from gene comparison based on local mutations to chromosome comparison based on genome rearrangement.
Because rearrangement events are much less frequent than point mutations, discovering which rearrangement events occurred, and in what order, offers a chance to understand the evolutionary process better. Therefore, the ultimate goal of genome rearrangement studies is to find a most parsimonious series of rearrangements (a rearrangement scenario) that transforms one genome into another.

2.2 Basic terms in genome rearrangement studies

The types of rearrangements: Recall from earlier lectures that there are inversions, translocations, duplications, and their combinations. An inversion (also called a reversal) occurs when a particular sequence segment changes direction, a translocation occurs when a segment moves to another position, and a duplication occurs when a segment is duplicated.

Micro- and macro-rearrangement: A micro-rearrangement is an intra-chromosomal rearrangement whose size is normally less than 1 Mb. A macro-rearrangement is an intra- or inter-chromosomal rearrangement with a larger span.

Synteny block: A region in which the same gene order is observed between orthologs, where orthologs are the corresponding genes in two different species. Synteny blocks do not necessarily represent areas of continuous similarity between two genomes. Instead, they usually consist of short regions of similarity that may be interrupted by dissimilar regions and gaps. Most synteny blocks can be converted into conserved segments by micro-rearrangements.

Reversals and breakpoints: A reversal $\rho = [i, j]$ of a permutation $\alpha$ is defined as follows:

$$(\alpha\rho)(k) = \begin{cases} \alpha(i + j - k) & \text{if } i \le k \le j, \\ \alpha(k) & \text{otherwise,} \end{cases}$$

where $i$ and $j$ are the edges of the reversal. A breakpoint of permutation $\alpha$ with respect to $\beta$ is a place where a pair of neighbouring sequence segments x, y in $\alpha$ are no longer neighbours in $\beta$. In general, a reversal introduces two breakpoints, one at each edge of the reversed segment.
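The two definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `apply_reversal` and `breakpoints` are hypothetical, positions are assumed to be 1-indexed with both edges i and j belonging to the reversed segment, permutations are treated as unsigned, and both permutations are framed with sentinel markers (an assumption made here so that a misplaced first or last element is also counted).

```python
def apply_reversal(alpha, i, j):
    """Return alpha with positions i..j reversed (1-indexed, inclusive).

    Position k in the result holds alpha(i + j - k) for i <= k <= j
    and alpha(k) otherwise.
    """
    if not 1 <= i <= j <= len(alpha):
        raise ValueError("need 1 <= i <= j <= len(alpha)")
    return alpha[:i - 1] + alpha[i - 1:j][::-1] + alpha[j:]


def breakpoints(alpha, beta):
    """Count neighbouring pairs of alpha that are not neighbours in beta.

    Assumption: both permutations are framed with 'start'/'end' sentinels,
    and orientation is ignored (unsigned permutations).
    """
    framed_alpha = ["start"] + list(alpha) + ["end"]
    framed_beta = ["start"] + list(beta) + ["end"]
    adjacent_in_beta = set()
    for x, y in zip(framed_beta, framed_beta[1:]):
        adjacent_in_beta.add((x, y))
        adjacent_in_beta.add((y, x))  # (x, y) and (y, x) count as the same adjacency
    return sum((x, y) not in adjacent_in_beta
               for x, y in zip(framed_alpha, framed_alpha[1:]))


# Hypothetical example: five synteny blocks with blocks 2..4 inverted.
alpha = [1, 4, 3, 2, 5]
beta = [1, 2, 3, 4, 5]
print(breakpoints(alpha, beta))                        # breakpoints before the reversal
print(apply_reversal(alpha, 2, 4))                     # undo the inversion
print(breakpoints(apply_reversal(alpha, 2, 4), beta))  # breakpoints afterwards
```

In this example the single reversal [2, 4] removes both breakpoints at once, which is the best a single reversal can do.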
2.3 Calculating the reversal distance

The reversal distance is defined as the minimum number of reversals needed to transform one gene permutation into the other (the model applies mostly to unichromosomal genomes, where inversions are common). A scenario that achieves this minimum describes which parsimonious rearrangements occurred between the two genomes and in what order. Therefore, the genome rearrangement problem is essentially the problem of finding the reversal distance $d_\beta(\alpha)$ from permutation $\alpha$ to permutation $\beta$.

Pioneering work by Nadeau and Taylor introduced the notion of breakpoints and related the reversal distance $d_\beta(\alpha)$ to the number of breakpoints $br(\alpha)$. One reversal corresponds to two breakpoints, so $d_\beta(\alpha) \ge br(\alpha)/2$, with strict inequality if some of the breakpoints are reused. More recently, a theorem of Hannenhalli and Pevzner expresses the genomic distance in terms of easily computable parameters reflecting different combinatorial properties of sets of strings, and leads to a polynomial-time algorithm for computing most parsimonious rearrangement scenarios. The lecture introduced the Hannenhalli-Pevzner theorem as follows:

i) Given a permutation $\alpha$, a breakpoint graph is constructed. It is also, and better, called the reality and desire diagram, RD($\alpha$), by Setubal and Meidanis in their excellent textbook. The diagram has one vertex for each element of the permutation (the original permutation if circular, the augmented version if linear; see Fig. 1a and 1b) and two kinds of edges: reality edges, which are the current adjacencies, and desire edges, which are the adjacencies in the identity permutation.

[Figure 1: Linear (a) and circular (b) versions of the reality and desire diagram.]

Note that the diagram in Fig. 1b can be decomposed into 3-cycles.
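To make the breakpoint bound and the diagram concrete, here is a minimal Python sketch. It is not the Hannenhalli-Pevzner algorithm: the exact distance is found by brute-force breadth-first search, which is exponential and only usable on tiny examples. The names `rd_edges` and `reversal_distance_bfs` are hypothetical, permutations are assumed to be unsigned permutations of 1..n, the target is assumed to be the identity, and the linear diagram is assumed to be augmented with the frame elements 0 and n + 1.

```python
from collections import deque


def rd_edges(alpha):
    """Reality and desire edges of the linear (augmented) diagram.

    alpha is framed as 0, alpha, n + 1. Reality edges are the current
    adjacencies; desire edges are the adjacencies of the identity.
    """
    n = len(alpha)
    framed = [0] + list(alpha) + [n + 1]
    reality = {frozenset(pair) for pair in zip(framed, framed[1:])}
    desire = {frozenset((v, v + 1)) for v in range(n + 1)}
    return reality, desire


def reversal_distance_bfs(alpha):
    """Exact reversal distance from alpha to the identity (brute force)."""
    start = tuple(alpha)
    target = tuple(sorted(alpha))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        perm = queue.popleft()
        if perm == target:
            return dist[perm]
        # Try every reversal of a segment with at least two elements.
        for i in range(len(perm)):
            for j in range(i + 1, len(perm)):
                nxt = perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
                if nxt not in dist:
                    dist[nxt] = dist[perm] + 1
                    queue.append(nxt)
    raise ValueError("identity not reachable")  # cannot happen for a permutation


# Hypothetical example with three blocks.
alpha = [3, 1, 2]
reality, desire = rd_edges(alpha)
br = len(reality - desire)  # reality edges that are not desired = breakpoints
d = reversal_distance_bfs(alpha)
print(br, d)                # 3 breakpoints, distance 2
assert 2 * d >= br          # the breakpoint lower bound d >= br / 2
```

Here three breakpoints force at least two reversals, and two reversals suffice, so the bound is met only after rounding up: one breakpoint position has to be reused.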
Bafna and Pevzner (1996) proved that a reversal can add at most one cycle. Since the identity permutation has n + 1 cycles (for a permutation with n synteny blocks), this immediately gives a new lower bound on the inversion distance: it must be at least n + 1 minus the number of cycles $c(\alpha)$ in a maximum decomposition of the diagram into edge-disjoint alternating cycles, i.e. $d_\beta(\alpha) \ge n + 1 - c(\alpha)$. In contrast to the breakpoint bound, this lower bound is very tight when used with biological data: it is often exact and rarely more than 1 off.

ii) We can restate the goal of transforming one permutation to the other as performing the proper reversals which lead to
