Stanford CS 374 - Evolution’s cauldron - Duplication, deletion, and rearrangement in the mouse and human genomes

Evolution’s cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes

W. James Kent*†, Robert Baertsch*, Angie Hinrichs*, Webb Miller‡, and David Haussler§

*Center for Biomolecular Science and Engineering and §Howard Hughes Medical Institute, Department of Computer Science, University of California, Santa Cruz, CA 95064; and ‡Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA 16802

Edited by Michael S. Waterman, University of Southern California, Los Angeles, CA, and approved July 11, 2003 (received for review April 9, 2003)

This study examines genomic duplications, deletions, and rearrangements that have happened at scales ranging from a single base to complete chromosomes by comparing the mouse and human genomes. From whole-genome sequence alignments, 344 large (>100-kb) blocks of conserved synteny are evident, but these are further fragmented by smaller-scale evolutionary events. Excluding transposon insertions, on average in each megabase of genomic alignment we observe two inversions, 17 duplications (five tandem or nearly tandem), seven transpositions, and 200 deletions of 100 bases or more. This includes 160 inversions and 75 duplications or transpositions of length >100 kb. The frequencies of these smaller events are not substantially higher in finished portions in the assembly. Many of the smaller transpositions are processed pseudogenes; we define a “syntenic” subset of the alignments that excludes these and other small-scale transpositions. These alignments provide evidence that ≈2% of the genes in the human/mouse common ancestor have been deleted or partially deleted in the mouse. There also appears to be slightly less nontransposon-induced genome duplication in the mouse than in the human lineage.
Although some of the events we detect are possibly due to misassemblies or missing data in the current genome sequence or to the limitations of our methods, most are likely to represent genuine evolutionary events. To make these observations, we developed new alignment techniques that can handle large gaps in a robust fashion and discriminate between orthologous and paralogous alignments.

comparative genomics | cross-species alignments | synteny | chromosomal inversion | breakpoints

Evolution creates new forms and functions from the interplay of reproduction, variation, and selection. There are many types of variation; the most common and well studied is the substitution of one base for another. Small insertions and deletions are also quite common. Large-scale insertions usually involve the duplication of part of the genome. These duplications can be the starting point for the development of a new gene with a new function. The evolution of nonduplicated genes generally is quite constrained by selection, because the existing function of the gene must be maintained. After duplication, one copy is free to lose its original function and possibly assume a new function (1, 2). Deletion and rearrangement also play important roles in the long-term evolution of genomes.

This study examines patterns of variation observed at all scales by comparing the human and mouse genomes to each other. Human and mouse are at an excellent distance for studying all types of variation. The genomes are still similar enough that it is possible to align the majority of orthologous sequence at the DNA level (3) yet distant enough that a great deal of variation has had the opportunity to accumulate.

Chromosomal rearrangements of ≥1 megabase can be observed by comparing genetic maps between organisms (4) and by chromosome painting (5).
Approximately 200 conserved blocks of synteny between human and mouse were discovered by gene order comparisons before the genome sequences became available, with recent estimates ranging from 98 (6) to 529 blocks (7), depending on details of definition and method. The length distribution of synteny blocks was found to be consistent with the theory of random breakage introduced by Nadeau and Taylor (8, 9) before significant gene order data became available. In recent comparisons of the human and mouse genomes, rearrangements of ≥100,000 bases were studied by comparing 558,000 highly conserved short sequence alignments (average length 340 bp) within 300-kb windows. An estimated 217 blocks of conserved synteny were found, formed from 342 conserved segments, with length distribution roughly consistent with the random breakage model (3). Subsequent analysis of these data found 281 conserved synteny blocks of size at least 1 megabase, with a few thousand further “microrearrangements” within these blocks, about one per megabase (10).

The most common variations are single-base transitions, that is C/T and G/A substitutions (11, 12). Single-base insertions and deletions are also quite common, although they are rapidly selected out of coding regions. Substitutions and small (<20-base) insertions and deletions can be studied in traditional nucleotide alignments of homologous genomic sequences. A traditional pairwise alignment consists of two segments of genomic DNA with gap characters put in to maximize the number of matching bases. A simple example is

    ACAGTAACTCGGGAG
    ACGTG---TCG-GAG

If the two sequences are derived from a common ancestor, then a mismatch can result from a substitution in either sequence relative to their common ancestor.
Similarly, an alignment gap could be caused either by an insertion in one sequence or a deletion in the other.

At the heart of the pairwise alignment process is a scoring function that assigns positive values to matching nucleotides and negative values to mismatches and gaps. Most modern programs use what is called “affine” gap scoring, where the first gap character in a gap incurs a substantial “gap opening” cost, and each subsequent gap character incurs a somewhat lesser “gap extension” cost. Because gaps are frequently more than a single base long, affine scoring schemes model the underlying biological processes much better than fixed gap scoring systems. Affine gap scores generally work fairly well for protein alignments, where gaps are rare and tend to be short but do not represent the frequency of longer gaps as well (13–15). Nucleotide alignments, particularly outside of coding regions, tend to require many more gaps than protein alignments, and some of the gaps can be too large to be found by traditional pairwise alignment programs (16). Furthermore, in
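
The affine gap scoring described above can be illustrated with a minimal Python sketch that scores an already-gapped pairwise alignment column by column. The function name `affine_alignment_score` and all numeric parameters (match +1, mismatch -1, gap opening -4, gap extension -1) are illustrative assumptions, not values taken from the paper or from any particular alignment program. The two rows used below are the example alignment from the text, read as two 15-column rows; that split is also an assumption about the extracted text.

```python
def affine_alignment_score(top, bottom, match=1, mismatch=-1,
                           gap_open=-4, gap_extend=-1):
    """Score a gapped pairwise alignment with affine gap costs.

    All default scores are hypothetical, chosen only for illustration.
    The first '-' of a gap costs gap_open; each further '-' in the
    same gap (same row, adjacent column) costs gap_extend.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")

    score = 0
    prev_gap_row = None  # row (0 or 1) that held '-' in the previous column
    for a, b in zip(top, bottom):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if a == "-" or b == "-":
            gap_row = 0 if a == "-" else 1
            # Continuing a gap in the same row is cheaper than opening one.
            score += gap_extend if gap_row == prev_gap_row else gap_open
            prev_gap_row = gap_row
        else:
            score += match if a == b else mismatch
            prev_gap_row = None
    return score


# Example alignment from the text (assumed to be two 15-column rows).
top = "ACAGTAACTCGGGAG"
bottom = "ACGTG---TCG-GAG"

# 8 matches, 3 mismatches, 2 gap openings, 2 gap extensions.
print(affine_alignment_score(top, bottom))                 # -5
# Fixed per-character gap cost: extension as expensive as opening.
print(affine_alignment_score(top, bottom, gap_extend=-4))  # -11
```

With these assumed parameters, the three-base gap costs 6 under affine scoring and 12 under the fixed per-character scheme, which is the sense in which affine scoring penalizes one multi-base gap less than several independent single-base gaps.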

