Introduction to molecular evolution
Lecture 13, Statistics 246, March 4, 2004

Evolution using molecules: implicit assumptions

Our DNA is inherited from our parents more or less unchanged.
Molecular evolution is dominated by mutations that are neutral from the standpoint of natural selection.
Mutations accumulate at fairly steady rates in surviving lineages.
We can study the evolution of (macro)molecules and reconstruct the evolutionary history of organisms using their molecules.

Some important dates in history (billions of years ago)

Origin of the universe                  15 ± 4
Formation of the solar system           4.6
First self-replicating system           3.5 ± 0.5
Prokaryotic-eukaryotic divergence       1.8 ± 0.3
Plant-animal divergence                 1.0
Invertebrate-vertebrate divergence      0.5
Mammalian radiation beginning           0.1

From Doolittle et al. (CSH, 1986).

The three kingdoms

Two important early observations

Different proteins evolve at different rates, and this seems more or less independent of the host organism, including its generation time.
It is necessary to adjust the observed percent difference between two homologous proteins to get a distance more or less linearly related to the time since their common ancestor. (Later we offer a rational basis for doing this.)
A striking early version of these observations is next.

Rates of macromolecular evolution

[Figure: corrected amino acid changes per 100 residues (vertical axis) plotted against millions of years since divergence (horizontal axis, 0 to 1400, annotated with geological periods from the Pliocene back to the Huronian). Labelled divergence points include vertebrates/insects, carp/lamprey, reptiles/fish, birds/reptiles and mammals/reptiles, plus the separation of the ancestors of plants and animals. Curves: fibrinopeptides (1.1 MY), hemoglobin (5.8 MY), cytochrome c (20.0 MY). After Dickerson 1971.]

Different rates of change for different proteins
Protein                                   PAMs (a) / 100 residues / 10^8 years   Theoretical lookback time (b)
Pseudogenes                               400                                    45 (c)
Fibrinopeptides                           90                                     200 (c)
Lactalbumins                              27                                     670 (c)
Lysozymes                                 24                                     750 (c)
Ribonucleases                             21                                     850 (c)
Hemoglobins                               12                                     1.5 (d)
Acid proteases                            8                                      2.3 (d)
Triosephosphate isomerase                 3                                      6 (d)
Phosphoglyceraldehyde dehydrogenase       2                                      9 (d)
Glutamate dehydrogenase                   1                                      18 (d)

(a) PAMs: accepted point mutations (explained shortly).
(b) Useful lookback time = 360 PAMs.
(c) Million years.
(d) Billion years.
From Doolittle 1986.

Rates of change in protein families

Protein                          Rate (a)   Protein                          Rate (a)
Fibrinopeptides                  90         Thyrotropin beta chain           7.4
Growth hormone                   37         Parathyrin                       7.3
Ig kappa chain C region          37         Parvalbumin                      7.0
Kappa casein                     33         Protease inhibitors, BPTI        6.2
Ig gamma chain C region          31         Trypsin                          5.9
Lutropin beta chain              30         Melanotropin beta                5.6
Ig lambda chain C region         27         Alpha crystallin A chain         5.0
Complement C3a                   27         Endorphin                        4.8
Lactalbumin                      27         Cytochrome b5                    4.5
Epidermal growth factor          26         Insulin                          4.4
Somatotropin                     25         Calcitonin                       4.3
Pancreatic ribonuclease          21         Neurophysin 2                    3.6
Lipotropin beta                  21         Plastocyanin                     3.5
Haptoglobin alpha chain          20         Lactate dehydrogenase            3.4
Serum albumin                    19         Adenylate cyclase                3.2
Phospholipase A2                 19         Triosephosphate isomerase        2.8
Protease inhibitor, PST1 type    18         Vasoactive intestinal peptide    2.6
Prolactin                        17         Corticotropin                    2.5
Pancreatic hormone               17         Glyceraldehyde 3-P DH            2.2
Carbonic anhydrase C             16         Cytochrome C                     2.2
Lutropin alpha chain             16         Plant ferredoxin                 1.9
Hemoglobin alpha chain           12         Collagen                         1.7
Hemoglobin beta chain            12         Troponin C, skeletal muscle      1.5
Lipid binding protein A II       10         Alpha crystallin B chain         1.5
Gastrin                          9.8        Glucagon                         1.2
Animal lysozyme                  9.8        Glutamate DH                     0.9
Myoglobin                        8.9        Histone H2B                      0.9
Amyloid A                        8.7        Histone H2A                      0.5
Nerve growth factor              8.5        Histone H3                       0.14
Acid proteases                   8.4        Ubiquitin                        0.1
Myelin basic protein             7.4        Histone H4                       0.1

(a) Percent per 100 My.
From Nei (1987) and Dayhoff et al. (1978).

Some terminology

In evolution, homology (here of proteins) means similarity due to common ancestry.
A common mode of protein evolution is by duplication. Depending on
the relations between duplication and speciation dates, we have two different types of homologous proteins. Loosely:

Orthologues: the "same" gene in different organisms; common ancestry goes back to a speciation event.
Paralogues: different genes in the same organism; common ancestry goes back to a gene duplication.

Lateral gene transfer gives another form of homology.

Beta globins (orthologues)

[Figure: multiple alignment of beta-globin (BG) sequences from human, macaque, bovine, platypus, chicken and shark, with position markers every 10 residues. One symbol marks a residue that is the same as in the reference sequence; another marks a deletion.]

Beta globins: uncorrected pairwise distances

DISTANCES between protein sequences, calculated over 1 to 147.
Below diagonal: observed number of differences.
Above diagonal: number of differences per 100 amino acids.

hum mac bov pla chi …
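The layout of the uncorrected distance matrix described above (raw difference counts below the diagonal, differences per 100 amino acids above it) can be illustrated with a minimal Python sketch. The three short sequences are made-up toy data, not the beta-globin alignment from the slide, and the function names are hypothetical. Skipping alignment columns that contain a gap is an assumption; the slide does not say how deletions were counted.

```python
def pairwise_differences(seq_a, seq_b, gap="-"):
    """Count differing sites between two aligned sequences.

    Returns (differences, compared_sites). Columns where either sequence
    has a gap are skipped -- an assumption, not something the slide states.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    diffs = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            diffs += 1
    return diffs, compared


def uncorrected_distance_matrix(seqs):
    """Build a square matrix: counts below the diagonal, percent above it."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            diffs, compared = pairwise_differences(seqs[i], seqs[j])
            if compared == 0:
                raise ValueError("no comparable sites between sequences")
            matrix[j][i] = diffs                      # below diagonal: raw count
            matrix[i][j] = 100.0 * diffs / compared   # above diagonal: per 100 sites
    return matrix


# Toy aligned sequences (hypothetical, for illustration only).
toy_names = ["s1", "s2", "s3"]
toy_seqs = ["MVHLTPEEKS", "MVHLTADEKS", "MV-LSAEEKA"]
toy_matrix = uncorrected_distance_matrix(toy_seqs)

for name, row in zip(toy_names, toy_matrix):
    print(name, " ".join(f"{value:6.1f}" for value in row))
```

These percent differences are the "observed" values that, as noted earlier in the lecture, still need to be adjusted before they can be treated as roughly linear in time.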
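The theoretical lookback times in the Doolittle (1986) table can be reproduced from the rates with simple arithmetic. The sketch below is in Python. It assumes that two lineages diverging from a common ancestor each accumulate change at the tabulated rate, so the distance between them grows at twice that rate. This factor of two is an inference from the table's numbers and is not stated in the slides. The function name is hypothetical. The 360 PAMs threshold comes from the table's footnote.

```python
USEFUL_LOOKBACK_PAMS = 360  # "useful lookback time = 360 PAMs" (table footnote)


def lookback_time_my(rate, lineages=2):
    """Time, in millions of years, for sequences to drift 360 PAMs apart.

    rate     -- PAMs per 100 residues per 10^8 years (as in the table)
    lineages -- number of diverging lineages accumulating change;
                2 is an assumption inferred from the tabulated values.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # rate is per 100 My, so multiply by 100 to express the answer in My.
    return 100.0 * USEFUL_LOOKBACK_PAMS / (lineages * rate)


# Rates copied from the Doolittle (1986) table.
table_rates = {
    "Pseudogenes": 400,
    "Fibrinopeptides": 90,
    "Lactalbumins": 27,
    "Hemoglobins": 12,
    "Glutamate dehydrogenase": 1,
}

for protein, rate in table_rates.items():
    print(f"{protein:28s} {lookback_time_my(rate):10.0f} My")
```

Under this assumption the computed values agree with the tabulated ones after rounding, for example 45 My for pseudogenes and 1500 My (1.5 billion years) for hemoglobins.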