Brandeis BCHM 104A - More than the sum of their parts

More than the sum of their parts: on the evolution of proteins from peptides

Johannes Söding and Andrei N. Lupas*

Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen, Germany.
*Correspondence to: Andrei Lupas, Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Spemannstr. 35, D-72076 Tübingen, Germany. E-mail: [email protected]
DOI 10.1002/bies.10321
Published online in Wiley InterScience (www.interscience.wiley.com).

Summary

Despite their seemingly endless diversity, proteins adopt a limited number of structural forms. It has been estimated that 80% of proteins will be found to adopt one of only about 400 folds, most of which are already known. These folds are largely formed by a limited ‘vocabulary’ of recurring supersecondary structure elements, often by repetition of the same element and, increasingly, elements similar in both structure and sequence are discovered. This suggests that modern proteins evolved by fusion and recombination from a more ancient peptide world and that many of the core folds observed today may contain homologous building blocks. The peptides forming these building blocks would not in themselves have had the ability to fold, but would have emerged as cofactors supporting RNA-based replication and catalysis (the ‘RNA world’). Their association into larger structures and eventual fusion into polypeptide chains would have allowed them to become independent of their RNA scaffold, leading to the evolution of a novel type of macromolecule: the folded protein. BioEssays 25:837–846, 2003. © 2003 Wiley Periodicals, Inc.

Introduction

Proteins are the central agents of life and their evolution is the object of intense study. An important reason for this interest lies in the extent to which we use inference from homology to explore life, based on the study of model systems. Particularly in molecular biology, searches for homologous relationships based on sequence similarity have become a routine step to gain clues about the function of a new gene.(1)

Proteins are enormously diverse. Estimates of the number of species on earth run into the millions and each species contains thousands of protein-coding genes. Though superficially different, these proteins often display substantial similarity in sequence and three-dimensional structure, since many are derived from a basic complement of autonomously folding units (domains). This allows us to group proteins into a hierarchy of families, superfamilies, and folds. The basic complement of domains was already established to a large extent at the time of the ‘last common ancestor’,(2) but some very successful domains arose later within the bacteria, archaea, or eukaryotes and radiated into the other kingdoms by endosymbiosis or lateral transfer.

Proteins change in the succession of generations through random drift and natural selection (the molecular clock).(3) Most frequently, changes result from point mutations (which very rarely change the overall structure of the protein significantly, but see Refs. 4 and 5 for exceptions), insertions and deletions. By these processes, proteins may become so dissimilar that their common origin cannot be detected from their sequences, even though they may still fulfill fundamentally the same function. However, their structures diverge much more slowly, providing evidence of common ancestry long after their sequence similarity has decayed.

The protein complement of an organism is the result of parental inheritance, acquisition (through lateral transfer, viruses, or mobile elements) and duplication. Duplication is central to the diversification of proteins. At the level of full genomes, duplication is an effective path to increased complexity, which has been taken repeatedly in the course of evolution.(6–8) At the level of operons, duplication may lead to the efficient evolution of novel pathways. At the level of single genes, duplication allows the emergence of systems with complex functionality, such as the vertebrate olfactory system, which is built on thousands of homologous G-protein-coupled receptors. In each of these cases, the duplicated copies are freed from the selective pressure to maintain function and in fact come under pressure to assume a novel selectable function in order to avoid extinction through mutational inactivation.(9)

Duplication, accompanied by gene fusion, is also essential for a variety of other processes that result in the generation of novel proteins, such as unequal recombination,(10) circular permutation,(12,13) and domain shuffling.(15) Unequal recombination is the primary mechanism that gives rise to repetitive proteins;(10,11) an extreme case is the giant muscle protein, titin, which consists of hundreds of immunoglobulin domains. Circular permutation is the process by which N- and C-terminal deletions in a duplicated protein can result in a structure that appears to have its C-terminal part permuted to the N terminus. The importance of circular permutation for protein evolution can be appreciated from the fact that at least 412 out of 3035 domains in proteins of known structure arose by circular permutation.(14) Finally, domain shuffling(15) is the main mechanism for the rapid generation of novel domain combinations. In eukaryotes, this mechanism enabled the burst of creativity in protein evolution, which accompanied metazoan radiation during the Cambrian and yielded many novel proteins specific for multicellular organisms by combining a limited set of modular domains.(16) For example, the vertebrate immune system uses a handful of domain types in nearly endless variations in order to satisfy the extremely complex requirements of self–nonself recognition. Ironically, prokaryotes use the same mechanism to produce the variability in their surface proteins required to evade the immune system. An important effect of domain shuffling is that proteins that are not homologous globally may well contain homologous domains. For this reason, protein classification schemes build on domains, not on entire proteins.

Domain classification

The sequences and structures of domains reflect the evolutionary events that shaped them and retain the traces of their common ancestry. This is the basis of their classification into families and superfamilies(17,18) in a way analogous to the classification of organisms into genera and orders. Superfamilies are further grouped into

