Evolutionary Rate Heterogeneity in Proteins with Long Disordered Regions

Outline
• Protein Evolution
  – Patterns of Evolution
  – Rates of Evolution
• Disordered Protein

[Figure: DNA (A, G, C, T) → RNA (A, G, C, U) → Protein (20 amino acids) → Phenotype; example protein: Serine-Threonine Phosphatase]

Mutations vs Substitutions
Not all mutations persist over time
• Selection
• Random loss

Detecting Substitutions
hADH-G CCTCCTAAGGCTCATGAAGTTCGCATTAAG
hADH-A CCTCCTAAGGCTCATGAAGTTCGCATTAAG
hADH-B CCTCCTAAGGCTTATGAAGTTCGCATTAAG
bADH-B CCTCCTAAGGCTTATGAAGTTCGCATTAAG

Change in Protein Sequence
hADH-G PPKAHEVRIK
hADH-A PPKAHEVRIK
hADH-B PPKAYEVRIK
bADH-B PPKAYEVRIK

Patterns of Evolution

Modeling Protein Evolution
hADH-A PPKAHEVRIK
hADH-B PPKAYEVRIK
• Substitution Matrices
[Figure: Serine-Threonine Phosphatase]

Uses of Substitution Matrices
• Aligning sequences
• Finding sequence homologues
• Calculating genetic distances

Genetic Distance
0.1
hADH-A PPKAHEVRIK
hADH-B PPKAYEVRIK

Substitution Matrices
[Figure: substitution-matrix example; extracted labels: R E M A R D P V, scores 26 2 -2 1, "D E=R"; layout not recoverable]

Rates of Evolution
• Substitution rate heterogeneity among proteins
• Substitution rate heterogeneity within proteins

Variation among Proteins
[Figure: x-axis: millions of years since divergence of species being compared. Source: Dickerson, 1971, J. Mol. Evol. 1:26]

Variation within Proteins
[Figure: Estrogen Receptor. Source: Li, 1997, Molecular Evolution]

Disordered Protein
Protein with no fixed secondary and/or tertiary structure that occurs as an ensemble of structures
[Figure: Lac Repressor DNA binding domain]

Differences in Amino Acid Composition
[Figure: (Disorder - Order) / Order, from -1 to 1, for each amino acid in the order W C F I Y V L H M A T R G Q S N P D E K; series: dis XRAY (2844 aa), dis NMR (4019 aa), dis CD (10554 aa)]

Disorder and Function
Category                          Description
Molecular Recognition             Inter- and Intra-protein, ssDNA, dsDNA, tRNA, rRNA, mRNA, nRNA, bilayers, ligands, co-factors, metals
Molecular Assembly / Disassembly  Globular oligomers, linear polymers, hetero complexes, phages, viruses
Protein Modification              Acetylation, fatty acylation, glycosylation, methylation, phosphorylation, ADP-ribosylation, ubiquitination, proteolytic digestion
Entropic Chains                   Linkers, spacers, bristles, clocks, springs, detergents, self-transport
Unknown
Number column as extracted (values cannot be reliably assigned to rows): 17 36>13114

PONDR Disorder Estimates
[Figure: % of proteins (0 to 80) versus consecutive disorder predictions (>=30 to >=100); series: Cancer-associated proteins, Signaling proteins, SwissProt, O_PDB Select25]

Proteins with Disorder and Rate Heterogeneity
• Calcineurin
• Topoisomerase II
• Ribosomal Protein S4
• Voltage-gated potassium channels
• Flagellin

Creating Disordered Protein Families
1. Identify disordered protein (PDB, PubMed)
2. Identify homologous proteins (BLAST)
3. Align homologues (CLUSTALW)
4. Create aligned Order (O) and Disorder (D) sets

Calculating Pair-wise Genetic Distance
D =
0.08
0.25 0.29
0.35 0.38 0.42
(Lower triangle of the pairwise distance matrix for four sequences; the slide labels one entry D34.)

Calculating Average Differences, Δ
Calculate average difference between genetic distances
D =
0.08
0.25 0.29
0.35 0.38 0.42
O =
0.05
0.2 0.25
0.25 0.27 0.35
Δ = {(0.05 - 0.08) + (0.2 - 0.25) + (0.25 - 0.29) + (0.25 - 0.35) + (0.27 - 0.38) + (0.35 - 0.42)} / 6 = -0.07

Hypothesis Test
H0: Any given position in the alignment is as likely to be in a region of disorder as in a region of order

Random Assignment
hADH-G PPKAHEVRIK
hADH-A PPKAHEVRIK
hADH-B PPKAYEVRIK
bADH-B PPKAYEVRIK
P(D) = #Disordered residues / total
P(O) = 1 - P(D)
Order    Disorder
PPAHEK   KVRI
PPAHEK   KVRI
PPAYEK   KVRI
PPAYEK   KVRI

Sampling Distributions
[Figure: Frequency (0 to 900) versus Δ (-1 to 0.4); labeled: RPA, TBSV]

Disordered Protein Families
• 26 families with DR > 30 amino acids
  – 17 by X-ray crystallography
  – 6 by NMR
  – 1 by both X-ray and NMR
  – 2 by circular dichroism and limited proteolysis
• Family size from 4 to 80 sequences

Differences in Rates of Evolution
Disorder > Order  19
Disorder = Order  5
Disorder < Order  2

A Rapidly Evolving Disordered Region

Flexibility and Conservation
RNL AGEVGVKIGNPVPYNEGHAQQQAVSAPASAATPPASKPQPQNGSLGVGSTVAKAYGASKPFGKPAGTGLLQPTSGT
HSL AEAVGVKIGNPVPYNEGLGQPQVAPPAPAASPAASSRPQPQNGSSGMGSTVSKAYGASKTFGKAAGPSLSHTSGGT
NCL LGCPEKMGDPQPLGPRSAEPQQNPNLGSTGFYGVKSEPTQDTKPQFPRQMPSRNASGGQGSST
ATL ETIGNPTIFGETDTEAQKTFSGTGNIPPPNRVVFNEPMVQHSVNRAPPRGVNIQNQANNTPSFRPSVQPSYQPPASYRNHGPIMKNEA
OSL LEVVFKALDSEIKCEAEKQEEKPAILLSPKEESVVLSKPTNAPPLPPVVLKPKQEVKSASQIVNEQRGNAAPAARL
[Figure: J(0) (ns), from 0 to 10, versus residue position (1 to 90); series: HSL J(0) vn, RNL J(0) vn, OSL J(0) vn, ATL J(0) vn; label: Compact Globular Protein]

Neutral vs Purifying Selection
[Figure: per-position values from 0 to 1 along the aligned linker region; the x-axis tick labels were the aligned residues at each position]

Functions of Rapidly Evolving Disordered Regions
Category               Number  Proteins
Molecular Recognition  12      NF-κB, RGS4, Topo II, Calcineurin, cFos, Thyroid TF, F-tRNA synthetase, TBSV & SV coat proteins, Histone H5, Telomere BP
Protein Modification   3       Topo II, Gonadotropin, Bcl-xL
Entropic Chains        2       Replication protein A, Topo II
Unknown                4       G-tRNA synthetase, Sulfotransferase, Cytochrome BC1, DNA-lyase

Functions where Disorder = Order
Category               Number  Proteins
Molecular Recognition  3       Small HSP, Prion, SBMV coat protein
Autoregulatory         1       Glycine N-methyltransferase
Unknown                1       Epidermal growth factor

Functions of Slowly Evolving Disordered Regions
Category               Number  Proteins
Molecular Recognition  2       ssDNA BP, Flagellin
Molecular Assembly     1       Flagellin
Entropic Chains        1       ssDNA BP

Why was Order Faster?
• The ordered region of Flagellin is antigenic
• Multiple functions or specificity of binding of ssDNA BP?

Why was Disorder Faster?
• Differences in amino acid composition
• Low complexity sequences evolve rapidly, and often disordered proteins are low complexity
• Disorder has no function and therefore evolves at a neutral rate

Why was Disorder Faster?
• Fewer constraints because there are many ways to be
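
Sketch: computing Δ and the random-assignment sampling distribution

The average difference Δ and the random-assignment test described in the slides can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis used in the talk: it uses the uncorrected p-distance (fraction of differing sites) as a stand-in for a substitution-matrix-based genetic distance, it skips random draws that leave either set empty, and every function name is hypothetical.

```python
import random
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned sites that differ (assumed stand-in distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs):
    """Distances for every sequence pair, always in the same pair order."""
    return [p_distance(a, b) for a, b in combinations(seqs, 2)]


def average_difference(order_d, disorder_d):
    """Delta: mean of (order - disorder) over all sequence pairs."""
    return sum(o - d for o, d in zip(order_d, disorder_d)) / len(order_d)


def take_columns(alignment, columns):
    """Sub-alignment built from the given column indices."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


def delta_for_split(alignment, disorder_cols):
    """Delta when disorder_cols are 'disorder' and the rest are 'order'."""
    disorder_cols = sorted(set(disorder_cols))
    order_cols = [i for i in range(len(alignment[0])) if i not in disorder_cols]
    order_d = pairwise_distances(take_columns(alignment, order_cols))
    disorder_d = pairwise_distances(take_columns(alignment, disorder_cols))
    return average_difference(order_d, disorder_d)


def sampling_distribution(alignment, p_disorder, n_samples=1000, seed=0):
    """Delta values under H0: each column is 'disorder' with prob. P(D)."""
    rng = random.Random(seed)
    length = len(alignment[0])
    deltas = []
    while len(deltas) < n_samples:
        cols = [i for i in range(length) if rng.random() < p_disorder]
        if 0 < len(cols) < length:  # assumption: skip splits with an empty set
            deltas.append(delta_for_split(alignment, cols))
    return deltas


# Distance values from the "Calculating Average Differences" slide.
SLIDE_D = [0.08, 0.25, 0.29, 0.35, 0.38, 0.42]
SLIDE_O = [0.05, 0.2, 0.25, 0.25, 0.27, 0.35]
print(round(average_difference(SLIDE_O, SLIDE_D), 2))  # -0.07

# Toy alignment from the "Random Assignment" slide.
ALIGNMENT = ["PPKAHEVRIK", "PPKAHEVRIK", "PPKAYEVRIK", "PPKAYEVRIK"]
null_deltas = sampling_distribution(ALIGNMENT, p_disorder=4 / 10, n_samples=200)
```

One way to use the result, again as an assumption rather than the talk's stated procedure, is to compare the observed Δ for a family with `null_deltas` and report the fraction of sampled values at least as extreme as the observed one.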