Course topics

Pairwise sequence alignment (global and local)
Multiple sequence alignment (local, global)
Substitution matrices
Database searching (BLAST)
Sequence statistics
Evolutionary tree reconstruction
RNA structure prediction
Gene finding
Protein structure prediction
Computational genomics…

Global Multiple Sequence Alignment

HUMAN  MKWVTFISLL FLFSSAYSRG V..FRRDA.H KSEVAHRFKD LGEENFKALV
RABIT  MKWVTFISLL FLFSSAYSRG V..FRREA.H KSEIAHRFND VGEEHFIGLV
PIG    ~~WVTFISLL FLFSSAYSRG V..FRRDT.Y KSEIAHRFKD LGEQYFKGLV
CHICK  MKWVTLISFI FLFSSATSRN LQRFARDAEH KSEIAHRYND LKEETFKAVA

Align k sequences so that the residues in each column share a property of interest:
- a common ancestor
- a structural or functional role

Applications of global multiple alignment

• Protein structure and function
• RNA structure
• Evolutionary tree reconstruction

[Figure: PDZ domain]

A MULTIPLE SEQUENCE ALIGNMENT EXAMPLE

Ribosome: an RNA/protein complex
rpS14: a ribosomal protein in yeast

Goal: Determine the residues responsible for binding rpS14 to ribosomal RNA

Known:
- Sequence of rpS14
- Structure of the homolog in bacteria
- Sequences in many species

(Pam Bush, PhD CMU, 02)

Strategy: alanine scan
- Find "likely" candidate amino acids
- Replace the candidates with alanine
- Hope that
  • Alanine preserves structure
  • Alanine will destroy binding

Seventeen residues are conserved in rpS14

T. thermophilus  ------------MAK KPSKKKVKRQVASGR AYIHASYNNTIVTIT DPDGNPITWSSGGVI GYKGSR-KGTPYAAQ
A. aeolicus      --------------M AKKKKKQKRQVTKAI VHIHTTFNNTIVNVT DTQGNTIAWASGGTV GFKGTR-KSTPYAAQ
P. aeruginosa    ----------MAKPA ARPRKKVKKTVVDGI AHIHASFNNTIVTIT DRQGNALSWATSGGS GFRGSR-KSTPFAAQ
E. coli          ----------MAKAP IRARKRVRKQVSDGV AHIHASFNNTIVTIT DRQGNALGWATAGGS GFRGSR-KSTPFAAQ
H. sapiens       MAPRKGKEKKEEQVI SLGPQVAEGENVFGV CHIFASFNDTFVHVT DLSGKETICRVTGGM KVKADRDESSPYAAM
D. melanogaster  MAPRKAKVQKEEVQV QLGPQVRDGEIVFGV AHIYASFNDTFVHVT DLSGRETIARVTGGM KVKADRDEASPYAAM
S. pombe         ------------MAT NVGPQIRSGELVFGV AHIFASFNDTFVHIT DLTGKETIVRVTGGM KVKTDRDESSPYAAM
S. cerevisiae    -------------MA NDLVQARDNSQVFGV ARIYASFNDTFVHVT DLSGKETIARVTGGM KVKADRDESSPYAAM
S. solfataricus  --------------- ----MSSRREIRWGI AHIYASQNNTLLTIS DLTGAEIISRASGGM VVKADREKSSPYAAM
M. jannaschii    --------------- ----MAEQKKEKWGI VHIYSSYNNTIIHAT DITGAETIARVSGGR VTRNQRDEGSPYAAM

T. thermophilus  LAALDAAKKAMAYGM QSVDVIVRG------ --TGAGREQAIRALQ ASGLQVKSIVDDTPV PHNGCRPKKKFRKAS-
A. aeolicus      LAAQKAMKEAKEHGV QEVEIWVKG------ --PGAGRESAVRAVF ASGVKVTAIRDVTPI PHNGCRPPARRRV---
P. aeruginosa    VAAERAGQAALEYGL KNLDVNVKG------ --PGPGRESAVRALN ACGYKIASITDVTPI PHNGCRPPKKRRV---
E. coli          VAAERCADAVKEYGI KNLEVMVKG------ --PGPGRESTIRALN AAGFRITNITDVTPI PHNGCRPPKKRRV---
H. sapiens       LAAQDVAQRCKELGI TALHIKLRATGGNRT KTPGPGAQSALRALA RSGMKIGRIEDVTPI PSDSTRRKGGRRGRRL
D. melanogaster  LAAQDVAEKCKTLGI TALHIKLRATGGNKT KTPGPGAQSALRALA RSSMKIGRIEDVTPI PSDSTRRKGGRRGRRL
S. pombe         LAAQDAAAKCKEVGI TALHIKIRATGGTAT KTPGPGAQAALRALA RAGMRIGRIEDVTPI PTDSTRRKGGRRGRRL
S. cerevisiae    LAAQDVAAKCKEVGI TAVHVKIRATGGTRT KTPGPGGQAALRALA RSGLRIGRIEDVTPV PSDSTRKKGGRRGRRL
S. solfataricus  LAANKAASDALEKGI MALHIKVRAPGGYGS KTPGPGAQPAIRALA RAGFIIGRIEDVTPI PHDTIRRPGGRRGRRV
M. jannaschii    QAAFKLAEVLKERGI ENIHIKVRAPGGSGQ KNPGPGAQAAIRALA RAGLRIGRIEDVTPV PHDGTTPKKRFKK---

Criteria for selecting conserved amino acids:
- Conserved among all three phylogenetic groups
- Conserved in at least 90% of the sequences analyzed, allowing for the conservative substitution K↔R
- No residue could be an alanine, proline or glycine

Eight conserved residues are on the surface of the bacterial protein interacting with the rRNA, and these are distributed throughout the protein sequence (see the rpS14 alignment above).

[Figure: structure of the bacterial protein with the labeled residues N25, K50, R53, R85, V119, T120, R128, R134]

About the changes to alanine of the conserved residues N25, K50, R53, R85, V119, T120, R128, R134:
- Like most ribosomal proteins, rpS14 has a high percentage of basic residues
- Changing a basic residue to alanine is a large change in charge
- Alanine substitutions on the surface usually do not cause a large structural change

The predicted secondary structure of rpS14 does not change significantly when the conserved residues are changed to alanine.
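
The three selection criteria listed earlier (conserved in all three phylogenetic groups, conserved in at least 90% of sequences with K↔R allowed, and not alanine, proline or glycine) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the procedure used in the study: the function name conserved_columns, the group labels, and the toy alignment are hypothetical, and "conserved among all three groups" is read here as "at least one sequence in every group carries the residue".

```python
from collections import Counter

EQUIVALENT = {"R": "K"}   # treat K and R as one class (the K<->R allowance)
EXCLUDED = set("APG")     # alanine, proline, glycine are not candidates
GAPS = set("-.~")         # gap symbols seen in the slide alignments


def conserved_columns(alignment, groups, threshold=0.9):
    """Return (column_index, residue_class) pairs that pass all three criteria.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    groups:    dict mapping sequence name -> phylogenetic group label
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("aligned sequences must have equal length")
    all_groups = {groups[n] for n in names}

    hits = []
    for col in range(length):
        # Collapse K/R into one class; everything else is kept as is.
        column = {n: EQUIVALENT.get(alignment[n][col], alignment[n][col])
                  for n in names}
        counts = Counter(r for r in column.values() if r not in GAPS)
        if not counts:
            continue  # column contains only gaps
        residue, count = counts.most_common(1)[0]
        if residue in EXCLUDED:
            continue  # criterion 3: skip A, P, G
        if count / len(names) < threshold:
            continue  # criterion 2: fraction of all sequences
        # Criterion 1 (assumed reading): every group shows the residue.
        if {groups[n] for n in names if column[n] == residue} != all_groups:
            continue
        hits.append((col, residue))
    return hits


# Hypothetical toy alignment, not taken from the rpS14 data.
toy = {
    "bact1": "MKAHT",
    "bact2": "MRAHT",
    "euk1":  "MKGHS",
    "arch1": "-KAHT",
}
toy_groups = {"bact1": "bacteria", "bact2": "bacteria",
              "euk1": "eukaryote", "arch1": "archaea"}

print(conserved_columns(toy, toy_groups))  # [(1, 'K'), (3, 'H')]
```

In the toy example, column 0 fails because the archaeal sequence has a gap, column 2 is dropped because its consensus is alanine, and column 4 fails because the eukaryotic sequence differs. Column indices are 0-based positions in the alignment, not residue numbers in any one protein, so mapping a hit back to a label such as R85 would need a separate gap-aware numbering step.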