CU-Boulder PHYS 7450 - Second Codon Positions of Genes

Second codon positions of genes and the secondary structures of proteins. Relationships and implications for the origin of the genetic code

Maria Luisa Chiusano (a,b), Fernando Alvarez-Valin (c), Massimo Di Giulio (d), Giuseppe D'Onofrio (a), Gaetano Ammirato (b), Giovanni Colonna (b), Giorgio Bernardi (a,*)

(a) Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Villa Comunale, I-80121 Naples, Italy
(b) Centro di Ricerca Interdipartimentale di Scienze Computazionali e Biotecnologiche, Seconda Università, via Costantinopoli 16, 80138 Naples, Italy
(c) Sección Biomatemática, Facultad de Ciencias, Iguá 4225, Montevideo 11400, Uruguay
(d) International Institute of Genetics and Biophysics, CNR, via G. Marconi 10, 80125 Naples, Italy

Accepted 30 October 2000
Received by T. Gojobori

Abstract

The nucleotide frequencies in the second codon positions of genes are remarkably different for the coding regions that correspond to different secondary structures in the encoded proteins, namely, helix, β-strand and aperiodic structures. Indeed, hydrophobic and hydrophilic amino acids are encoded by codons having U or A, respectively, in their second position. Moreover, the β-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. The relationship between nucleotide frequencies and protein secondary structures is associated not only with the physico-chemical properties of these structures but also with the organisation of the genetic code. In fact, this organisation seems to have evolved so as to preserve the secondary structures of proteins by preventing deleterious amino acid substitutions that could modify the physico-chemical properties required for an optimal structure. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Genetic code; Nucleotide frequencies; Protein secondary structures; Second codon positions

1. Introduction

Different secondary structures of proteins exhibit remarkable differences in amino acid frequencies (Szent-Gyorgyi and Cohen, 1957; Guzzo, 1965; Havsteen, 1966; Prothero, 1966; Cook, 1967; Goldsack, 1969; Chou and Fasman, 1974; Levitt, 1978). Some amino acids are, indeed, more prone to be present in specific secondary structures while others tend to disrupt them (Bahar et al., 1997). Propensities of amino acids for a secondary structure correlate with their physico-chemical properties. These properties have provided the basic information used in prediction methods. Protein secondary structures reflect the physico-chemical properties of the most frequent amino acids in those structures. For example, the β-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. Therefore, constraints on the secondary and tertiary structures tend to limit accepted mutations to those in which an amino acid is replaced by another amino acid with similar properties (Epstein, 1967; Grantham, 1974; Chirpich, 1975; Zhang, 2000).

Several investigations have addressed the possible correlation between the nucleotides at each codon position and the properties of amino acids (Goodman and Moore, 1977; Wolfenden et al., 1979; Sjostrom and Wold, 1985; Bernardi and Bernardi, 1986; Taylor and Coates, 1989; Tolstrup et al., 1994). In particular, hydrophobic amino acids are encoded by codons having U in the second position, while hydrophilic amino acids are encoded by triplets with A in the second position. However, previous attempts to link protein secondary structures to the organisation of the genetic code (see Di Giulio, 1996) have been unsuccessful (Salemme et al., 1977; Goodman and Moore, 1977), except in the case of β-turns (Jurka and Smith, 1987) and β-strands (Di Giulio, 1996). Recently Gupta et al.
(2000) have reported that the average frequencies of U and A at the second codon position are markedly different between α-helix and β-strand, but these authors did not perform any more detailed analysis.

The present work shows that the nucleotide distributions in second codon positions are strongly related to the average physico-chemical properties of protein secondary structure, and this relationship sheds light on the origin of the genetic code.

Gene 261 (2000) 63–69
0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0378-1119(00)00521-7
www.elsevier.com/locate/gene
* Corresponding author. Tel.: +33-081-5833215; fax: +39-081-5833402.
E-mail address: [email protected] (G. Bernardi).

2. Materials and methods

Two data sets (available upon request) of experimentally determined structures comprising 77 human and 232 prokaryotic proteins were used in order to investigate the relationships between base composition of coding sequences and the secondary structures of the encoded proteins. Complete coding region information was available only for the human data set (Adzhubei et al., 1998). Thus, serine was excluded from the analysis in the case of prokaryotic proteins, because of its ambiguity in the second codon position (UCN or AGY). The secondary structures, assigned by the DSSP program (Kabsch and Sander, 1983), were described in terms of β-strand, helix (including 3₁₀ helices and α-helices), and aperiodic structure (including the turn structure and the protein segments that are not defined and/or lack periodicity). The average hydrophobicity levels, calculated by the Gravy scale (Kyte and Doolittle, 1982), and molecular weights of the amino acids in each of the three structures were also calculated.

3. Results

In Table 1, we report the mean values of the nucleotide frequencies at second codon positions in human and prokaryotic proteins in coding regions corresponding to different secondary structures.

The three structures show marked differences in the frequency of U in second codon position (U2), the aperiodic structure showing the lowest values, the β-strand structure the highest ones in both groups of organisms. A2 is also different among the three structures, with higher values in helix and aperiodic structures. G2 and C2 have consistently lower values in all the three structures compared to A2 and U2, with higher figures in aperiodic structure in comparison to both helix and β-strand structures.

The differences in nucleotide frequency in the second codon positions can be accounted for by the different amino acid composition in the three structures (see Table 2). As expected, the amino acids have different propensities for each structure. Interestingly, all amino acids having U in the second positions exhibited a clear hierarchy, the
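
The two quantities used in the Materials and methods and Results sections (nucleotide frequencies at second codon positions per secondary structure, and average Gravy hydrophobicity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the single-letter structure labels (H for helix, E for β-strand, C for aperiodic) and the toy sequence are assumptions made only for illustration. The hydropathy values are those of the Kyte and Doolittle (1982) scale.

```python
from collections import Counter, defaultdict

# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def second_position_freqs(cds, labels):
    """Return {label: {nucleotide: frequency}} for second codon positions.

    cds    -- coding sequence (DNA or RNA letters), three letters per codon
    labels -- one structure label per codon (hypothetical codes: H, E, C)
    """
    cds = cds.upper().replace("T", "U")
    if len(cds) != 3 * len(labels):
        raise ValueError("need exactly one structure label per codon")
    counts = defaultdict(Counter)
    for i, label in enumerate(labels):
        counts[label][cds[3 * i + 1]] += 1  # middle base of codon i
    freqs = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        freqs[label] = {nt: counter[nt] / total for nt in "UCAG"}
    return freqs


def gravy(protein):
    """Average Kyte-Doolittle hydropathy of an amino acid sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


# Toy example (invented data): codons for Val-Ile-Val labelled as strand,
# followed by Lys-Asp-Glu labelled as aperiodic.
cds = "GUUAUUGUGAAAGAUGAA"
labels = "EEECCC"
freqs = second_position_freqs(cds, labels)
print(freqs["E"])    # second positions of the strand codons
print(freqs["C"])    # second positions of the aperiodic codons
print(gravy("VIV"), gravy("KDE"))
```

In this toy case the hydrophobic residues all carry U in the second codon position and the hydrophilic ones carry A, which mirrors the pattern described in the Introduction. Real data would need structure assignments (for example from DSSP) mapped onto each codon of the coding sequence.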

