Stanford BIO 118 - Study Notes - D2670615

Course Bio 118- Genetic Analysis of Biological Processes

Pages 9

Periodicities in Sequence Residue Hydropathy and the Implications on Protein Folds

Nancy Zhang
March 2000
Biochemistry 118

I. Introduction

The deterministic folding of a polypeptide sequence into its convoluted 3-D structure is one of the most fascinating applications of nature's laws. With the current growth in the size of protein sequence databases and the distribution of sequence analysis tools on the internet, the classic problem of predicting a protein's structure from its amino acid sequence is becoming increasingly important. Currently, discounting homologies of over 35% identity, there are over 40,000 identified protein sequences and yet only 4,200 experimentally determined protein structures. Being able to predict a protein's structure from its sequence is crucial to many fields of study, such as ligand-protein docking, as well as to the understanding of protein function at the molecular level.

The underlying hypothesis that motivates prediction efforts is that the complex packing arrangement of the main-chain and side-chain atoms of a folded protein is uniquely determined by two factors: its amino acid sequence and its folding environment. This has been supported by numerous experiments [1, 2] and is the foundation for current sequence analysis methods such as homology search, multiple sequence alignment, and motif identification. These methods revolve around the idea that if two proteins are similar in sequence, then the chances are high that they are similar in structure as well.

Despite decades of research, the accuracy of current methods is only around 60% [3]. One of the main problems limiting the success of current prediction algorithms is that there are hidden variables affecting the protein folding mechanism that are not explicitly accounted for in the algorithms. Nonlocal residue interactions are one of these hidden variables; to account for all such interactions would be impossible (more on this later). Solvent-chain interactions are another hidden variable, which many
prediction algorithms often neglect. It has been shown that the propensity of an amino acid for a certain secondary structure is environment dependent and, in particular, depends on its solvent accessibility [4, 5]. Yet since the solvent accessibility of a residue in the chain depends on the final folded structure, it is very hard to account explicitly and fully for the solvent effect in structure prediction algorithms.

Although it is very hard to model the global characteristics of an amino acid sequence through pairwise interactions between residues, it is possible to represent them as frequencies: periodic patterns that span the entire sequence. The discrete Fourier transform has been used to find such patterns in the hydropathic content of sequences, and distinct frequencies in hydrophobicity have been found to correlate strongly with certain secondary structural elements [6, 7]. The possibility that the two important hidden variables, the solvent effect and nonlocal interactions, may be better represented in the frequency domain inspired the content of this paper. Can we use sequence alignments in the frequency domain to predict structural similarities between proteins? Do two proteins that are similar in structure necessarily have similar peak patterns in their hydrophobicity plots?

In section II of the paper I will give a more detailed description of the solvent effect and explain why it is crucial to a protein's fold. In section III I will explain how the Fourier transform simplifies the task of representing global sequence characteristics and argue for the benefits of sequence analysis in the frequency domain. Finally, in sections IV and V I will describe the procedures and results of an experiment in which I tried to find a correlation between the structural distance of proteins and the distance between their frequency-domain hydropathy plots.

II. Solvent Effects on the Protein Folding Process

Although some protein structure prediction methods account for the hydropathic
characteristics of the amino acids in their scoring functions, it is not yet clear how to model solvent effects explicitly in fold prediction algorithms. However, studies have shown that the solvent plays a major role in the folding process. Like a ball rolling over hilly terrain, the folding chain continuously seeks a local minimum in conformational free energy, given by the equation

$$\Delta G = \Delta H - T\Delta S$$

In vacuo, the noncovalent binding energies between residues compete with chain entropy:

$$\Delta G_{\text{chain}} = \Delta H_{\text{chain}} - T\Delta S_{\text{chain}}$$

However, when the native aqueous environment of the protein is taken into account, the equation becomes much more complicated:

$$\Delta G_{\text{total}} = \Delta H_{\text{chain}} - T\Delta S_{\text{chain}} + \Delta H_{\text{solvent}} - T\Delta S_{\text{solvent}}$$

The following table [8] shows the relative magnitudes of each term for a folding chain in different environments.

[Table, from ref. 8. Columns: $\Delta G_{\text{total}}$, $\Delta G_{\text{transfer}}$, $-T\Delta S_{\text{chain}}$, $\Delta H_{\text{chain}}$, $-T\Delta S_{\text{solvent}}$, $\Delta H_{\text{solvent}}$. Rows: polypeptide chain in vacuum; nonpolar groups of chain in aqueous solvent. The entries are missing from this text preview.]

In the table, $\Delta G_{\text{transfer}}$ is the change in free energy on transferring a nonpolar side chain from water into the protein interior. It is clear that in an aqueous environment the free energy gained from the interaction between side chains and solvent, $\Delta G_{\text{transfer}}$, makes a large contribution to protein stability. Moreover, the interactions between chain and solvent are of utmost importance in protein folding, as shown by the fact that almost all proteins denature in ethanol or in aqueous urea [8].

The interaction between the peptide chain and the aqueous solvent depends on the hydropathic character of the residues in the chain. Amino acids with nonpolar side chains, such as methionine and valine, energetically prefer to reduce their contact with water, while those with charged and polar side chains generally prefer to be immersed in the aqueous solvent. Thus amino acids with hydrophobic side chains tend to be buried in the internal core of a globular protein, while those with hydrophilic side chains tend to reside on the surface. This tendency to minimize the accessible surface area of
hydrophobic groups and maximize that of hydrophilic groups is a major driving force in protein folding.

Various scales have been developed to measure the hydrophobicity or hydrophilicity of each of the twenty amino acids. Some scales, such as those of Janin [9] and Rose et al. [10], are derived by examining proteins of known 3-D structure and defining the hydrophobic character of an amino acid as its tendency to be in the protein core as opposed to on the surface, while others, such as that of Wolfenden et al
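
As a rough illustration of how a hydrophobicity scale and the discrete Fourier transform combine to expose periodic patterns in a sequence, the sketch below maps each residue to a hydropathy value and computes the power spectrum of the resulting profile. It is not the procedure used in this paper. The Kyte-Doolittle scale is an assumed choice made only for the example (the text above names the Janin, Rose et al. and Wolfenden et al. scales), and the function names `hydropathy_profile`, `power_spectrum` and `dominant_period` are hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values. This scale is an assumption made for
# illustration; any per-residue hydrophobicity scale could be substituted.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence):
    """Map a one-letter amino acid sequence to a list of hydropathy values.

    Raises KeyError for any character outside the twenty standard residues.
    """
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def power_spectrum(values):
    """Return (frequency, power) pairs for the mean-centred signal.

    Frequencies are in cycles per residue and run from 1/N to 1/2.
    A direct O(N^2) DFT is used to keep the example dependency-free.
    """
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


def dominant_period(sequence):
    """Period, in residues, of the strongest hydropathy oscillation."""
    spectrum = power_spectrum(hydropathy_profile(sequence))
    frequency, _ = max(spectrum, key=lambda pair: pair[1])
    return 1.0 / frequency


if __name__ == "__main__":
    # Toy sequence alternating hydrophobic (L) and hydrophilic (K) residues.
    print(dominant_period("LK" * 10))
```

On a toy sequence that strictly alternates hydrophobic and hydrophilic residues, the strongest peak falls at a period of two residues, the kind of pattern usually associated with a beta strand that has one face exposed to solvent. An amphipathic alpha helix would be expected to peak near 3.6 residues per cycle, although real sequences are short and noisy, so peaks are rarely this clean.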
