Roopal Sampat
Biochemistry 118Q
Professor Doug Brutlag
3/8/99

Probabilistic Approaches to Predicting the Secondary Structure of Proteins

Today's increasingly unaffordable medical treatment forces genomic research to have far-reaching consequences. Most members of the public do not realize that the genetic sequence not only encodes information about hereditary make-up but also contains the necessary blueprints for the structural formation of essential proteins. As the central dogma of molecular biology declares, DNA is transcribed into RNA, which is then translated into amino acids that make up proteins. Malfunctions in these proteins result in phenotypes that may be classified as diseases. As health care functions today, doctors assess symptoms, resulting in a diagnosis for a disease. Physicians must make educated guesses based upon the symptoms and run a series of tests, a process that may sometimes prove impractical or extremely expensive.

Bioinformatics has emerged as providing a new perspective for the treatment of genetically inherited diseases. The central paradigm of bioinformatics states that genetic information can be used to predict the molecular structure of proteins, and the function of these proteins can then be determined, providing a cause for symptoms of a disease. If the structure and function of every protein encoded by DNA were known, the underlying causes of symptoms could be easily pinpointed. Elucidating these structures, however, is a process that could occupy scientists for hundreds of years. As a result, much research has been and continues to be done regarding the prediction of secondary structures of proteins based upon determined amino acid sequences.

X-ray crystallography has been the traditional method for determining the structure of a protein. Protein samples are crystallized and a fine beam of X-rays is targeted at them. The X-ray diffraction detected is then used to generate a model of the electron density of the protein. Several
disadvantages, however, exist to using X-ray crystallography. First of all, the crystallization of proteins is usually a difficult and time-consuming process that requires a great deal of skill. Secondly, X-ray diffraction provides a static model of protein structure, with atoms and molecules mapped in fixed space. Although this representation is useful, proteins do not usually acquire a fixed structure and instead are continuously bending and shifting, characteristics that may be crucial to the function of the protein. Thirdly, the time needed to crystallize and X-ray, much less identify, every single protein that is encoded by the genetic sequence could span centuries of work. As a result, scientists would prefer to be able to accurately predict structure rather than actually determine it.

The prediction of protein structure from the amino acid sequence is a work in progress. Scientists are cataloguing and using the known structures of thousands of proteins to help them through this process. The Protein Data Bank, or PDB, is maintained through Brookhaven National Laboratory. As of March 3, 1999, the PDB holds 9419 coordinate entries, of which 8751 are proteins, 656 are nucleic acids, and 12 are carbohydrates (Protein Data Bank). These structures are classified into groups, the most general of which is the Class. Major structural similarities place proteins in the same Fold category; some degree of sequence similarity implies a probable common ancestry and puts proteins in the same Superfamily; and greater than 25 percent sequence similarity demonstrates a clear evolutionary ancestry, which places proteins in the same Family (Brutlag lecture 2/1). The classification of proteins into such groups aids in understanding and attempting to predict protein structures by allowing easy observation of, and comparisons between, patterns in amino acid sequences.

The Asilomar conferences of 1994 and 1996 discussed four approaches to secondary structure prediction. The first is homology modeling. Two proteins
are generally agreed to have the same structure if their sequences are 25-30 percent homologous (Brutlag lecture 2/1). This approach utilizes knowledge of a closely related protein to predict the structure of a protein in question. If the sequences and/or structures of no closely related proteins are known, however, ab initio prediction appeals as a second approach. Ab initio methods attempt to predict secondary structure through knowledge of only the amino acid sequence of the protein in question (Altman lecture). One ab initio method that has been worked on is determining the lowest energy configuration possible, found through hidden Markov models and computer modeling using the given sequence of amino acids. Such an approach, however, has not proven successful beyond predicting the secondary structure of small proteins, because naturally occurring proteins often do not exist in their minimum energy configuration for reasons that may or may not be known (Brutlag lecture 2/4). For proteins that have some 25 percent sequence homology with known structures, a third approach to structure prediction is taken. Fold recognition utilizes knowledge of existing structures to hypothesize whether or not new sequences could acquire such structures (Altman lecture). Predicting how two proteins fit together is done by the fourth approach, protein docking. The geometry of the physical association between two proteins is predicted by studying the surface-to-surface interactions to determine the best way in which they would fit together. Homology and ab initio are the two current methods that will be concentrated on as efforts towards protein structure prediction.

Most efforts at predicting secondary structure concentrate on predicting the state of an amino acid in the center of a local window of residues (Schmidler). Because the twenty amino acids do not occur in equal distribution in proteins, the beginnings of structure prediction attempted to utilize the frequency of an amino acid's occurrence in
different conformations. For example, proteins usually have low levels of methionine and tryptophan and higher levels of leucine and serine (Stryer). In particular, however, the amino acids do not have the same proportions in the regions of a protein that form a secondary structure as they do in the protein overall. The side chains on the amino acids can either promote or hinder secondary structure formation. Proline disrupts helical structure because it has no hydrogen on its backbone nitrogen, prohibiting it from
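
The observation that residue proportions in secondary-structure regions differ from those in the protein overall can be expressed as a simple ratio of frequencies. What follows is only a minimal sketch of that idea, not the method of any lecture or paper cited here: the function name `propensities`, the toy sequence, and the state labels (`H` for helix, `C` for everything else) are all invented for illustration.

```python
from collections import Counter


def propensities(sequence, states, target="H"):
    """For each residue type, divide its frequency inside `target` regions
    by its frequency in the whole sequence.

    A value above 1 means the residue is over-represented in the target
    state; a value of 0 means it never appears there in this data.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")

    overall = Counter(sequence)
    in_target = Counter(aa for aa, s in zip(sequence, states) if s == target)

    n_total = len(sequence)
    n_target = sum(in_target.values())
    if n_target == 0:
        raise ValueError("no residues are labeled with the target state")

    return {
        aa: (in_target[aa] / n_target) / (overall[aa] / n_total)
        for aa in overall
    }


# Hypothetical, hand-made data: one-letter residue codes with a state label
# per residue. These are not taken from any real protein structure.
toy_sequence = "AELKAPGSAELLKPGS"
toy_states = "HHHHHCCCHHHHHCCC"

toy_propensities = propensities(toy_sequence, toy_states)
for residue, value in sorted(toy_propensities.items()):
    print(residue, round(value, 2))
```

In this made-up example proline never appears in a helical position, so its ratio is 0, while alanine and leucine appear only in helical positions and score above 1. Real propensity tables would be estimated from many known structures, such as those catalogued in the PDB, rather than from a single short sequence.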