UMD CMSC 838T - Characterizing the microenvironment surrounding protein sites

Protein Science (1995), 4:622-635. Cambridge University Press. Printed in the USA. Copyright © 1995 The Protein Society

Characterizing the microenvironment surrounding protein sites

STEVEN C. BAGLEY AND RUSS B. ALTMAN
Section on Medical Informatics, Stanford University School of Medicine, MSOB X-215, Stanford, California 94305-5479

(RECEIVED November 4, 1994; ACCEPTED January 23, 1995)

Abstract

Sites are microenvironments within a biomolecular structure, distinguished by their structural or functional role. A site can be defined by a three-dimensional location and a local neighborhood around this location in which the structure or function exists. We have developed a computer system to facilitate structural analysis (both qualitative and quantitative) of biomolecular sites. Our system automatically examines the spatial distributions of biophysical and biochemical properties, and reports those regions within a site where the distribution of these properties differs significantly from control nonsites. The properties range from simple atom-based characteristics such as charge to polypeptide-based characteristics such as type of secondary structure. Our analysis of sites uses nonsites as controls, providing a baseline for the quantitative assessment of the significance of the features that are uncovered. In this paper, we use radial distributions of properties to study three well-known sites (the binding sites for calcium, the milieu of disulfide bridges, and the serine protease active site). We demonstrate that the system automatically finds many of the previously described features of these sites and augments these features with some new details. In some cases, we cannot confirm the statistical significance of previously reported features.
Our results demonstrate that analysis of protein structure is sensitive to assumptions about background distributions, and that these distributions should be considered explicitly during structural analyses.

Keywords: biophysical properties; calcium binding; computational biology; disulfide bridges; microenvironment; protein structure analysis; serine proteases; software

Reprint requests to: Russ B. Altman, Section on Medical Informatics, Stanford University School of Medicine, MSOB X-215, Stanford, California 94305-5479; e-mail: altman@camis.stanford.edu.

Central to molecular biology is the determination of macromolecular structure and the analysis of how structural elements produce an observed function. The principles by which structure relates to function have been elucidated in a piecemeal fashion, from work on single structures or small classes of structures. Computational assistance has come primarily in the form of graphical methods for scientific visualization and from special purpose programs for analyzing individual biophysical properties (such as solvent accessibility or electrostatic fields). Unfortunately, studying structures individually entails a risk of missing important relationships that would be revealed by pooling relevant data. The expected surfeit of protein structures provides an opportunity to develop tools for automatically examining biological structures and producing useful representations of the key biophysical and biochemical features. The utility of a general purpose system for producing these representations would extend from medical/pharmaceutical applications (model-based drug design, comparing pharmacological activities) to industrial applications (understanding structural stability, protein engineering).

In this paper we describe a computational tool for analyzing protein sites: microenvironments within a structure distinguished by their structural or functional roles.
We define a site as a region within a macromolecule with a central location and a surrounding neighborhood. In principle, a site could include the entire molecule, but we focus on sites that involve proper subsets of the molecule using a neighborhood with a 10-Å radius. Sites can be significant because of their structural role (for example, the site where a disulfide bond forms), their functional role (the active site of a serine protease), or both (the site of calcium binding). The most basic representation of a site is the set of atoms within it, along with their three-dimensional coordinates. We have created a system that augments this representation with the spatial distribution of user-defined properties. These properties include labels designating the types of atoms, chemical groups, amino acids, and secondary structures. They also include simple biophysical characteristics such as charge, polarity, mobility, and solvent accessibility.

The distribution of a property is computed by dividing the total volume of a site into subvolumes and computing the prevalence of the property within each of these subvolumes. Such distributions can be computed for sites, as well as for other microenvironments that are taken as nonsites. We have built the system on the assumptions that the key features of a microenvironment are defined with respect to a background distribution, and that the background distribution should be derived from the data, not from prior assumptions (such as spatial uniformity). The system therefore compares the distribution of the properties in sites (the positive examples) with the distribution of the same properties in user-specified nonsites (the negative examples, used as controls). Properties for which the site and nonsite distributions are different to a statistically significant degree are reported.
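The site-versus-nonsite comparison described above can be illustrated with a small sketch. The text here does not say which statistical test the system uses, so the two-sided permutation test on the difference of means below is a stand-in chosen for illustration, not the authors' method. The function names, the input layout (one row of per-subvolume property counts per example), and the `alpha` threshold are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(site_values, nonsite_values, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumption: this test is a stand-in; the source text does not name
    the test actually used by the system.
    """
    rng = random.Random(seed)
    observed = abs(mean(site_values) - mean(nonsite_values))
    pooled = list(site_values) + list(nonsite_values)
    n_site = len(site_values)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled values to "site" and "nonsite" groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_site]) - mean(pooled[n_site:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def significant_subvolumes(site_counts, nonsite_counts, alpha=0.01):
    """Return (subvolume index, p-value) pairs where sites differ from nonsites.

    site_counts[i][k] is the (hypothetical) count of one property in
    subvolume k of site example i; nonsite_counts has the same layout.
    """
    n_subvolumes = len(site_counts[0])
    flagged = []
    for k in range(n_subvolumes):
        site_k = [row[k] for row in site_counts]
        nonsite_k = [row[k] for row in nonsite_counts]
        p = permutation_p_value(site_k, nonsite_k)
        if p < alpha:
            flagged.append((k, p))
    return flagged
```

Calling `significant_subvolumes(site_counts, nonsite_counts)` with counts gathered from positive and negative examples returns only the subvolumes whose counts differ between the two groups at the chosen threshold. Because many subvolumes and properties are tested, a real analysis would likely also need a multiple-comparison correction, which this sketch omits.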
These statistically interesting properties should be considered preliminary hypotheses that allow an investigator to focus attention on regions that may be responsible for the particular structure or function of interest. The system may also find use in the testing and verification of predictions.

In this implementation, we have concentrated only on spherically symmetric, radial distributions (whereby the volume of a site is divided into concentric shell subvolumes) for three reasons.
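As a rough illustration of the concentric-shell subdivision, the sketch below counts atoms that carry a given property in each radial shell around a site center. The 10-Å radius comes from the text; the 1-Å shell width, the atom record layout (a dict with `element` and `xyz` keys), and the function name are assumptions made for this example only.

```python
import math


def radial_property_counts(center, atoms, has_property, radius=10.0, shell_width=1.0):
    """Count atoms with a property in concentric shells around a site center.

    center       -- (x, y, z) of the site, in angstroms
    atoms        -- iterable of dicts with an "xyz" coordinate tuple (hypothetical layout)
    has_property -- predicate deciding whether an atom carries the property
    radius       -- neighborhood radius; 10 angstroms as in the text
    shell_width  -- shell thickness; 1 angstrom is an assumed value
    """
    n_shells = math.ceil(radius / shell_width)
    counts = [0] * n_shells
    for atom in atoms:
        distance = math.dist(center, atom["xyz"])
        if distance >= radius:
            continue  # outside the site neighborhood
        if has_property(atom):
            # Shell k covers distances in [k * shell_width, (k + 1) * shell_width).
            shell = min(int(distance // shell_width), n_shells - 1)
            counts[shell] += 1
    return counts


# Example: count oxygen atoms per shell around a site at the origin.
example_atoms = [
    {"element": "O", "xyz": (0.5, 0.0, 0.0)},
    {"element": "O", "xyz": (3.3, 4.4, 0.0)},
    {"element": "O", "xyz": (0.0, 0.0, 12.0)},  # beyond the 10-angstrom radius
    {"element": "C", "xyz": (1.2, 0.0, 0.0)},   # lacks the property
]
oxygen_counts = radial_property_counts(
    (0.0, 0.0, 0.0), example_atoms, lambda atom: atom["element"] == "O"
)
```

Raw counts are used here for simplicity. Outer shells enclose more volume than inner ones, so comparing each shell against the same shell in nonsites, rather than against a uniform expectation, is what keeps the comparison meaningful.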

