UMD CMSC 838T - Efficient detection of three-dimensional structural motifs

Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 10495-10499, December 1991
Biophysics

Efficient detection of three-dimensional structural motifs in biological macromolecules by computer vision techniques

(three-dimensional structural comparison/crystallographic coordinates/efficient computer vision algorithm/macromolecular structure analysis)

RUTH NUSSINOV*† AND HAIM J. WOLFSON‡§

*Sackler Institute of Molecular Medicine, Faculty of Medicine, and ‡Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel; †Laboratory of Mathematical Biology, National Cancer Institute, National Institutes of Health, Frederick Cancer Research Facility, Building 469, Room 151, Frederick, MD 21702; and §Robotics Research Laboratory, Courant Institute of Mathematical Sciences, New York University, 715 Broadway, 12th Floor, New York, NY 10003

Communicated by Jacob T. Schwartz, July 29, 1991 (received for review February 1990)

ABSTRACT Macromolecules carrying biological information often consist of independent modules containing recurring structural motifs. Detection of a specific structural motif within a protein (or DNA) aids in elucidating the role played by the protein (DNA element) and the mechanism of its operation. The number of crystallographically known structures at high resolution is increasing very rapidly. Yet, comparison of three-dimensional structures is a laborious time-consuming procedure that typically requires a manual phase. To date, there is no fast automated procedure for structural comparisons. We present an efficient O(n^3) worst-case time complexity algorithm for achieving such a goal (where n is the number of atoms in the examined structure). The method is truly three-dimensional, sequence-order-independent, and thus insensitive to gaps, insertions, or deletions. This algorithm is based on the geometric hashing paradigm, which was originally developed for object recognition problems in computer vision.
It introduces an indexing approach based on transformation-invariant representations and is especially geared toward efficient recognition of partial structures in rigid objects belonging to large data bases. This algorithm is suitable for quick scanning of structural data bases and will detect a recurring structural motif that is a priori unknown. The algorithm uses protein (or DNA) structures, atomic labels, and their three-dimensional coordinates. Additional information pertaining to the structure speeds the comparisons. The algorithm is straightforwardly parallelizable, and several versions of it for computer vision applications have been implemented on the massively parallel Connection Machine. A prototype version of the algorithm has been implemented and applied to the detection of substructures in proteins.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

One of the basic emerging principles in molecular biology is the modular nature of DNA sequence elements and of the corresponding sequence-specific protein factors recognizing them. The domains appear to be independent units (1). Structural and functional studies of these domains have demonstrated the existence of several structural motifs. The motifs include the helix-turn-helix (HTH) (2), zinc fingers (3), homeodomain (4), leucine zipper (5), helix-loop-helix (6), Ser-Pro-Lys-Lys histone (7), proline-rich (8) and glutamine-rich (9) motifs, the antiparallel β-sheet (10) apparently inserted in the minor groove, and more recently a pair of β-strands in the major groove of the DNA (11). All of these motifs typically include less than 100 amino acid residues. Finding a given structural motif in a protein may clearly aid in understanding its role (12). The latter is inferred by analogy with other proteins containing the motif. Structural comparisons are thus central to molecular biology. The problem we are faced with is to devise efficient techniques for routine scanning of structural data bases and searching for recurrences of inexact structural motifs. The degree of allowed errors is to be determined by the user.

The most commonly used computerized macromolecule comparison approaches deal mainly with comparison of the primary structure of molecules. They are based on character string comparison algorithms, most of which use variations of the dynamic programming technique (for a good survey, see ref. 13). Structural comparison is superior to this primary sequence analysis, since it takes into account the spatial geometric structure of the molecules involved and not only their order on the primary chain.

The increasing need for direct structural analysis of macromolecules has led to the development of several computerized methods (14-16). These methods, however, look for predefined motifs in the secondary structure of the macromolecule. Moreover, these motifs are usually composed of contiguous amino acids on the primary chain, such as α-helices or β-sheets. The method that we develop enables elucidating similar substructures in different molecules without specifying in advance what these structures should be. Moreover, the motifs do not necessarily involve contiguous amino acids, so the approach is truly three dimensional (3D). This enables detection of various structural patterns.

Currently, true 3D structural comparisons are carried out mainly using interactive computer graphics and visualization facilities. The programs compare the locations of every pair of corresponding atoms in any two specific structures. Although useful, this tool falls short of what is needed.
Since the computer graphic programs compare either two complete (crystal or computed) structures or any user-specified subsections, they are excellent for individual protein or nucleic acid analysis but are very time consuming for extensive comparisons.

From a mathematical standpoint, the structural comparison problem between two molecules can be formulated as follows. Given the 3D coordinates of
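
The geometric hashing paradigm named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes no attempt to reach the O(n^3) bound quoted in the abstract. It assumes that every ordered triplet of non-collinear atoms serves as a reference frame, that frame-relative coordinates are binned on a cubic grid of side `cell`, and that atom labels are ignored. All function names (`frame`, `invariant_keys`, `build_table`, `recognize`) and the toy coordinates are hypothetical.

```python
import itertools
import math
from collections import defaultdict

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def frame(p0, p1, p2, eps=1e-9):
    """Right-handed orthonormal frame anchored at p0, or None if the
    three points are (nearly) collinear."""
    u = sub(p1, p0)
    w = cross(u, sub(p2, p0))
    if dot(w, w) < eps:
        return None
    ex, ez = unit(u), unit(w)
    return p0, ex, cross(ez, ex), ez

def invariant_keys(points, basis, cell):
    """Binned coordinates of all non-basis points in the frame of `basis`.
    Rotating or translating `points` leaves these keys unchanged, up to
    floating-point rounding at bin boundaries."""
    f = frame(*(points[i] for i in basis))
    if f is None:
        return set()
    origin, *axes = f
    return {tuple(round(dot(sub(p, origin), e) / cell) for e in axes)
            for m, p in enumerate(points) if m not in basis}

def build_table(models, cell=1.0):
    """Preprocessing: index every (model, basis) pair under its keys."""
    table = defaultdict(set)
    for name, points in models.items():
        for basis in itertools.permutations(range(len(points)), 3):
            for key in invariant_keys(points, basis, cell):
                table[key].add((name, basis))
    return table

def recognize(table, target, cell=1.0):
    """Recognition: each target basis votes for the (model, basis) pairs
    that share its keys. Returns (votes, model, model basis, target basis)
    for the best-supported pairing found."""
    best = (0, None, None, None)
    for tbasis in itertools.permutations(range(len(target)), 3):
        votes = defaultdict(int)
        for key in invariant_keys(target, tbasis, cell):
            for entry in table.get(key, ()):
                votes[entry] += 1
        for (name, mbasis), count in votes.items():
            if count > best[0]:
                best = (count, name, mbasis, tbasis)
    return best

# Toy data (made up): a six-point motif, and a target that contains the
# motif rotated 90 degrees about z, shifted, reordered, plus two extra points.
motif = [(0, 0, 0), (4, 0, 0), (0, 3, 0), (1, 1, 2), (3, 2, -1), (-2, 1, 1)]
moved = [(-y + 10, x - 5, z + 7) for x, y, z in motif]
target = [moved[4], (20, 20, 20), moved[1], moved[5],
          moved[0], (-7, 2, 9), moved[3], moved[2]]

table = build_table({"motif": motif})
print(recognize(table, target))
```

Because the votes depend only on frame-relative coordinates, the order in which atoms are listed plays no role, which is the sense in which the abstract calls the method sequence-order-independent. A practical version would also have to tolerate coordinate error near bin boundaries, which this sketch does not address.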
