Protein Structure Alignment

Human Myoglobin pdb:2mm1
Human Hemoglobin alpha-chain pdb:1jebA
Sequence id: 27%
Structural id: 90%

Another example:
G-Proteins: 1c1y:A, 1kk1:A6-200
Sequence id: 18%
Structural id: 72%

Transformations

• Translation: $\vec{x}' = \vec{x} + \vec{t}$
• Translation and Rotation, Rigid Motion (Euclidean Trans.): $\vec{x}' = R\vec{x} + \vec{t}$
• Translation, Rotation + Scaling: $\vec{x}' = s(R\vec{x} + \vec{t})$

Inexact Alignment

Simple case: two closely related proteins with the same number of amino acids.
[Figure: two structures related by a transformation T]
Question: how to measure an alignment error?

Distance Functions

Two point sets: $A = \{a_i\},\ i = 1 \dots n$; $B = \{b_j\},\ j = 1 \dots m$
• Pairwise Correspondence: $(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$
(1) Exact Matching: $\|a_{k_i} - b_{t_i}\| = 0$
(2) Bottleneck: $\max_i \|a_{k_i} - b_{t_i}\|$
(3) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N}$

Superposition - best least squares (RMSD: Root Mean Square Deviation)

Given two sets of 3-D points: $P = \{p_i\}$, $Q = \{q_i\}$, $i = 1, \dots, n$;
$\mathrm{rmsd}(P, Q) = \sqrt{\sum_i |p_i - q_i|^2 / n}$
Find a 3-D rigid transformation $T^*$ such that:
$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\sum_i |T(p_i) - q_i|^2 / n}$
A closed form solution exists for this task. It can be computed in $O(n)$ time.

Correspondence is Unknown

Given two configurations of points in the three dimensional space, find those rotations and translations of one of the point sets which produce "large" superimpositions of corresponding 3-D points.
[Figure: two point configurations related by a transformation T]

A 3-D reference frame can be uniquely defined by the ordered vertices of a non-degenerate triangle

[Figure: triangle with ordered vertices p1, p2, p3]

Sequence Based Structure Alignment

• Run pairwise sequence alignment.
• Based on sequence correspondence
compute 3D transformation (least-squares fit can be applied).
• Iteratively improve structural superposition.
Not a good approach: sequence alignment can be incorrect.

Structure Alignment (Straightforward Algorithm)

• For each pair of triplets, one from each molecule, which define 'almost' congruent triangles, compute the rigid transformation that superimposes them.
• Count the number of aligned point pairs and sort the hypotheses by this number.
• For the highest ranking hypotheses, improve the transformation by replacing it by the best RMSD transformation for all the matching pairs.
• Complexity: $O(n^3 m^3) \cdot O(nm)$. Applying a 3D grid gives practically $O(n^3 m^3) \cdot O(n)$.
• If one exploits protein backbone geometry + 3D grid: $O(nm) \cdot O(n)$

Structural Alignment Approaches

Two interrelated problems: 3D transformation and point correspondence (matching, alignment)

Some methods:

1. Generate a set of 3D transformations.
2. Compute 3D alignment for each transformation.

1. Generate a set of 3D transformations.
2. Cluster similar transformations.
3. Compute 3D alignment for each cluster representative.

Geometric Hashing: Combines transformation and correspondence detection in one scheme.

Accuracy improvement during detection of 3D transformation.

Instead of 3 points use more. How many?
Align any possible pair of fragments: $F_{ij}(k)$
[Figure: fragment pair $F_{ij}(k)$ spanning positions i to i+k-1 in one chain and j to j+k-1 in the other]
Accept $F_{ij}(k)$ if $\mathrm{rmsd}(F_{ij}(k))$ <
Complexity: $O(n^3 n) \cdot O(n)$ (assume n~m). (For each $F_{ij}(k)$ we need to compute its rmsd.) This can be reduced to $O(n^3) \cdot O(n)$.

Improvement: BLAST idea - detect short similar fragments, then extend as much as possible.
[Figure: seed pair ($a_i$, $b_j$) with neighbors $a_{i-1}$, $a_{i+1}$ and $b_{j-1}$, $b_{j+1}$; fragments of length l running from k to k+l-1 and from t to t+l-1]
Complexity: $O(n^2) \cdot O(n)$
Extend while: $\mathrm{rmsd}(F_{ij}(k))$ <

Sequence-order Independent Alignment

[Figure: structures P and Q]

4-helix bundle

2cbl:A, 1f4n:A, 1b3q, 1rhg:A

Sequence Order Independent Alignment

[Figure: sequence-order-independent alignment of 2cbl:A, 1f4n, 1rhg:A, and 1b3q (chains A and B), annotated with residue numbers]

E. A. NALEFSKI and J. J.
FALKE, The C2 domain calcium-binding motif: Structural and functional diversity, Protein Sci 1996 5: 2375-2390

The C2 domain calcium-binding motif

TRAF-Immunoglobulin Ensemble

[Figure legend: helices; strands]
Ensemble: 8 proteins from 2 folds.
Core: sandwich of 6 strands
Runtime: 21 seconds
E - strand

• Rasmol: Molecular Visualization
• SCOP: Structural Classification of Proteins
• MultiProt: Protein Structural (pairwise/multiple) Alignment
• MASS: Secondary Structure Based (pairwise/multiple) Alignment

Some
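
The superposition slide states that a closed-form solution for the best least-squares rigid transformation exists and runs in O(n) time, but does not say which one. One well-known closed form is the SVD-based (Kabsch) method; the sketch below assumes that method and NumPy, and the function names `rmsd` and `superpose` are illustrative, not from the slides. It also assumes the two point sets are already paired index by index.

```python
import numpy as np

def rmsd(P, Q):
    """Root mean square deviation between paired (n, 3) point arrays."""
    diff = np.asarray(P, dtype=float) - np.asarray(Q, dtype=float)
    return float(np.sqrt((diff ** 2).sum() / len(diff)))

def superpose(P, Q):
    """Return (R, t) minimising rmsd(P @ R.T + t, Q) over rigid motions.

    Assumed method: SVD of the 3x3 covariance matrix (Kabsch).
    Building the covariance is O(n); the 3x3 SVD is constant time.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # Covariance of the centred point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t
```

Given paired coordinate arrays `P` and `Q`, calling `R, t = superpose(P, Q)` and then `rmsd(P @ R.T + t, Q)` gives the minimised RMSD from the slide's definition.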
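
The slides note that an ordered, non-degenerate triangle defines a 3-D reference frame, and the straightforward algorithm uses a pair of nearly congruent triangles to propose a rigid transformation and then counts aligned point pairs. A minimal sketch of one hypothesis step follows, assuming NumPy; the names `frame`, `triangle_transform`, and `count_matches` and the distance threshold `eps` are hypothetical, and the match count uses a brute-force O(nm) scan rather than the 3D grid mentioned on the slide.

```python
import numpy as np

def frame(p1, p2, p3):
    """Origin and right-handed orthonormal axes (as columns) of an ordered triangle."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    e1 = p2 - p1
    normal = np.cross(e1, p3 - p1)
    if np.linalg.norm(normal) < 1e-9:
        raise ValueError("degenerate (collinear) triangle")
    e1 = e1 / np.linalg.norm(e1)
    e3 = normal / np.linalg.norm(normal)
    e2 = np.cross(e3, e1)
    return p1, np.column_stack([e1, e2, e3])

def triangle_transform(tri_a, tri_b):
    """Rigid motion (R, t) carrying the frame of tri_a onto the frame of tri_b."""
    origin_a, axes_a = frame(*tri_a)
    origin_b, axes_b = frame(*tri_b)
    R = axes_b @ axes_a.T
    t = origin_b - R @ origin_a
    return R, t

def count_matches(A, B, R, t, eps=1.0):
    """Count points of A that land within eps of some point of B after (R, t).

    eps is a hypothetical threshold; this is the brute-force O(nm) check.
    """
    moved = np.asarray(A, dtype=float) @ R.T + t
    B = np.asarray(B, dtype=float)
    dists = np.linalg.norm(moved[:, None, :] - B[None, :, :], axis=2)
    return int((dists.min(axis=1) < eps).sum())
```

In a full search, each candidate triangle pair would yield one `(R, t)` hypothesis, the hypotheses would be ranked by `count_matches`, and the best ones refined with a least-squares fit over the matched pairs.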