CORNELL CS 726 - Identification of Domains using Structural Data - D2390291

Course: CS 726 - Problems and perspective in computational molecular biology

Pages 10


Identification of Domains using Structural Data
Niranjan Nagarajan
Department of Computer Science
Cornell University

Assorted Definitions of Domains
• Subsequences that can fold independently into a stable structure.
• Structurally compact substructures.
• Functionally well-defined building blocks.
• Evolutionarily conserved and reused fragments.

Protein Structural Domain Identification
William R. Taylor

Basic Algorithm
• Initial Assignment of Labels
  – Sequential residue numbering
• Update of Labels
• Termination Condition
  – Mean squared deviation of average between successive cycles < 10^-6 or number of iterations > (length of protein)/2

Update Formula
• S_i^{t+1} = S_i^t + step(t+1) * sign(Σ_j f(S_i^t, S_j^t)) for all i.
• sign(x) = 1 if x > 0, -1 if x < 0, 0 if x = 0.
• f(S_i^t, S_j^t) =
  – r/d_ij if S_j^t > S_i^t and d_ij < r.
  – -r/d_ij if S_j^t < S_i^t and d_ij < r.
  – 0 otherwise.
• step(x) =
  – 1 if x < N/2.
  – 2(N-x)/N if N/2 <= x < N.
  – 0 otherwise.

Example
• Full lines indicate the protein backbone.
• Neighboring residues within radius r are connected by dashed lines.
• Connections between i and i + 2 have been omitted for clarity.
• Label evolution is done without inverse distance weighting.

Refinements
• Median-based smoothing with a window size of 21 to reclaim short loops of 10 or fewer residues.
• Small domains are reassigned using the weighted mean values of their neighbors (weights are given by f).
• Domain recalculation is repeated at most five times.

Preserving β-sheets
• Matrix B of possible β-sheet interactions between residues is generated from distance data and heuristics.
• A weighted mean heuristic is used to generate the initial assignment of labels, with the averaging iterated to convergence.
• Post-processing is also applied to badly broken β-sheets.

Self-testing with fake homologs
• Fake homologs generated by smoothing
  – Replacing the central atom of a triple by the average.
  – Process repeated five times.
• Domain assignments are compared and similarity is evaluated based on an overlap score.
• r is optimized for the best overlap score.

Extension to Multiple Structures
• Algorithm is run simultaneously on structures corresponding to a multiple sequence alignment.
• Labels are synchronized to the average of the labels at a position after each
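The Basic Algorithm and Update Formula slides can be illustrated with a minimal Python sketch. It is not the presenter's or Taylor's implementation. Several details are assumptions made only for illustration: the function and parameter names are hypothetical, N in step(x) is passed in as `schedule_n` because the slides do not define it, all labels are updated synchronously, and the termination test is read as "mean squared change in labels between successive cycles".

```python
import math


def step(x, n):
    """Step-size schedule from the slides: 1, then a linear ramp down to 0."""
    if x < n / 2:
        return 1.0
    if x < n:
        return 2.0 * (n - x) / n
    return 0.0


def sign(x):
    """Return 1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def pull(s_i, s_j, d_ij, r):
    """f(S_i, S_j): inverse-distance-weighted pull toward neighbor j's label."""
    if d_ij >= r or d_ij == 0.0:
        return 0.0
    if s_j > s_i:
        return r / d_ij
    if s_j < s_i:
        return -r / d_ij
    return 0.0


def evolve_labels(coords, r, schedule_n, max_iter=None, tol=1e-6):
    """Evolve residue labels until they stop changing or max_iter is reached.

    coords: one (x, y, z) tuple per residue (assumed input format).
    schedule_n: the N of step(x); its meaning is an assumption here.
    """
    n_res = len(coords)
    if max_iter is None:
        max_iter = n_res // 2  # slides: stop after (length of protein)/2 cycles
    labels = [float(i + 1) for i in range(n_res)]  # sequential numbering
    dist = [[math.dist(a, b) for b in coords] for a in coords]
    for t in range(max_iter):
        updated = []
        for i in range(n_res):
            total = sum(
                pull(labels[i], labels[j], dist[i][j], r)
                for j in range(n_res)
                if j != i
            )
            updated.append(labels[i] + step(t + 1, schedule_n) * sign(total))
        # Assumed reading of the termination condition on the slides.
        msd = sum((a - b) ** 2 for a, b in zip(updated, labels)) / n_res
        labels = updated
        if msd < tol:
            break
    return labels


# Two well-separated clusters of three residues each.
toy = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (100, 0, 0), (101, 0, 0), (102, 0, 0)]
print(evolve_labels(toy, r=5.0, schedule_n=100))
# prints [2.0, 2.0, 2.0, 5.0, 5.0, 5.0]
```

In this toy input each cluster collapses to a single shared label after one cycle, which is the behavior the label evolution relies on: residues in the same compact region end up with the same label, and label boundaries mark candidate domain boundaries.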
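The smoothing step from the Self-testing with fake homologs slide can be sketched in the same way. This is a hedged illustration, not the original code: the function names are hypothetical, "average" is assumed to mean the mean position of the three atoms in the triple, the two end atoms are assumed to stay fixed, and each round is assumed to read only the coordinates from the previous round. The overlap score is not defined in the slides, so it is not sketched here.

```python
def smooth_once(coords):
    """Replace each interior atom by the mean of itself and its two neighbors."""
    smoothed = list(coords)  # end atoms are left unchanged (assumption)
    for i in range(1, len(coords) - 1):
        triple = coords[i - 1:i + 2]
        smoothed[i] = tuple(sum(axis) / 3.0 for axis in zip(*triple))
    return smoothed


def fake_homolog(coords, rounds=5):
    """Apply the smoothing repeatedly; the slides state five rounds."""
    for _ in range(rounds):
        coords = smooth_once(coords)
    return coords


# A single kinked triple: the middle atom moves toward its neighbors.
print(smooth_once([(0, 0, 0), (3, 3, 0), (6, 0, 0)])[1])
# prints (3.0, 1.0, 0.0)
```

Running the domain assignment on both the original coordinates and the output of `fake_homolog` gives the two assignments that the slide describes comparing.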
