Chapter 9: Structure Prediction

Slide outline:
- Motivation
- First Step
- Artificial Neural Network
- Danger
- Profile network from Heidelberg
- PHD
- Threading
- 3D-1D Matching (Bowie et al.)
- 3D-1D
- Methods using 3D interactions
- 3D interactions
- Potentials of mean force (POMF)
- Multiple Sequence Threading
- Example
- Sequence-Structure Alignment
- Evaluating Methods

Motivation
- Given a protein, can you predict its molecular structure?
- Want to avoid repeated x-ray crystallography, but want accuracy
- You could use nucleotide alignment, but what do you do with the gapped regions?
- More complex methods are only justified if they can be shown to perform better than simpler methods
- Simpler methods are only justified if they can perform better than basic sequence alignment

First Step
- Some structure comparison methods use secondary structures of the new sequence
- Predict location of secondary structure elements along the protein’s backbone and the degree of residue burial
- Supervised learning has been shown to perform well in this task

Artificial Neural Network
[Figure: artificial neural network; label: "Predicts structure at this point"]

Danger
- You may train the network on your training set, but it may not generalize to other data
- Perhaps we should train several ANNs and then let them vote on the structure
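
The voting idea on the Danger slide can be sketched in a few lines. This is only an illustration, not the procedure of any particular predictor: the function name `jury_vote`, the three-state alphabet (H = helix, E = strand, C = coil), the tie-breaking rule, and the example predictions are all assumptions introduced here.

```python
from collections import Counter

def jury_vote(predictions):
    """Combine per-residue predictions from several networks by majority vote.

    predictions: list of equal-length strings, one per network, over an
    assumed three-state alphabet (H = helix, E = strand, C = coil).
    Ties go to the label seen first at that position (network order).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        # Counter preserves insertion order, so the first-seen label wins ties
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)

# Hypothetical outputs of three independently trained networks
votes = ["HHHECC", "HHCEEC", "CHHEEC"]
print(jury_vote(votes))  # prints HHHEEC
```

Each network can be wrong at different positions; the vote keeps a label only where most networks agree, which is the motivation for training several networks rather than trusting one.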

Profile network from Heidelberg
- Uses the protein family (the multiple alignment is used as input) instead of just the new sequence
- On the first level, a window of length 13 around the residue is used
- The window slides down the sequence, making a prediction for each residue
- The input includes the frequency of amino acids occurring in each position in the multiple alignment (in the example, there are 5 sequences in the multiple alignment)
- The second level takes these predictions from neural networks that are centered on neighboring residues
- The third level does a jury selection

PHD
[Figure: PHD network; labels: "Predicts 4", "Predicts 6", "Predicts 5"]

Threading
- Threading matches structure to sequence
- True threading considers 3D spatial interactions

3D-1D Matching (Bowie et al.)
- Convert 3D structure into a string
- Include α-helix, β-sheet or neither
- Include buried or solvent accessible (6 levels)
- Total of 3 × 6 = 18 distinct states
- With $P_{a:j}$ = probability of finding amino acid a in environment j, and $P_a$ = probability of finding a anywhere
- The score for amino acid a in environment j is $s_{aj} = \log\left(\frac{P_{a:j}}{P_a}\right)$

3D-1D
- The information value scores were calculated on a training set of multiple alignments, and the scores were used as a profile for each column
- When applied to the globin family, the method clearly distinguished myoglobins from nonglobins, but not from other globins

Methods using 3D interactions
- Residues that have large separation in the sequence may end up next to each other when the protein is folded
- Define a measure of contact between residues (two atoms within 5 Å) and count frequency of contact between all pairs in PDB
- Use measure in alignment to evaluate cost, or to select the best alignment

3D interactions
[Figure not recoverable]

Potentials of mean force (POMF)
- Since the notion of contact is somewhat arbitrary, a more general formulation can be tried
- Derive an empirical function for the propensity of each of the 400 pairs of residues to be any given distance apart

Multiple Sequence Threading
- Multiple Sequence Alignment
  - Align the most similar to create a consensus sequence
  - Align consensus sequences to create overall alignment
- Use the same strategy with structures
- Assume that conserved hydrophobic positions should pack in the core
- This appears to be work in progress (1997)
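
The 3D-1D score from the Bowie et al. slides, the log of the probability of amino acid a in environment j divided by its overall probability, can be estimated from counts. The sketch below is an illustration under stated assumptions, not the published implementation: the function names `score_table` and `thread_score`, the pseudocount, the gapless placement, and the toy training pairs with two environment labels (standing in for the 18 real classes) are all invented here.

```python
import math
from collections import Counter

def score_table(pairs, pseudocount=1.0):
    """Estimate s[a][j] = log(P(a:j) / P(a)) from (amino_acid, environment) pairs.

    P(a:j) is the frequency of amino acid a among residues in environment j;
    P(a) is its overall frequency. The pseudocount (an assumption of this
    sketch) avoids log(0) for combinations never seen in training.
    """
    acids = sorted({a for a, _ in pairs})
    envs = sorted({j for _, j in pairs})
    pair_counts = Counter(pairs)
    acid_counts = Counter(a for a, _ in pairs)
    env_counts = Counter(j for _, j in pairs)
    total = len(pairs)
    k = len(acids)

    table = {}
    for a in acids:
        p_a = (acid_counts[a] + pseudocount) / (total + pseudocount * k)
        table[a] = {}
        for j in envs:
            p_aj = (pair_counts[(a, j)] + pseudocount) / (env_counts[j] + pseudocount * k)
            table[a][j] = math.log(p_aj / p_a)
    return table

def thread_score(sequence, environments, table):
    """Sum the scores of a gapless placement of a sequence on an environment string."""
    if len(sequence) != len(environments):
        raise ValueError("sequence and environment string must have equal length")
    return sum(table[a][j] for a, j in zip(sequence, environments))

# Toy training data: B = buried, E = exposed (hypothetical labels)
training = [("L", "B"), ("L", "B"), ("L", "E"), ("K", "E"), ("K", "E"), ("K", "B")]
table = score_table(training)

# Leucine buried and lysine exposed should beat the reverse placement
print(thread_score("LK", "BE", table) > thread_score("LK", "EB", table))  # prints True
```

A positive score means the amino acid is seen in that environment more often than expected from its overall frequency; a negative score means it is seen less often.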
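
The contact measure from the slide on methods using 3D interactions (two atoms within 5 Å) can be sketched as a counting routine. This is a minimal illustration, not a PDB parser: the data layout (a list of residue type and atom coordinates), the function names, the `min_separation` filter, and the toy coordinates are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations

CONTACT_CUTOFF = 5.0  # Angstroms, the threshold quoted on the slide

def in_contact(atoms_a, atoms_b, cutoff=CONTACT_CUTOFF):
    """Two residues are in contact if any pair of their atoms lies within the cutoff."""
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)

def count_contacts(residues, min_separation=3):
    """Count contacts per unordered residue-type pair in one structure.

    residues: list of (residue_type, atom_coordinates) in chain order.
    min_separation (an assumption of this sketch) skips residues that are
    neighbors in sequence and therefore trivially close in space.
    """
    counts = Counter()
    for (i, (type_i, atoms_i)), (j, (type_j, atoms_j)) in combinations(enumerate(residues), 2):
        if j - i < min_separation:
            continue
        if in_contact(atoms_i, atoms_j):
            counts[tuple(sorted((type_i, type_j)))] += 1
    return counts

# Toy chain with one atom per residue (hypothetical coordinates)
chain = [
    ("LEU", [(0.0, 0.0, 0.0)]),
    ("GLY", [(3.8, 0.0, 0.0)]),
    ("GLY", [(7.6, 0.0, 0.0)]),
    ("SER", [(7.6, 3.8, 0.0)]),
    ("VAL", [(3.0, 3.0, 0.0)]),
]
print(count_contacts(chain))
```

Summing such counts over many structures gives the pair contact frequencies that the slides propose to use when evaluating an alignment; replacing the yes/no contact test with a histogram over distances leads toward the potentials of mean force idea.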