Outline
1. Computational Biology, Part 10 Predicting from Protein Sequence
2. Starting Point
3. Slide 3
4. Use of amino acid properties in prediction schemes
5. Hydro-pathy/phobicity/philicity
6. Hydro-pathy/phobicity/philicity Analysis
7. Hydrophobicity/Hydrophilicity Tables
8. Kyte-Doolittle hydropathy
9. Basic Hydropathy/Hydrophilicity Plot
10. Example Hydrophilicity Plot
11. Amphiphilicity/Amphipathicity
12. Slide 12
13. Helical Wheel for Prion Protein
14. Hydrophobic Moment
15. PowerPoint Presentation
16. Prediction Methods
17. Machine Learning 102
18. Slide 18
19. Slide 19
20. Example Confusion Matrix for Three Classes
21. Example Confusion Matrix for Three Classes with “Unknown” allowed as a Prediction
22. Goal
23. Structural Propensities
24. Slide 24
25. Secondary structure prediction
26. Chou-Fasman method
27. Chou-Fasman propensities (partial table)
28. Slide 28
29. Slide 29
30. Accuracy of Chou-Fasman predictions
31. Confusion matrix for Chou-Fasman method on 78 proteins
32. Garnier-Osguthorpe-Robson
33. Slide 33
34. Confusion matrix for GOR method on 78 proteins
35. Accuracy of predictions
36. Slide 36
37. Neural Networks
38. Neural Network methods
39. Slide 39
40. Homology-based modeling
41. Slide 41

Slide 1: Computational Biology, Part 10: Predicting from Protein Sequence
Robert F. Murphy
Copyright 1996, 1999-2008. All rights reserved.

Slide 2: Starting Point
- Broad Goal: To determine or predict as much as we can from a “new” protein sequence
- Have covered how to find protein motifs such as targets for post-translational modification (Profiles/PSSMs/HMMs)
- Have covered how to find homologous proteins - we will need them perhaps to predict something from their properties

Slide 3: Starting Point
- Some properties or “propensities” can be directly calculated from individual amino acids
- These properties are useful in themselves and may also be used in place of the original sequence for some prediction methods (or in addition to sequence)

Slide 4: Use of amino acid properties in prediction schemes
[Diagram] Sequence + Other inputs -> Prediction function -> Prediction
[Diagram] Sequence + Other inputs -> Propensity function -> Vector of propensities

Slide 5: Hydro-pathy/phobicity/philicity
- One of the most commonly used properties is the suitability of an amino acid for an aqueous environment
- Hydropathy & Hydrophobicity
  - degree to which something is “water hating” or “water fearing”
- Hydrophilicity
  - degree to which something is “water loving”

Slide 6: Hydro-pathy/phobicity/philicity Analysis
- Goal: Obtain quantitative descriptions of the degree to which regions of a protein are likely to be exposed to aqueous solvents
- Starting point: Tables of propensities of each amino acid

Slide 7: Hydrophobicity/Hydrophilicity Tables
- Describe the likelihood that each amino acid will be found in an aqueous environment - one value for each amino acid
- Commonly used tables
  - Kyte-Doolittle hydropathy
  - Hopp-Woods hydrophilicity
  - Eisenberg et al. normalized consensus hydrophobicity

Slide 8: Kyte-Doolittle hydropathy

Amino acid   Index     Amino acid   Index
R            -4.5      S            -0.8
K            -3.9      T            -0.7
D            -3.5      G            -0.4
Q            -3.5      A             1.8
N            -3.5      M             1.9
E            -3.5      C             2.5
H            -3.2      F             2.8
P            -1.6      L             3.8
Y            -1.3      V             4.2
W            -0.9      I             4.5

Slide 9: Basic Hydropathy/Hydrophilicity Plot
- Calculate average hydropathy over a window (e.g., 7 amino acids) and slide window until entire sequence has been analyzed
- Plot average for each window versus position of window in sequence

Slide 10: Example Hydrophilicity Plot
This plot is for a tubulin, a soluble cytoplasmic protein.
Regions with high hydrophilicity are likely to be exposed to the solvent (cytoplasm), while those with low hydrophilicity are likely to be internal or interacting with other proteins.

Slide 11: Amphiphilicity/Amphipathicity
- A structural domain of a protein (e.g., an α-helix) can be present at an interface between polar and non-polar environments
  - Example: Domain of a membrane-associated protein that anchors it to membrane
- Such a domain will ideally be hydrophilic on one side and hydrophobic on the other
- This is termed an amphiphilic or amphipathic sequence or domain

Slide 12: Amphiphilicity/Amphipathicity
- To find such sequences, we look for regions where short stretches of charged residues alternate with short stretches of hydrophobic residues with a repeat distance corresponding to the period of the structure
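
Worked sketch (not part of the original slides): the sliding-window average from the Basic Hydropathy/Hydrophilicity Plot slide, and one possible way to score the periodic alternation described above, can be written in a few lines of Python. The names KD, hydropathy_profile and hydrophobic_moment are hypothetical. The periodicity score is the hydrophobic moment, which the outline lists but this part of the deck does not define; the formula used here (magnitude of the vector sum of per-residue values, each rotated by a fixed angle) and the 100 degrees per residue for an α-helix are assumptions. The Kyte-Doolittle values are also swapped in for whatever scale the later slides may use.

```python
import math

# Kyte-Doolittle hydropathy index, copied from the table on the
# Kyte-Doolittle hydropathy slide.
KD = {
    "R": -4.5, "K": -3.9, "D": -3.5, "Q": -3.5, "N": -3.5,
    "E": -3.5, "H": -3.2, "P": -1.6, "Y": -1.3, "W": -0.9,
    "S": -0.8, "T": -0.7, "G": -0.4, "A": 1.8, "M": 1.9,
    "C": 2.5, "F": 2.8, "L": 3.8, "V": 4.2, "I": 4.5,
}


def hydropathy_profile(seq, window=7):
    """Return the mean hydropathy of every full window in seq.

    Element i is the average over residues i .. i+window-1, so a
    sequence shorter than the window yields an empty list.
    """
    values = [KD[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def hydrophobic_moment(seq, angle_deg=100.0):
    """Return the magnitude of the rotated vector sum of hydropathy values.

    ASSUMPTION: residue n points in direction n * angle_deg around the
    axis of the structure (100 degrees per residue approximates an
    alpha-helix). A large value means hydrophobic and hydrophilic
    residues fall on opposite sides.
    """
    delta = math.radians(angle_deg)
    scores = [KD[aa] for aa in seq.upper()]
    x = sum(h * math.cos(n * delta) for n, h in enumerate(scores))
    y = sum(h * math.sin(n * delta) for n, h in enumerate(scores))
    return math.hypot(x, y)


if __name__ == "__main__":
    # Hypothetical peptide, chosen only to exercise the functions.
    peptide = "LKKLLKKLLKKLLKKL"
    for start, mean in enumerate(hydropathy_profile(peptide, window=7)):
        print(start, round(mean, 2))
    print("moment:", round(hydrophobic_moment(peptide), 2))
```

To scan a whole protein for candidate amphipathic stretches, hydrophobic_moment could be applied to each window in the same way hydropathy_profile averages each window; the window length and any cutoff would have to be chosen by the user.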