© Doug Brutlag, 1999

Bioinformatics: Function from Sequence
(http://cmgm.stanford.edu/biochem201/Handouts/june2.html)
Doug Brutlag
Departments of Biochemistry
June 2, 1999

Learning Function from Sequence

Bioinformatics, Genomics and Biomedical Research

Bioinformatics, Genomics
Identify Drug Targets, Rational Drug Design, Molecular Diagnostics, Molecular Epidemiology, Genetic Therapy
Databases, Machine Learning, Robotics, Statistics & Probability, Artificial Intelligence, Information Theory, Graph Theory, Algorithms

Central Paradigm of Molecular Biology

• Molecules
• Structure
• Function
• Processes
• Mechanism
• Specificity
• Regulation

DNA → RNA → Protein → Phenotype (Symptoms)

Central Paradigm of Bioinformatics

Genetic Information → Molecular Structure → Biochemical Function → Symptoms (Phenotype)

SRAAINKHIVA
VSYQTVSRVVN
VSTATVSRALA
GVTTTVSHVIN
SGVSAVSAILN
GVSEMTRRDLN
TAYATIHVRVE
GSQPTVSRELA
MSIATITRGSN
ISRETVGRILK
FDISRLSHLFR
LRPSRLAHLFR
MTVETISRLLG
TLEFHLHRLFK

Discovering Function from Protein Sequence

Consensus Sequences
Zinc Finger (C2H2 type)
Cx{2,4}Cx{12}Hx{3,5}H

Sequence Alignments

          10        20        30        40        50
1 VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF------DLSHGS
   |:| :|: | |:||||     | |:||| |: : :|:|  :|  |      |:  |
2 HLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGN
          10        20        30        40        50

Sequences of Common Structure or Function

Profiles, PSI-BLAST
Hidden Markov Models

[Figure: hidden Markov model with states AA1 to AA6, I1 to I5, D2 to D5]

Position   1   2   3   4   5   6   7   8   9  10  11  12
A          2   1   3  13  10  12  67   4  13   9   1   2
R          7   5   8   9   4   0   1  16   7   0   1   0
N          0   8   0   1   0   0   0   2   1   1  10   0
D          0   1   0   1  13   0   0  12   1   0   4   0
C          0   0   1   0   0   0   0   0   0   2   2   1
Q          1   1  21   8  10   0   0   7   6   0   0   2
E          2   0   0   9  21   0   0  15   7   3   3   0
G          9   7   1   4   0   0   8   0   0   0  46   0
H          4   3   1   1   2   0   0   2   2   0   5   0
I         10   0  11   1   2  10   0   4   9   3   0  16
L         16   1  17   0   1  31   0   3  11  24   0  14
K          3   4   5  10  11   1   1  13  10   0   5   2
M          7   1   1   0   0   0   0   0   5   7   1   8
F          4   0   3   0   0   4   0   0   0  10   0   0
P          0   6   0   1   0   0   0   0   0   0   0   0
S          1  17   0   8   3   1   3   0   2   2   2   0
T          5  22   3  11   1   5   0   2   2   2   0   5
W          2   0   0   0   0   0   0   0   0   1   0   1
Y          1   0   4   2   0   1   0   0   2   4   0   1
V          6   3   1   1   2  15   0   0   2  12   0  28

BLOCK,
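The C2H2 zinc finger consensus on the slide, Cx{2,4}Cx{12}Hx{3,5}H, reads almost directly as a regular expression if "x" is taken to mean "any residue". The following is a minimal sketch under that assumption, not the tool used in the lecture. The function names and the example sequence are hypothetical, made up only to illustrate a scan.

```python
import re

# Consensus pattern exactly as written on the slide.
ZINC_FINGER_C2H2 = "Cx{2,4}Cx{12}Hx{3,5}H"


def shorthand_to_regex(pattern):
    """Assumption: lowercase 'x' means any residue, so it becomes '.'."""
    return pattern.replace("x", ".")


def find_motifs(sequence, pattern=ZINC_FINGER_C2H2):
    """Return (start, matched_text) for each start position that matches.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are reported as well.
    """
    regex = re.compile("(?=(" + shorthand_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical sequence constructed for illustration, not taken from the slides.
EXAMPLE = "MKPYKCPECGKSFSQKSNLQKHQRTHTGEK"

if __name__ == "__main__":
    for start, text in find_motifs(EXAMPLE):
        print(start, text)
```

A consensus pattern like this either matches or it does not, so a single substitution at a conserved C or H loses the hit entirely. That is one reason the later slides move on to alignments, profiles and hidden Markov models.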
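The position-by-residue count table above (a profile) can be turned into a consensus sequence and a simple score for any 12-residue window. The sketch below copies the counts from the slide. The scoring scheme is an assumption chosen for illustration (pseudocount of 1, uniform background frequency of 0.05) and is not necessarily what the lecture or PSI-BLAST uses. All function names are hypothetical.

```python
import math

# Counts copied from the slide: one row per residue, one column per position 1..12.
COUNTS = {
    "A": [2, 1, 3, 13, 10, 12, 67, 4, 13, 9, 1, 2],
    "R": [7, 5, 8, 9, 4, 0, 1, 16, 7, 0, 1, 0],
    "N": [0, 8, 0, 1, 0, 0, 0, 2, 1, 1, 10, 0],
    "D": [0, 1, 0, 1, 13, 0, 0, 12, 1, 0, 4, 0],
    "C": [0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 2, 1],
    "Q": [1, 1, 21, 8, 10, 0, 0, 7, 6, 0, 0, 2],
    "E": [2, 0, 0, 9, 21, 0, 0, 15, 7, 3, 3, 0],
    "G": [9, 7, 1, 4, 0, 0, 8, 0, 0, 0, 46, 0],
    "H": [4, 3, 1, 1, 2, 0, 0, 2, 2, 0, 5, 0],
    "I": [10, 0, 11, 1, 2, 10, 0, 4, 9, 3, 0, 16],
    "L": [16, 1, 17, 0, 1, 31, 0, 3, 11, 24, 0, 14],
    "K": [3, 4, 5, 10, 11, 1, 1, 13, 10, 0, 5, 2],
    "M": [7, 1, 1, 0, 0, 0, 0, 0, 5, 7, 1, 8],
    "F": [4, 0, 3, 0, 0, 4, 0, 0, 0, 10, 0, 0],
    "P": [0, 6, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "S": [1, 17, 0, 8, 3, 1, 3, 0, 2, 2, 2, 0],
    "T": [5, 22, 3, 11, 1, 5, 0, 2, 2, 2, 0, 5],
    "W": [2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1],
    "Y": [1, 0, 4, 2, 0, 1, 0, 0, 2, 4, 0, 1],
    "V": [6, 3, 1, 1, 2, 15, 0, 0, 2, 12, 0, 28],
}
N_POSITIONS = 12


def column_totals(counts):
    """Number of sequences contributing to each position."""
    return [sum(row[i] for row in counts.values()) for i in range(N_POSITIONS)]


def consensus(counts):
    """Most frequent residue at each position."""
    return "".join(
        max(counts, key=lambda aa: counts[aa][i]) for i in range(N_POSITIONS)
    )


def log_odds_score(window, counts, pseudocount=1.0, background=0.05):
    """Sum of log2(observed frequency / background) over the window.

    Assumptions: pseudocount and uniform background are illustrative choices.
    """
    if len(window) != N_POSITIONS:
        raise ValueError("window must be exactly 12 residues long")
    totals = column_totals(counts)
    score = 0.0
    for i, aa in enumerate(window):
        freq = (counts[aa][i] + pseudocount) / (totals[i] + pseudocount * len(counts))
        score += math.log2(freq / background)
    return score


if __name__ == "__main__":
    print(column_totals(COUNTS))
    print(consensus(COUNTS))
    print(round(log_odds_score(consensus(COUNTS), COUNTS), 2))
```

Unlike the exact consensus pattern, this kind of score degrades gradually: a window that differs from the most common residues at a few positions still scores above an unrelated one.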