© Doug Brutlag, 1999

Biochemistry 201
Advanced Molecular Biology
(http://cmgm.stanford.edu/biochem201/)

Doug Brutlag
Departments of Biochemistry
June 4, 1999

Bioinformatics: Discovering Function from Sequence


Discovering Function from Protein Sequence

Consensus Sequences

    Zinc Finger (C2H2 type)
    C.{2,4} C.{12} H.{3,5} H

Sequence Alignments

               10        20        30        40        50
    1 VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF------DLSHGS
       |:| :|: | |:||||     | |:||| |: : :|:|  :|  |      |:  |
    2 HLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGN
               10        20        30        40        50

Sequences of Common Structure or Function

    Profiles, PSI-BLAST
    Hidden Markov Models
    [Hidden Markov model diagram: states AA1-AA6, I1-I5, D2-D5]

    Position   1   2   3   4   5   6   7   8   9  10  11  12
    A          2   1   3  13  10  12  67   4  13   9   1   2
    R          7   5   8   9   4   0   1  16   7   0   1   0
    N          0   8   0   1   0   0   0   2   1   1  10   0
    D          0   1   0   1  13   0   0  12   1   0   4   0
    C          0   0   1   0   0   0   0   0   0   2   2   1
    Q          1   1  21   8  10   0   0   7   6   0   0   2
    E          2   0   0   9  21   0   0  15   7   3   3   0
    G          9   7   1   4   0   0   8   0   0   0  46   0
    H          4   3   1   1   2   0   0   2   2   0   5   0
    I         10   0  11   1   2  10   0   4   9   3   0  16
    L         16   1  17   0   1  31   0   3  11  24   0  14
    K          3   4   5  10  11   1   1  13  10   0   5   2
    M          7   1   1   0   0   0   0   0   5   7   1   8
    F          4   0   3   0   0   4   0   0   0  10   0   0
    P          0   6   0   1   0   0   0   0   0   0   0   0
    S          1  17   0   8   3   1   3   0   2   2   2   0
    T          5  22   3  11   1   5   0   2   2   2   0   5
    W          2   0   0   0   0   0   0   0   0   1   0   1
    Y          1   0   4   2   0   1   0   0   2   4   0   1
    V          6   3   1   1   2  15   0   0   2  12   0  28

    BLOCK, Weight Matrix or Position Specific Scoring Matrix


Sequence Alignment Problem

    C A T T G
    T C A T G


Sequence Alignment (exact)

      X       220       230       240       250            X
      F--SGGNTHIYMNHVEQCKEILRREPKELCELVISGLPYKFRYLSTKE-QLK-Y
      |    |  ||| || |  |  | |||     | |     |         |   |
    GDFIHTLGDAHIYLNHIEPLKIQLQREPRPFPKLRILRKVEKIDDFKAEDFQIEGYN
    X         260       270       280       290             X


Needleman-Wunsch Algorithm (1)
Needleman-Wunsch Algorithm (2)
Needleman-Wunsch Algorithm (3)
Needleman-Wunsch Algorithm (4)


Sequence Alignment

      F--SGGNTHIYMNHVEQCKEILRREPKELCELVISGLPYKFRYLSTKE-QLK-Y
    GDFIHTLGDAHIYLNHIEPLKIQLQREPRPFPKLRILRKVEKIDDFKAEDFQIEGYN
    X 220 230 240 250 X
    | : |::|||:||:| | | |||: : :| | | ::::: |:: ||
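
As a minimal sketch of the Needleman-Wunsch global alignment named in the slides, the Python below aligns the two sequences from the Sequence Alignment Problem slide (CATTG and TCATG). The scoring values (+1 match, -1 mismatch, -1 per gap position), the function name, and the tie-breaking order in the traceback are assumptions made for illustration; the slides do not state them.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming (illustrative scoring only)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner, preferring the diagonal on ties.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = needleman_wunsch("CATTG", "TCATG")
print(best)   # 2 under the assumed scoring
print(row_a)  # -CATTG
print(row_b)  # TCA-TG
```

Other gap placements reach the same score here (for example pairing the first T of CATTG with the T of TCATG instead); which one is reported depends only on the traceback's tie-breaking order.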
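
The C2H2 zinc finger consensus on the Consensus Sequences slide, C.{2,4} C.{12} H.{3,5} H, is already written in regular-expression syntax, so it can be tried directly. The sketch below assumes the spaces in the slide's pattern are visual separators only, and uses two made-up test strings (not real proteins); the helper name `find_c2h2` is hypothetical.

```python
import re

# re.VERBOSE makes the regex engine ignore the spaces, so the pattern can be
# kept exactly as it is printed on the slide.
C2H2 = re.compile(r"C.{2,4} C.{12} H.{3,5} H", re.VERBOSE)


def find_c2h2(seq):
    """Return (start, end, matched_text) for each non-overlapping match."""
    return [(m.start(), m.end(), m.group()) for m in C2H2.finditer(seq)]


# Synthetic sequences for illustration only.
# hit:  Cys, 2 residues, Cys, 12 residues, His, 3 residues, His.
hit = "MK" + "C" + "PE" + "C" + "GKSFSRSDELTR" + "H" + "QRT" + "H" + "TG"
# miss: same, but 5 residues between the two Cys, outside the {2,4} range.
miss = "MK" + "C" + "PEAAA" + "C" + "GKSFSRSDELTR" + "H" + "QRT" + "H" + "TG"

print(find_c2h2(hit))   # [(2, 23, 'CPECGKSFSRSDELTRHQRTH')]
print(find_c2h2(miss))  # []
```

A consensus pattern of this kind gives a yes/no answer with no score, in contrast to the alignments and the weight matrix shown on the same slide.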