Computational Biology, Part 3
Sequence Alignment
Robert F. Murphy
Copyright 1996, 1999-2009. All rights reserved.

Outline
  Sequence Alignment
  Example sequence alignment
  Matching Similarity vs. Identity
  Global vs. Local Alignment
  Why do sequence alignments?
  Origin of similar genes
  Methods for Pairwise Alignment
  Sequence comparison with dot matrices
  Examples for protein sequences
  Interpretation of dot matrices
  Uses for dot matrices
  Filtering to remove “noise”
  Example spreadsheet with window
  How do we choose a window size?
  How do we choose a threshold value?
  Dot matrix analysis with Matlab bioinformatics toolbox
  Matlab code
  Dot matrix
  Dot matrix analysis with Dotmatcher

Sequence Alignment
Definition: Procedure for comparing two or more sequences by searching for a series of individual characters or character patterns that are in the same order in the sequences
  Pair-wise alignment: compare two sequences
  Multiple sequence alignment: compare more than two sequences

Example sequence alignment
Task: align “abcdef” with “abdgf”
Write second sequence below the first

    abcdef
    abdgf

Move sequences to give maximum match between them
Show characters that match using vertical bar

Example sequence alignment

    abcdef
    ||
    abdgf

Insert gap between b and d on lower sequence to allow d and f to align

Example sequence alignment

    abcdef
    || | |
    ab-dgf

Note e and g don’t match

Matching Similarity vs. Identity
Alignments can be based on finding only identical characters, or (more commonly) can be based on finding similar characters
More on how to define similarity later

Global vs. Local Alignment
We distinguish
  Global alignment algorithms which optimize overall alignment between two sequences
  Local alignment algorithms which seek only relatively conserved pieces of sequence
    Alignment stops at the ends of regions of strong similarity
    Favors finding conserved patterns in otherwise different pairs of sequences

Global vs. Local Alignment
Global

    LGPSSKQTGKGS-SRIWDN
    |    |  |||   |  |
    LN-ITKSAGKGAIMRLGDA

Local

    --------GKG--------
            |||
    --------GKG--------

Global vs. Local Alignment
Global

    LGPSSKQTGKGS-SRIWDN
    |    |  |||   |  |
    LN-ITKSAGKGAIMRLGDA

Local

    -------TGKG--------
            |||
    -------AGKG--------

Why do sequence alignments?
To find whether two (or more) genes or proteins are evolutionarily related to each other
To find structurally or functionally similar regions within proteins

Origin of similar genes
Similar genes arise by gene duplication
  Copy of a gene inserted next to the original
  Two copies mutate independently
  Each can take on separate functions
  All or part can be transferred from one part of genome to another
http://fig.cox.miami.edu/~cmallery/150/gene/c7.19.19.gene.family.jpg

Methods for Pairwise Alignment
Dot matrix analysis
Dynamic Programming
Word or k-tuple methods (FASTA and BLAST)

Sequence comparison with dot matrices
Goal: Graphically display regions of similarity between two sequences (e.g., domains in common between two proteins of suspected similar function)

Sequence comparison with dot matrices
Basic Method: For two sequences of lengths M and N, lay out an M by N grid (matrix) with one sequence across the top and one sequence down the left side. For each position in the grid, compare the sequence elements at the top (column) and to the left (row). If and only if they are the same, place a dot at that position.

Examples for protein sequences
(Demonstration A6, Sequence 1 vs. 2)
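
The basic dot matrix method described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's own code (the slides use a spreadsheet, Matlab, and Dotmatcher for this); the function names `dot_matrix` and `render` are hypothetical, and the comparison is plain character identity with no window or threshold filtering.

```python
def dot_matrix(seq_top, seq_left):
    """Build the grid for the basic dot matrix method.

    grid[row][col] is True if and only if the element of seq_left at
    `row` is identical to the element of seq_top at `col`.
    """
    return [[left == top for top in seq_top] for left in seq_left]


def render(seq_top, seq_left):
    """Return the grid as text: '*' marks a dot (match), '.' an empty cell."""
    grid = dot_matrix(seq_top, seq_left)
    lines = ["  " + seq_top]  # one sequence across the top
    for residue, row in zip(seq_left, grid):
        # the other sequence runs down the left side, one element per row
        lines.append(residue + " " + "".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Same toy sequences as the alignment example in the slides.
    print(render("abcdef", "abdgf"))
    #   abcdef
    # a *.....
    # b .*....
    # d ...*..
    # g ......
    # f .....*
```

In this toy output the dots for a and b lie on the main diagonal, while the dots for d and f lie on a diagonal shifted one column to the right. That shift corresponds to the gap inserted between b and d in the earlier alignment example.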