CMU BSC 03510 - Sequence Alignment
Pages 44

This preview shows page 1-2-3-21-22-23-42-43-44 out of 44 pages.


Unformatted text preview:

Slide titles:
Sequence Alignment
Outline
From LCS to Alignment: Change up the Scoring
Simple Scoring
The Global Alignment Problem
Scoring Matrices
Measuring Similarity
Percent Sequence Identity
Making a Scoring Matrix
Scoring Matrix: Example
Conservation
Scoring matrices
PAM
PAMX
BLOSUM
The Blosum50 Scoring Matrix
Local vs. Global Alignment
Slide 19
Local vs. Global Alignment (cont’d)
Local Alignment: Example
Local Alignments: Why?
The Local Alignment Problem
The Problem with this Problem
Slide 25
Slide 26
Slide 27
Slide 28
Slide 29
Slide 30
Local Alignment: Running Time
Local Alignment: Free Rides
The Local Alignment Recurrence
Slide 34
Scoring Indels: Naive Approach
Affine Gap Penalties
Accounting for Gaps
Slide 38
Affine Gap Penalties and Edit Graph
Adding “Affine Penalty” Edges to the Edit Graph
Manhattan in 3 Layers
Affine Gap Penalties and 3 Layer Manhattan Grid
Switching between 3 Layers
The 3-leveled Manhattan Grid
Affine Gap Penalty Recurrences

An Introduction to Bioinformatics Algorithms, www.bioalgorithms.info

Sequence Alignment

Outline
• Global Alignment
• Scoring Matrices
• Local Alignment
• Alignment with Affine Gap Penalties

From LCS to Alignment: Change up the Scoring
• The Longest Common Subsequence (LCS) problem, the simplest form of sequence alignment, allows only insertions and deletions (no mismatches).
• In the LCS Problem, we scored 1 for matches and 0 for indels.
• Consider penalizing indels and mismatches with negative scores.
• Simplest scoring schema:
  +1 : match premium
  -μ : mismatch penalty
  -σ : indel penalty

Simple Scoring
• When mismatches are penalized by -μ, indels are penalized by -σ, and matches are rewarded with +1, the resulting score is:

$$\text{score} = \#\text{matches} - \mu \cdot \#\text{mismatches} - \sigma \cdot \#\text{indels}$$

The Global Alignment Problem
Find the best alignment between two strings under a given scoring schema.
Input: Strings v and w and a scoring schema
Output: Alignment of maximum score

Edge weights in the edit graph:
• vertical and horizontal edges (↑, →): -σ
• diagonal edges: +1 if match, -μ if mismatch

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + 1 & \text{if } v_i = w_j \\
s_{i-1,j-1} - \mu & \text{if } v_i \neq w_j \\
s_{i-1,j} - \sigma \\
s_{i,j-1} - \sigma
\end{cases}
$$

μ : mismatch penalty
σ : indel penalty

Scoring Matrices
To generalize scoring, consider a (4+1) x (4+1) scoring matrix δ for nucleotide sequences. For an amino acid sequence alignment, the scoring matrix would be of size (20+1) x (20+1). The extra row and column hold the scores for comparison with a gap character “-”.
This simplifies the recurrence as follows:

$$
s_{i,j} = \max
\begin{cases}
s_{i-1,j-1} + \delta(v_i, w_j) \\
s_{i-1,j} + \delta(v_i, -) \\
s_{i,j-1} + \delta(-, w_j)
\end{cases}
$$

Measuring Similarity
• Measuring the extent of similarity between two sequences
  • Based on percent sequence identity
  • Based on conservation

Percent Sequence Identity
• The extent to which two nucleotide or amino acid sequences are invariant

  A C C T G A G - A G
  A C G T G - G C A G

  70% identical (7 of the 10 columns match; the other columns are one mismatch and two indels)

Making a Scoring Matrix
• Scoring matrices are created based on biological evidence.
• Alignments can be thought of as two sequences that differ due to mutations.
• Some of these mutations have little effect on the protein’s function; therefore some penalties, δ(vi, wj), will be less harsh than others.

Scoring Matrix: Example

      A   R   N   K
  A   5  -2  -1  -1
  R   -   7  -1   3
  N   -   -   7   0
  K   -   -   -   6

• Notice that although R and K are different amino acids, they have a positive score.
• Why? They are both positively charged amino acids, so substituting one for the other will not greatly change the function of the protein.

Conservation
• Amino acid changes that tend to preserve the physico-chemical properties of the original residue
  • Polar to polar
    • aspartate → glutamate
  • Nonpolar to nonpolar
    • alanine → valine
  • Similarly behaving residues
    • leucine to isoleucine

Scoring matrices
• Amino acid substitution matrices
  • PAM
  • BLOSUM
• DNA substitution matrices
  • DNA is less conserved than protein sequences
  • Less effective to compare coding regions at nucleotide level

PAM
• Point Accepted Mutation (Dayhoff et al.)
• 1 PAM = PAM1 = 1% average change of all amino acid positions
• After 100 PAMs of evolution, not every residue will have changed:
  • some residues may have mutated several times
  • some residues may have returned to their original state
  • some residues may not have changed at all

PAMX
• $\text{PAM}_x = (\text{PAM}_1)^x$
• $\text{PAM}_{250} = (\text{PAM}_1)^{250}$
• PAM250 is a widely used scoring matrix:

         Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys ...
           A   R   N   D   C   Q   E   G   H   I   L   K ...
Ala A     13   6   9   9   5   8   9  12   6   8   6   7 ...
Arg R      3  17   4   3   2   5   3   2   6   3   2   9
Asn N      4   4   6   7   2   5   6   4   6   3   2   5
Asp D      5   4   8  11   1   7  10   5   6   3   2   5
Cys C      2   1   1   1  52   1   1   2   2   2   1   1
Gln Q      3   5   5   6   1  10   7   3   7   2   3   5
...
Trp W      0   2   0   0   0   0   0   0   1   0   1   0
Tyr Y      1   1   2   1   3   1   1   1   3   2   2   1
Val V      7   4   4   4   4   4   4   4   5   4  15  10

BLOSUM
• Blocks Substitution Matrix
• Scores derived from observations of the frequencies of substitutions in blocks of local alignments in related proteins
• Matrix name indicates evolutionary distance
  • BLOSUM62 was created using sequences sharing no more than 62% identity

The Blosum50 Scoring Matrix

Local vs. Global Alignment
• The Global Alignment Problem tries to find the longest path between vertices (0,0) and (n,m) in the edit graph.
• The Local
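The global alignment recurrence from the slides (+1 for a match, -μ for a mismatch, -σ for an indel) can be sketched in Python as a dynamic program over the edit graph. This is a minimal illustration, not the course's reference implementation: the function name, the default penalty values, and the boundary initialization (a prefix aligned against the empty string costs one indel per character) are assumptions.

```python
def global_alignment_score(v, w, mu=1.0, sigma=1.0):
    """Return the score of the best global alignment of strings v and w.

    Scoring: +1 per match, -mu per mismatch, -sigma per indel.
    The default values of mu and sigma are illustrative assumptions.
    """
    n, m = len(v), len(w)
    # s[i][j] = best score for aligning the prefixes v[:i] and w[:j],
    # i.e. the longest path from (0, 0) to (i, j) in the edit graph.
    s = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Boundary (assumed): aligning a prefix with the empty string
    # uses only indels.
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] - sigma
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] - sigma
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Diagonal edge: +1 if the characters match, -mu otherwise.
            diagonal = 1.0 if v[i - 1] == w[j - 1] else -mu
            s[i][j] = max(
                s[i - 1][j - 1] + diagonal,  # match or mismatch
                s[i - 1][j] - sigma,         # indel: v[i-1] against a gap
                s[i][j - 1] - sigma,         # indel: w[j-1] against a gap
            )
    return s[n][m]
```

Setting `sigma=0.0` and `mu=float("inf")` forbids mismatches and makes indels free, which reduces the score to the LCS length mentioned at the start of the slides; for example, `global_alignment_score("ATCTGAT", "TGCATA", mu=float("inf"), sigma=0.0)` returns `4.0`.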

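The simple score (#matches - μ·#mismatches - σ·#indels) and percent sequence identity can both be computed by classifying the columns of a finished alignment. The sketch below is illustrative only: the function names are hypothetical, the gap is written as an ASCII "-", and identity is taken as matching columns divided by the total number of alignment columns, which is one of several conventions but is the one consistent with the 70% figure on the slide.

```python
def column_counts(row_v, row_w, gap="-"):
    """Count (matches, mismatches, indels) over the columns of an alignment.

    row_v and row_w are the two rows of the alignment, of equal length,
    with the gap character marking indels.
    """
    if len(row_v) != len(row_w):
        raise ValueError("alignment rows must have the same length")
    matches = mismatches = indels = 0
    for a, b in zip(row_v, row_w):
        if a == gap or b == gap:
            indels += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels


def percent_identity(row_v, row_w, gap="-"):
    """Matching columns as a percentage of all alignment columns (assumed convention)."""
    if not row_v:
        raise ValueError("alignment must have at least one column")
    matches, _, _ = column_counts(row_v, row_w, gap)
    return 100.0 * matches / len(row_v)


def simple_score(row_v, row_w, mu, sigma, gap="-"):
    """#matches - mu * #mismatches - sigma * #indels."""
    matches, mismatches, indels = column_counts(row_v, row_w, gap)
    return matches - mu * mismatches - sigma * indels


# The alignment from the Percent Sequence Identity slide.
top = "ACCTGAG-AG"
bottom = "ACGTG-GCAG"
print(column_counts(top, bottom))     # (7, 1, 2)
print(percent_identity(top, bottom))  # 70.0
```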

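The relation PAMx = (PAM1)^x is a matrix power: applying the one-step mutation probabilities x times. The sketch below illustrates this on a made-up two-letter alphabet in which each letter changes with probability 1% per step. The toy matrix and the helper names are assumptions for illustration and are not the Dayhoff data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, x):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while x > 0:
        if x & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        x >>= 1
    return result


# Hypothetical "PAM1" for a two-letter alphabet: each letter stays the same
# with probability 0.99 and changes with probability 0.01 per step.
toy_pam1 = [[0.99, 0.01],
            [0.01, 0.99]]

toy_pam250 = mat_pow(toy_pam1, 250)
# Probability that a position shows its original letter after 250 steps.
print(toy_pam250[0][0])
```

In this toy model the diagonal entry after 250 steps stays slightly above 0.5 rather than dropping to zero, which mirrors the slide's point that after many PAMs some residues have mutated several times, some have returned to their original state, and some have not changed at all.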