CMU BSC 03510 - Lectures
Pages 53

Computational Biology, Part 4
Similarity Matrices/Statistics of Pattern Appearance

Robert F. Murphy
Copyright 1996-2007. All rights reserved.

Outline
- Deriving and Using Similarity Matrices
- Origin of PAM matrices
- Use of PAM matrices
- Dayhoff PAM250 similarity matrix in log-odds form
- Dayhoff PAM250 similarity matrix (partial)
- Updated PAM matrices
- BLOSUM62 matrix
- Origin of BLOSUM matrices
- BLOSUM62 in log-odds form
- Comparison of PAM250 and BLOSUM62
- Sequence Analysis Tasks
- Statistics of pattern appearance
- Determining mononucleotide frequencies
- Determining dinucleotide frequencies
- Determining conditional dinucleotide probabilities
- Illustration of probability calculation
- Interactive Demonstration
- Illustration using dinucleotide probabilities
- Expansions
- Proof
- Need further convincing?
- More complicated probability illustration
- Illustration (continued)
- Multiply then add
- How do we program this?
- Will this work?
- No to for
- A recursive solution
- Site Probability Calculation via Recursion
- PossibleSites.c
- Another illustration
- Expected number and spacing
- Probability of consecutive matches
- Expected longest match length
- Karlin-Altschul formulation
- Estimating Significance of Local Alignment
- Reading for next class

Deriving and Using Similarity Matrices

Origin of PAM matrices
- Take an aligned set of closely related proteins
  - 71 groups of proteins that were at least 85% similar
- Each group of sequences was organized into a phylogenetic tree
  - Creates a model of the order in which substitutions occurred
- Count the number of changes of each amino acid into every other amino acid
- Each substitution is considered to be an “accepted mutation”: an amino acid change “accepted” by natural selection

Origin of PAM matrices
- For each group of proteins, find the “exposure to mutation” for each amino acid. This is the product of
  - the frequency of each amino acid in that group
  - the number of all amino acid changes per 100 residues (total number of amino acid changes divided by the combined length of all sequences in that group, then times 100)
- For each group, divide the counts of changes for each amino acid pair by the exposure to mutation of the “original” amino acid
- Average these across all groups to create the PAM1 matrix (Point Accepted Mutation at 1% change)

Origin of PAM matrices
- This table is equivalent to a transition matrix for a first-order Markov model of protein sequence evolution with a 1% overall probability of change
- Appropriate for comparing sequences separated by an evolutionary distance that would yield changes in 1% of the positions
- Note that PAM1 is not symmetric
- To compare sequences across greater distances, can multiply the PAM1 matrix by itself (if the Markov model is correct)

Origin of PAM matrices
- Squaring PAM1 considers all the ways that an “original” amino acid may have changed over two steps of 1% mutation rate each
- For staying the same, take the probability that it didn’t change in the first step times the probability that it didn’t change in the second step, then add, for every possible change in the first step, the probability of that change times the probability of changing back
- For changing from x -> y, consider the sum of products of all the changes that could have happened in the first step (x -> z) times the probability of changing from that into y (z -> y)
- This gives PAM2 (still not symmetric!)

Origin of PAM matrices
- Can raise PAM1 to any power (e.g., PAM250)
- Major effect of raising a PAM matrix to a power is to decrease the probability that a particular amino acid is unchanged (and increase the probabilities for it to change into all others)

Use of PAM matrices
- Sum of the product of diagonal elements times overall frequency of each amino acid gives expected degree of similarity between two
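
The squaring and powering steps described for PAM1 can be sketched in Python on a toy example. Everything specific here is an assumption made for illustration, not data from the lecture: the three-letter alphabet, the matrix values in TOY_PAM1, the frequencies in TOY_FREQS, and the function names. The sketch also assumes rows index the "original" letter and columns the replacement, so each row sums to 1; published PAM tables may use the transposed layout.

```python
# Toy illustration only: a made-up 3-letter alphabet, not real amino acid data.
# Convention assumed here: TOY_PAM1[i][j] = P(original letter i is replaced by j).
TOY_FREQS = [0.5, 0.3, 0.2]          # hypothetical overall letter frequencies
TOY_PAM1 = [
    [0.990, 0.006, 0.004],
    [0.008, 0.990, 0.002],
    [0.005, 0.005, 0.990],
]                                    # non-symmetric, 1% overall chance of change


def mat_mul(a, b):
    """Multiply two square matrices: entry (x, y) sums over all intermediates z."""
    n = len(a)
    return [[sum(a[x][z] * b[z][y] for z in range(n)) for y in range(n)]
            for x in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated multiplication."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result


def expected_identity(m, freqs):
    """Sum of each diagonal element times the frequency of that letter."""
    return sum(f * m[i][i] for i, f in enumerate(freqs))


if __name__ == "__main__":
    for steps in (1, 2, 250):
        pam_n = mat_pow(TOY_PAM1, steps)
        print(steps, expected_identity(pam_n, TOY_FREQS))
```

In mat_mul, the sum over z is the "all the changes that could have happened in the first step" term from the slide. Comparing expected_identity for powers 1 and 250 shows the diagonal shrinking as the power grows.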

