Berkeley STATISTICS 246  Molecular evolution
Lecture Notes
 University of California, Berkeley
 Statistics 246  Statistical Genetics
Molecular evolution (cont.)
Lecture 14, Statistics 246, March 9, 2004

Scoring matrices

Scoring (or substitution) matrices are expressions measuring the evolutionary similarity of amino acids or nucleotide bases at different evolutionary distances. They take the form

$$S(a,b) = \log \frac{f(a,b)}{\pi(a)\,\pi(b)},$$

where $f(a,b)$ is a joint distribution on pairs and $\pi$ is the background distribution. Usually $f$ is symmetric and $\pi$ is the common marginal distribution. At times people introduce scoring matrices without a probabilistic justification, and we will mention a couple later.

The two most widely used scoring matrices for sequence alignment, the PAM and the BLOSUM series, are associated with implicit statistical tests based on models. They test a null hypothesis of non-homology versus an alternative of homology (similarity due to common ancestry) at a given evolutionary distance for the sequences being compared.

Scoring matrices for alignment

The statistical ideas underlying the scoring matrices are of interest and value in themselves. The PAM series are based on the Markov chain models from molecular evolution that we met in the last lecture, i.e. they are of the type we used to correct observed sequence distances (Jukes-Cantor, Kimura, etc.), and which we will also see used in maximum likelihood phylogenetic inference. The derivation of the BLOSUM series is different, but nonetheless interesting.

As we will not be discussing either local or global sequence alignment, pairwise or multiple, I'll refer you to earlier versions of this course or the many excellent discussions in the literature on the topic. These include, but are not restricted to, the well-known books by M. S. Waterman (1995) and by R. Durbin et al. (1998).

How scoring matrices work

134 LQQGELDLVMTSDILPRSELHYSPMFDFEVRLVLAPDHPLASKTQITPEDLASETLLI
137 LDSNSVDLVLMGVPPRNVEVEAEAFMDNPLVVIAPPDHPLAGERAISLARLAEETFVM

BLOSUM62

   C  S  T  P  A  G  N  D  E  Q  H  R  K  M  I  L  V  F  Y  W
C  9 -1 -1 -3  0 -3 -3 -3 -4 -3 -3 -3 -3 -1 -1 -1 -1 -2 -2 -2
S     4  1 -1  1  0  1  0  0  0 -1 -1  0 -1 -2 -2 -2 -2 -2 -3
T        5 -1  0 -2  0 -1 -1 -1 -2 -1 -1 -1 -1 -1  0 -2 -2 -2

Example: the aligned pair D, D scores 6.
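
To make the log-odds form of a scoring matrix concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the construction of PAM or BLOSUM: the joint distribution on pairs is taken from the Jukes-Cantor nucleotide model with a uniform background, logarithms are base 2 (the notes do not fix a base), and an ungapped alignment is scored by summing the scores of its aligned pairs. The function names `jukes_cantor_joint`, `log_odds_scores` and `score_gapless` are hypothetical names chosen for this example.

```python
import math

NUCLEOTIDES = "ACGT"


def jukes_cantor_joint(t, alphabet=NUCLEOTIDES):
    """Joint distribution f(a, b) on pairs under the Jukes-Cantor model.

    t is the evolutionary distance in expected substitutions per site.
    The background (stationary) distribution is uniform, so f is symmetric.
    """
    n = len(alphabet)
    pi = 1.0 / n
    decay = math.exp(-n * t / (n - 1))   # exp(-4t/3) for nucleotides
    p_same = pi + (1.0 - pi) * decay     # P(b = a | a) after distance t
    p_diff = pi - pi * decay             # P(b | a) for each specific b != a
    return {
        (a, b): pi * (p_same if a == b else p_diff)
        for a in alphabet
        for b in alphabet
    }


def log_odds_scores(joint):
    """Scores S(a, b) = log2( f(a, b) / (pi(a) * pi(b)) ).

    The background pi is taken to be the marginal of the joint distribution.
    Base-2 logarithms are an assumption of this sketch.
    """
    background = {}
    for (a, _b), p in joint.items():
        background[a] = background.get(a, 0.0) + p
    return {
        (a, b): math.log2(p / (background[a] * background[b]))
        for (a, b), p in joint.items()
    }


def score_gapless(x, y, scores):
    """Sum the pair scores of two equal-length sequences aligned without gaps."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(scores[(a, b)] for a, b in zip(x, y))


if __name__ == "__main__":
    scores = log_odds_scores(jukes_cantor_joint(t=0.5))
    print("match score   :", scores[("A", "A")])
    print("mismatch score:", scores[("A", "C")])
    print("alignment     :", score_gapless("ACGTAC", "ACGTTC", scores))
```

A positive total favours the alternative of homology at the chosen distance, and a negative total favours the null of non-homology. As `t` grows, the joint distribution approaches independence and every score shrinks towards zero.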
