Due Feb. 25, 2014    MCB 432    Name: ______
Assignment 4: Binomial and Poisson Distributions (key)
(30 pt total)

This assignment assumes that you have read the handout on the Binomial and Poisson Distributions from the course WWW site. Show your work. Incorrect answers can only get partial credit if work is shown. Report your final answers to 3 significant figures, unless otherwise stated. Use decimal fractions (as in 0.0123) for probabilities $\geq 0.01$, and scientific notation (as in $1.23 \times 10^{-6}$) for probabilities $< 0.01$. I am not amused by probabilities less than 0 or greater than 1.

1. When aligning two DNA sequences with equal frequencies of all 4 bases (A, C, G and T), there is a random probability of 0.25 that two bases are identical ($p = 0.25$), and a probability of 0.75 that two bases differ ($q = 0.75$). Use the binomial distribution to answer the following:

1a. What is the probability that there will be exactly 15 identities in 15 aligned nucleotides of random DNA sequence? (2 pt)

$P(m) = \frac{n!}{m!\,(n-m)!}\, p^{m} q^{n-m}$, with $n = 15$, $m = 15$, $p = 0.25$, $q = 0.75$

$P(15) = \frac{15!}{15!\,(15-15)!}\, 0.25^{15}\, 0.75^{15-15} = \frac{15!}{15!\,0!}\, 0.25^{15}\, 0.75^{0} = 0.25^{15} = 9.31 \times 10^{-10}$

1b. What is the probability that there will be exactly 14 identities in 15 aligned nucleotides of random DNA sequence? (2 pt)

$P(14) = \frac{15!}{14!\,(15-14)!}\, 0.25^{14}\, 0.75^{15-14} = \frac{15!}{14!\,1!}\, 0.25^{14}\, 0.75^{1} = 15 \times 0.25^{14} \times 0.75^{1} = 4.19 \times 10^{-8}$ (exactly 45 times the answer to 1a)

1c. What is the probability of 13 or more identities in the 15 aligned positions of random DNA sequence? (2 pt)

$P(\geq 13) = P(13) + P(14) + P(15)$

$P(13) = \frac{15!}{13!\,(15-13)!}\, 0.25^{13}\, 0.75^{15-13} = \frac{15!}{13!\,2!}\, 0.25^{13}\, 0.75^{2} = (15 \times 7) \times 0.25^{13} \times 0.75^{2} = 8.80 \times 10^{-7}$

$P(\geq 13) = P(13) + P(14) + P(15) = (8.80 + 0.419 + 0.009) \times 10^{-7} = 9.23 \times 10^{-7}$

2. In problem 1 we had equal quantities of the 4 nucleotides, so the random probability of identical nucleotides in two aligned sequences is 0.25 (i.e., $p = 0.25$). Consider a DNA database in which the ratio of A:C:G:T in the sequences is 0.35:0.15:0.15:0.35 (as would be the case for a genome sequence from an organism that has a 30% G+C content), and searching this database with a query sequence with A:C:G:T equal to 0.15:0.35:0.35:0.15. If I take a random nucleotide
from the database and a random nucleotide from the query sequence, what is the probability that they are the same nucleotide? For full credit you must describe your approach, or show enough work to make clear how you solved the problem. (2 pt)

There are 4 ways to get identical nucleotides: database A and query A, database C and query C, database G and query G, database T and query T, so

$P(\text{identical}) = P(A,A) + P(C,C) + P(G,G) + P(T,T)$

If $f_d(N)$ is the frequency of N in the database and $f_q(N)$ is the frequency of N in the query,

$P(\text{identical}) = f_d(A) f_q(A) + f_d(C) f_q(C) + f_d(G) f_q(G) + f_d(T) f_q(T)$
$= 0.35 \times 0.15 + 0.15 \times 0.35 + 0.15 \times 0.35 + 0.35 \times 0.15 = 0.210$

3. Consider a die with the numbers 1-10, each number supposedly equally likely. Use the Binomial to answer the following questions, which consider whether the die is actually fair.

3a. The die is thrown 40 times, and the number 7 does not appear. If the die were fair, what would be the probability of zero occurrences of a 7 in the 40 throws? (2 pt)

$n = 40$, $m = 0$, $p = 0.1$, $q = 0.9$

$P(0\ \text{7's}) = \frac{40!}{0!\,(40-0)!}\, 0.10^{0}\, 0.9^{40-0} = \frac{40!}{0!\,40!}\, 0.10^{0}\, 0.9^{40} = 0.9^{40} = 0.0148$ $(= 1.48 \times 10^{-2})$

3b. Is this probability significantly rare ($P < 0.05$)? (1 pt)

Yes, it is significantly rare.

3c. In 3a, the question was not formulated until we saw that 7 was rare (a common statistical mistake). We would have been equally concerned if any of the 10 numbers had not appeared. We can work out this latter probability in steps. What is the probability that a 1 appears at least once? (2 pt)

$P(\geq 1\ \text{1's}) = 1 - P(0\ \text{1's}) = 1 - 0.0148 = 0.985$

3d. What is the probability that a 2 appears at least once? (1 pt)

$P(\geq 1\ \text{2's}) = 1 - P(0\ \text{2's}) = 1 - 0.0148 = 0.985$ (the same as 3c)

3e. What is the probability that a 1 appears at least once and a 2 appears at least once? (2 pt)

$P(\geq 1\ \text{1's and} \geq 1\ \text{2's}) = P(\geq 1\ \text{1's}) \times P(\geq 1\ \text{2's}) = 0.985 \times 0.985 = (0.985)^{2} = 0.970$

3f. Extending the idea in 3e, what is the probability that all 10 numbers appear at least once each? (2 pt)

$P(\geq 1\ \text{1's and} \geq 1\ \text{2's and} \ldots) = P(\geq 1\ \text{1's}) \times P(\geq 1\ \text{2's}) \times \cdots = (0.985)^{10} = 0.860$

3g. Given the answer to 3f, what is the probability that at least one number does not appear at all? (2 pt)

P(at
least one number does not appear) = 1 - P(all numbers appear) = 1 - 0.860 = 0.140

3h. Is this probability (the answer to 3g) significantly rare ($P < 0.05$)? (1 pt)

No, it is not significantly rare.

4. Assuming the same die as in problem 3, use the Poisson distribution to answer the following:

4a. What is the probability that there will be 0 occurrences of a 7 in 40 throws? (2 pt)

$\lambda = 40\ \text{throws} \times 1/10\ \text{expected per throw} = 4$, $m = 0$

$P(m) = \frac{\lambda^{m} e^{-\lambda}}{m!}$

$P(0) = \frac{4^{0} e^{-4}}{0!} = e^{-4} = 0.0183$ $(= 1.83 \times 10^{-2})$ (not quite the same as the exact answer in 3a)

4b. What is the probability that there will be 1 or fewer occurrences of a 7 in 40 throws? (2 pt)

$P(\leq 1) = P(0) + P(1)$

$P(1) = \frac{4^{1} e^{-4}}{1!} = 4 e^{-4} = 0.0733$ $(= 7.33 \times 10^{-2})$

$P(\leq 1) = 0.0183 + 0.0733 = 0.0916$ $(= 9.16 \times 10^{-2})$

4c. What is the probability that there will be 2 or more occurrences of a 7 in 40 throws? (2 pt)

$P(\geq 2) = 1 - P(\leq 1) = 1 - 0.0916 = 0.908$

5. BLAST reports the significance of a similarity in terms of an E-value, that is, the expected number (think Poisson) of random matches with the observed match score or greater.

5a. If the E-value of a match is $1 \times 10^{-17}$, what is the probability of 1 or more matches with this score or greater for a random query sequence? (2 pt)

P …
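
The arithmetic in problems 1 through 4 can be checked numerically. The following is a minimal Python sketch, not part of the original key: the helper names binomial_pmf and poisson_pmf and the variable names are hypothetical, chosen for illustration. It assumes, as the key appears to do in 3e-3g, that the rounded value 0.985 is carried forward and that the "appears at least once" events are multiplied as if independent.

```python
from math import comb, exp, factorial


def binomial_pmf(m, n, p):
    """Probability of exactly m successes in n trials (hypothetical helper)."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)


def poisson_pmf(m, lam):
    """Probability of exactly m events when lam are expected (hypothetical helper)."""
    return lam**m * exp(-lam) / factorial(m)


# Problem 1: identities in 15 aligned random nucleotides, p = 0.25
p1a = binomial_pmf(15, 15, 0.25)
p1b = binomial_pmf(14, 15, 0.25)
p1c = sum(binomial_pmf(m, 15, 0.25) for m in (13, 14, 15))

# Problem 2: sum over the four bases of database frequency x query frequency
f_db = {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35}
f_query = {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}
p2 = sum(f_db[base] * f_query[base] for base in "ACGT")

# Problem 3: fair 10-sided die thrown 40 times
p3a = binomial_pmf(0, 40, 0.1)
p3c = round(1 - p3a, 3)   # assumption: rounded to 0.985 before reuse, as in the key
p3f = p3c**10             # assumption: the ten events are treated as independent
p3g = 1 - p3f

# Problem 4: Poisson with an expected count of 40 * (1/10) = 4
lam = 40 * (1 / 10)
p4a = poisson_pmf(0, lam)
p4b = poisson_pmf(0, lam) + poisson_pmf(1, lam)
p4c = 1 - p4b

for label, value in [
    ("1a", p1a), ("1b", p1b), ("1c", p1c), ("2", p2),
    ("3a", p3a), ("3c", p3c), ("3f", p3f), ("3g", p3g),
    ("4a", p4a), ("4b", p4b), ("4c", p4c),
]:
    # .3g prints 3 significant figures, matching the reporting rule above
    print(f"{label}: {value:.3g}")
```

Running the script prints each answer to 3 significant figures for comparison with the worked values. Problem 5 is left out because its answer is cut off in this copy.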

