03-511/711 Computational Genomics and Molecular Biology, Fall 2002 1Problem Set 0This homework is intended to be a self-administered placement quiz, to help you (and me) determineif you have the background for the course or need to read additional material. Collaboration is notallowed on this homework. Due in class on Thursday, September 12th1. (a) Given only the first two nucleotides of a codon, in which cases in the genetic code wouldyou fail to know the amino acid specified by that codon?(b) If you knew the amino acid specified by a codon, in which cases would you be unable todetermine its first two nucleotides?2. (a) State five ways in which genes and genomes differ in eukaryotic and prokaryotic cells.(b) When is the statement “a gene encodes a single protein” false?(c) Why is it possible for transcription and translation of the same gene to occur simulta-neously in bacteria but not eukaryotes?03-511/711 Computational Genomics and Molecular Biology, Fall 2002 23. (a) Considering both strands and both orientations, how many potential coding regions arethere in the double-stranded DNA sequence, shown below?Label these subsequences with arrows.TAC ATG ATC ATT TCA CGG AAT TTC TAG CAT GTAATG TAC TAG TAA AGT GCC TTA AAG ATC GTA CAT(b) You determine in the laboratory that this double-stranded DNA sequence produces, invivo, a polypeptide that is five amino acids long. Which strand of DNA is transcribed,and in which direction?Label the 5’ and 3’ ends of each strand:TAC ATG ATC ATT TCA CGG AAT TTC TAG CAT GTAATG TAC TAG TAA AGT GCC TTA AAG ATC GTA CATGive the amino acid sequence of the polypeptide encoded.(c) If an inversion occurs between the second and third triplets from the left and right ends,respectively, and the same strand of DNA is transcribed (starting at the same positionas in (b)), how long will the resultant polypeptide be?Give the amino acid sequence of the polypeptide encoded.03-511/711 Computational Genomics and Molecular Biology, Fall 2002 34. Bob’s blood type is AB. Alice has type A blood.(a) From this information, can we determine Bob’s genotype uniquely? If so, what allelesdoes Bob have? If not, what allelic types are consistent with the phenotype?(b) From this information, can we determine Alice’s genotype uniquely? If so, what allelesdoes Alice have?(c) Alice and Bob have a child with type B blood. Now, is it possible to uniquely determinethe blood types of Alice, Bob and their child? What are they?5. (a) Give the number of edges, in terms of k, in a binary tree with k leaves.(b) What is the maximum number of leaves in binary a tree of depth d?(c) Which of the following data structures would you use to keep track of the next node tovisit in implementing a breadth-first-search?A depth-first-search?i. A sparse matrixii. A stackiii. A hash tableiv. A queuev. A heap03-511/711 Computational Genomics and Molecular Biology, Fall 2002 46. A clique is an undirected graph G = (V, E) in which every pair of vertices is connected by anedge.(a) What is the degree of each vertex, v, in terms of |V |?(b) Give an expression for the number of edges, |E|, in terms of deg(vi).(c) Give an expression for the number of edges, |E|, in terms of |V |.(d) Which data structure would be an appropriate representation for a clique, an adjacencymatrix or adjacency lists? Why?7. Dr Victor Bogon, U Kansas at Oceanview, studies the bacterium Ridiculous maximus. Incoding sequences in this pathogen, the probability of an accepted mutation in the thirdposition of a codon is 0.01 per generation. The probability of an accepted mutation inthe first and second positions is 0.005 per generation. Accepted mutations in non-codingsequences occur at a rate of one per twenty base pairs per generation.(a) After one generation, exactly one mutation is observed in a coding sequence of 30 nu-cleotides. It occured in the third position of a codon. What is the probability of thisevent?Suppose that exactly one mutation is observed and it did not occur in a third position.Then, what is the probability of this event?What is the probability of observing exactly one mutation at any position in a sequenceof 30 basepairs in a coding region?03-511/711 Computational Genomics and Molecular Biology, Fall 2002 5(b) What is the expected number of generations before a mutation is accepted at a particularposition in the non-coding sequence of R. maximus?(c) Give an expression for the probability that a non-coding region of length r will changeat k positions in one generation.8. You work for a popsicle manufacturer and are asked to design and implement software tooptimize popsicle storage and delivery schedules. For each of the following scenarios, which ofthe following solutions should you consider? (It is possible that more than one is appropriate.)(i) Buy a faster computer.(ii) Develop a faster exact algorithm.(iii) Use a heuristic that runs in polynomial time and gives good, but suboptimal solutionson test data.(a) Your boss asks you to write a program to determine a strategy for storing popsiclesthat will minimize the amount of warehouse space required. After a little thought youare able to show that the popsicle packing problem is NP-complete. Your fastest exactpopsicle packing program takes a week to run on a fast desktop machine. In this era ofglobal warming, demand for popsicles is projected to double each year for the foreseeablefuture. Which of the above solutions should you consider and why?03-511/711 Computational Genomics and Molecular Biology, Fall 2002 6(b) Impressed by your performance, your boss asks you to tackle ordering supplies for theanchovy popsicle production line. You need to write a program to determine when toorder anchovies for the factory. It turns out that the anchovy ordering problem is alsoNP-complete. Your fastest anchovy ordering program runs overnight on a fast desktopmachine. Demand for anchovy popsicles is moderate and is not expected to increase.Which of the above solutions should you consider and why?(c) Elementary school students in the southwest have developed a craving for durian popsi-cles. With an optimal delivery schedule, it is just possible to deliver the durian popsiclesto the schools before they melt. In the face of the increased relocation of families withallergies to the southwest, demand for durian popsicles is expected to increase dramat-ically. The current popsicle delivery scheduler runs takes days to run, but the problemis not NP-complete. Which of the above
View Full Document