DOC PREVIEW
MIT 6 006 - Problem Set 2

This preview shows page 1 out of 3 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Introduction to Algorithms: 6.006Massachusetts Institute of Technology September 25, 2007Professors Ron Rivest and Srini Devadas Handout 7Problem Set 2This problem set is due October 11 at 11:59PM.Solutions should be turned in through the course website in PS or PDF form using LaTeX. Thecourse website has links to a number of editors that are useful for writing in LaTeX.It is recommended that you download the LaTeX solution template for this problem set whichincludes placeholders for solutions.Exercises are for extra practice and should not be turned in.Exercises:1. CLRS 11.2-12. CLRS 11.2-23. CLRS 11.3-14. CLRS 11.3-31. Rotating Binary Search TreesIn this problem we’ll explore the rotation operation on binary search trees. As discussedin class, this operation changes the structure of a binary search tree without affecting theinorder of the underlying nodes. See CLRS section 13.2 (page 277) for a description of theoperation.(a) (6 points) Let l(v) denote the number of nodes in v’s left subtree. Let L(T ) be the sumof l(v) over all nodes of the tree T . Show that a right rotation decreases L(T ). Deducethat it is impossible to do more than O(n2) consecutive right rotations in an n nodetree, i.e., with NO left rotations mixed in.(b) (6 points) Show that on an n node left path (a tree where all children are left children)it is possible to do Ω(n2) consecutive right rotations. (Again, no left rotations allowed.)2. How Good is this Hash?The following hash functions are used to hash strings of characters. Describe at least onestrength and one weakness of each of the hash functions. (Don’t try to do a rigorous mathe-matical analysis of the properties of the functions.)(a) (6 points) Direct addressing, using the first two bytes of the hash key as an address intoa 65,536-entry hashtable. (Assume all keys are at least two bytes long.)2 Handout 7: Problem Set 2(b) (6 points) Add the values of all bytes in the hash key into a single-byte counter, ignoringoverflow. Use the result as an index into a 256-element hash table.(c) (6 points) Start with a fixed prime number constant p0and a second small number s.Set p := p0. For each byte biin the hash key, set p := p+(p << s)+bi, where p << smeans “shift p left by s bit positions”. Mod the final value of p by a prime number toproduce the hash code. (Just describe what could be good or bad about this schemedepending on the values of p0and s.)3. Longest Common SubstringHumans have 23 pairs of chromosomes, while other primates like chimpanzees have 24pairs. Biologists claim that human chromosome #2 is a fusion of two primate chromosomesthat they call 2a and 2b. We wish to verify this claim by locating long nucleotide chainsshared between the human and primate chromosomes.We define the longest common substring of two strings to be the longest contiguous stringthat is a substring of both strings e.g. the longest common substring of DEADBEEF andEA7BEEF is BEEF.1If there is a tie for longest common substring, we just want to find oneof them.(a) (2 points) Ben Bitdiddle wrote a python program to find the longest common substringof two strings. What is the asymptotic running time of his code? Assume |s| = |t| = n.def longest_substring(s, t):"Finds the longest substring that occurs in both s and t"best = ’’for s_start in range(0, len(s)+1):for s_end in range(s_start, len(s)+1):for t_start in range(0, len(t)+1):for t_end in range(t_start, len(t)+1):if(s[s_start:s_end] == t[t_start:t_end]):current = s[s_start:s_end]if(len(current) > len(best)):best = currentreturn best(b) (2 points) Describe a simple algorithm that finds the longest common substring inO(n3) time.(c) (2 points) Describe a simple algorithm that finds the longest common substring inO(n2log n) time.1Do not confuse this with the longest common subsequence, in which the characters do not need to be contiguous.The longest common subsequence of DEADBEEF and EA7BEEF is EABEEF.Handout 7: Problem Set 2 3(d) (12 points) Describe an algorithm that finds the longest common substring in O(n log n)time. If you use Rabin-Karp hashing, be sure to pick a specific hash function, describewhy it works well for this problem, and discuss how you will handle collisions.(e) (12 points) Implement your algorithm from part (d). Be sure to thoroughly commentyour code so the course staff can read it. Some simple test cases are available fordownload on the class website.The human chromosome 2 and the chimp chromosomes 2a and 2b are quite large (over100,000,000 nucleotides each) so we took the first and last million nucleotides of eachchromosome and put them in separate files. These files are available on the class web-site, and also in /mit/6.006/dna/chr2 first 1000000 contains the first million nucleotides of human chromosome2, and chr2a first 1000000 contains the first million nucleotides of chimpanzeechromosome 2a. Note: these files contain both uppercase and lowercase letters that areused by biologists to distinguish between parts of the chromosomes called introns andextrons.i. How long is the longest common substring ofchr2 first 1000000 and chr2a first 1000000?ii. How long is the longest common substring ofchr2 first 1000000 and chr2b first 1000000?iii. How long is the longest common substring ofchr2 last 1000000 and chr2a last 1000000?iv. How long is the longest common substring ofchr2 last 1000000 and chr2b last 1000000?When you are finished, submit your code through the class website.(f) Optional: Make your algorithm run in O(n log k) time, where k is the length of thelongest common substring.(g) Optional: Run your algorithm on the entire chromosomes. They are available in/mit/6.006/dna in compressed


View Full Document

MIT 6 006 - Problem Set 2

Documents in this Course
Quiz 1

Quiz 1

7 pages

Quiz 2

Quiz 2

12 pages

Quiz 2

Quiz 2

9 pages

Quiz 1

Quiz 1

10 pages

Quiz 2

Quiz 2

11 pages

Quiz 1

Quiz 1

12 pages

Graphs

Graphs

27 pages

Load more
Download Problem Set 2
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Problem Set 2 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Problem Set 2 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?