DOC PREVIEW
MIT 6 006 - Problem Set 1

This preview shows page 1-2 out of 6 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 6 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Introduction to Algorithms: 6.006Massachusetts Institute of Technology February 3rd, 2009Professors Sivan Toledo and Alan Edelman Handout 2Problem Set 1This problem set is divided into two parts: Part A problems are programming tasks, andPart B problems are theory questions.Part A questions are due Tuesday, February 17th at 11:59PM.Part B questions are due Thursday, February 19th at 11:59PM.Solutions should be turned in through the course website in PDF form using LATEX orscanned handwritten solutions.A template for writing up solutions in LATEX is available on the course website.Remember, your goal is to communicate. Full credit will be given only to the correctsolution which is described clearly. Convoluted and obtuse descriptions might receivelow marks, even when they are correct. Also, aim for concise solutions, as it will save youtime spent on write-ups, and also help you conceptualize the key idea of the problem.Part A: Due Tuesday, February 17th1. (20 points) Running TimeVersion 6 of our Document Distance code uses an algorithm called merge sort toimprove upon the Θ(n2) running time of insertion sort. (We’ll talk more about mergesort and other sorting algorithms in a few weeks.)You can find an implementation of merge sort on page 3 of this document.(a) (10 points) Determine experimentally the running time of mergesort, by run-ning it on different-sized lists of random numbers. (After you import random,you can get a random floating-point number using the random.random()function.)Fill in the following chart. Include in your PDF submission a snippet of codethat determines one of the entries in the chart.|s| = 102|s| = 103|s| = 104|s| = 105time in msThere are a number of ways to time code. You can use the timeit module1Al-ternatively, if you have ipython installed,2you can use the more user-friendlybuiltin %timeit command. Make sure you check what the default number ofiterations for your timing command is! By default, Timer.timeit() in the1See http://www.diveintopython.org/performance_tuning/timeit.html for a good de-scription of how to use the timeit module.2See http://ipython.scipy.org/. Highly recommended.2 Handout 2: Problem Set 1timeit module runs your command one million times. You should change it tobe high enough so your results don’t have too much variance, but low enoughthat you don’t die of boredom.(b) (10 points) Give an approximate formula for asymptotic running time of mergesort based on your experiments. Justify your answer by dividing your numbersfrom the chart above by the formula, and showing that the result is approxi-mately constant.2. (30 points) Set IntersectionPython has a built in set data structure. A set is a collection of elements withoutrepetition.In an interactive Python session, type the following to create an empty set:s = set()We can also create a set from a list:l = [1, 2, 3]s = set(l)To find out what operations are available on sets, type:dir(s)Some fundamental operations of sets include add, remove, and containsand len . Note that the contains and len methods are morecommonly called with the syntax element in s and len(s). (While you can uses. contains (element) and it will work just fine, it’s not very nice-looking,and people will look at you funny if you write it that way.) All four of these opera-tions run in constant time i.e. O(1) time.s.intersection(t) that takes two sets, s and t, and returns a new set containingall the elements that occur in both s and t. We will use intersection in a newversion of the Document Distance code from the first two lectures.In the Document Distance problem from the first two lectures, we compared twodocuments by counting the words in each, treating these counts as vectors, andcomputing the angle between these two vectors. For this problem, we will changethe Document Distance code to use a new metric. Now, we will only care aboutwords that show up in both documents, and we will ignore the contributions ofwords that only show up in one document.Download ps1.py, docdist7.py, and test-ps1.py from the class website.docdist7.py is mostly the same as docdist6.py seen in class, however it doesnot implement vector angle or inner product; instead, it imports those func-tions from ps1.py. Currently, ps1.py contains code copied from docdist6.py,but you will need to modify this code to implement the new metric.Handout 2: Problem Set 1 3Merge sort code for Part A, Problem 1:def merge_sort(A):"""Sort list A into order, and return result."""n = len(A)if n==1:return Amid = n//2 # floor divisionL = merge_sort(A[:mid])R = merge_sort(A[mid:])return merge(L,R)def merge(L,R):"""Given two sorted sequences L and R, return their merge."""i = 0j = 0answer = []while i<len(L) and j<len(R):if L[i]<R[j]:answer.append(L[i])i += 1else:answer.append(R[j])j += 1if i<len(L):answer.extend(L[i:])if j<len(R):answer.extend(R[j:])return answer4 Handout 2: Problem Set 1(a) (15 points) Modify inner product to take a third argument, domain, whichwill be a set containing the words that occur in both texts. Modify the code sothat it only increases sum if the word is in domain.(b) (15 points) Modify vector angle so that it creates sets of the words in bothL1 and L2, takes their intersection, and uses that intersection when callinginner product.(Hint: You can probably make the needed changes in under a dozen lines of code.)Run test-ps1.py to make sure your modified code works. The same test suitewill be run when you submit ps1.py to the class website.Your code should not take significantly longer to run with the new metric. (Whynot?)Submit ps1.py on the class website. All code submitted for this class will bechecked for accuracy, asymptotic efficiency, and clarity.Part B: Due Thursday, February 19th1. (25 points) Asymptotic GrowthFor each group of six functions below, rank the functions by increasing order ofgrowth; that is, find an arrangement g1, g2, . . . , g6of the functions satisfying g1=O(g2), g2= O(g3), . . . , g5= O(g6). Partition each list into equivalence classes suchthat f(n) and g(n) are in the same class if and only if f(n) = Θ(g(n)).(a) (8 points) Group 1:f1(n) = 106006n2, f2(n) = 1/n, f3(n) = n6.006f4(n) = n, f5(n) = 1/6.006, f6(n) = 6.006n(b) (9 points) Group 2:f1(n) = n log n, f2(n) = log(log(n)), f3(n) =n50f4(n) = n1/50, f5(n) = log(n), f6(n) = n50(c) (8 points) Group 3:f1(n) = 2n, f2(n) = n2n, f3(n) = 2n+1f4(n) = 4n, f5(n) = n!, f6(n) = 3√n2. (25 points) Binary SearchIn Problem Solving With Algorithms And Data


View Full Document

MIT 6 006 - Problem Set 1

Documents in this Course
Quiz 1

Quiz 1

7 pages

Quiz 2

Quiz 2

12 pages

Quiz 2

Quiz 2

9 pages

Quiz 1

Quiz 1

10 pages

Quiz 2

Quiz 2

11 pages

Quiz 1

Quiz 1

12 pages

Graphs

Graphs

27 pages

Load more
Download Problem Set 1
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Problem Set 1 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Problem Set 1 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?