Unformatted text preview:

Massachusetts Institute of Technology Handout 36.857: Network and Computer Security Septemb e r 18, 2006Professor Ronald L. Rivest Due: October 2, 2006Problem Set 2This problem set is due, via email, to [email protected] by Monday, October 2 at thebeginning of class.You are to work on this problem set in groups of three or four people. Problems turned in by individuals,pairs, pentuples, etc. will not be accepted. Be sure that all group members can explain the solutions. SeeHandout 1 (Course Information) for our policy on collaboration. If you do not have a group, let us know.Homework must be submitted electronically! Each problem answer must appear on a separate page. Markthe top of each page with your group member names, the course number (6.857), the problem se t numberand question, and the date. We have provided templates for LATEX and Microsoft Word on the course website(see the Resources page).Grading and Late Policy: Each problem is worth 10 points. Late homework will not be acceptedwithout prior approval. Homework should not be submitted by email except with prior approval. (Somebodyfrom your group should be in class on the day that the homework is due.)With the authors’ permission, we will distribute our favorite solution to each problem as the “official”solution—this is your chance to become famous! If you do not wish for your homework to be used as anofficial solution, or if you wish that it only be used anonymously, please note this on your homework.Problem 2-1. Finding Hash Coll isions using Cycle DetectionLet h b e a hash function that maps some finite domain D onto itself (e.g., D is the domain describing allunique 128-bit binary values). Given initial element x0∈ D, consider the infinite sequence: x1= h(x0), x2=h(x1), x3= h(x2), ....Because D is finite, this sequence must eventually repeat a value. This will lead the sequence to subsequentlycycle through the same sub-se quence from then on. Such a cycle can be used to find a collision for h.Specifically, let xibe the first value encountered in the cycle. Let xjbe the last value in the cycle beforewe return to xi. It follows that h(xi−1) = h(xj) = xi. Therefore, if we can detect where a cycle begins in asuch a hash sequence we can identify a collision for the hash function.There exist quite a few algorithms for detecting these cycles without using a prohibitive amount of me mory.Consider the following, stack-based cycle detection algorithm, described by Nivasch at http://yucs.org/~gnivasch/stackalg/ (under the heading The basic stack algorithm):[The] basic algorithm is as follows: Keep a stack of pairs (xi, i), where, at all times, both thei’s and the xi’s in the stack form strictly increasing sequences from bottom to top. The stack isinitially empty. At each step j, pop from the stack all entries (xi, i) where xi> xj. If an xi= xjis found in the stack, we are done; then the cycle length is equal to j − i. Otherwise, push (xj, j)on top of the stack and continue.It is not too difficult to show that the size of this stack is bounded by O(log |D|), with high probability.This is a big improvement over an approach such as the standard birthday attack, which would require, onaverage, for the attacker to store up to 2|D|2values.Your Task In this problem, we ask you to implement this stack-based cycle-detection algorithm using thePython programming language. We then ask you to use it to find SHA-256 collisions. Specifically:(a) Implement Nivasch’s cycle-detection algorithm as described above. Notice, this algorithm detects acycle and calculates its length, but it does not determine where the cycle begins—which is necessaryto identify a specific collision. You must add a second phase to the algorithm that makes use of the2 6.857 : Handout 3: Problem Set 2cycle detection and length information reported from Nivach’s algorithm to find the beginning of thecycle and subsequently print out the colliding values.We expect you to implement the algorithm using Python. See the Resources page of the course website for a Python tutorial as well as some sample code to help you implement stacks and calculateSHA-256 values using the language. We expect your algorithm to work for the kthtruncated SHA-256hash function, h(k)(x), where we define h(k)(x) simply as the first k bits of SHA256(x). For example,h(“I hash therefore I am”) = ef57b2301c4539bdfe24bf96f14a91e215cc827b, therefore h(20)(“I hashtherefore I am”) = ef57b (each hex symbol is 4 bits). As mentioned, the SHA-256 function is readilyavailable using Python’s hashlib module. Reference the sample code on the Resources page to learnhow to use this built-in function—don’t try to implement SHA-256 on your own! Note: To useSHA-256 you need to be running Python version 2.5 (the first verison to include the hashlib library).Python 2.5 is easily installed on Windows, Mac, and linux machines—see the Resources page for theappropriate link.(b) Pick an arbitrary initial value and use your algorithm to discover kthtruncated SHA-256 collisionsfor various values of k. At least one of your k values should be large enough that a birthday attackwould have been unreasonable (say, k ≥ 24).(c) Turn in a concise explanation of the second-phase of your algorithm (i.e., how it finds where the cyclebegins), your source code, and your largest collision (i.e., the two colliding values for the largest valueof k you tried, as well as k itself). Describe the colliding values as hex strings in your write-up. Ifpossible, report how long it took you to find the collision, and describe the key specs of the machineyou ran your algorithm on.Problem 2-2. Security of Hash FunctionsNed Nerdle understands that if a hash function has n bits of output and also n bits of internal state (e.g. asdoes MD5 and SHA-1, for n = 128 and n = 160, respectively), then the time to find a collision is at most2n/2, and the time to invert is at most 2n.Ned decides to double the internal state size and output size to 2n bits, as follows, hoping to improvecollision-resistance to work 2nand the time to invert to 22n. His idea is keep the same compression functionf(s, m), but to change its usage. Here s is the n-bit input state variable, m is a 512-bit message block input,and f(s, m) is an n-bit compression function output.The state variable is now a pair (x, y) of n-bit variables. The initial state is c alled (x0, y0). The state afterprocessing k message blocks is called (xk, yk).


View Full Document

MIT 6 857 - Problem Set #2

Documents in this Course
Load more
Download Problem Set #2
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Problem Set #2 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Problem Set #2 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?