Math 55: Discrete Mathematics, Fall 2008
Supplementary notes on Algorithms for Factoring

Introduction

The RSA public-key cryptography system relies on the idea that while it is easy to test whether a large number is prime, and therefore easy to construct a number n that is the product of two large primes n = pq, there is no known way to efficiently find the factors of such a number, and thus break the encryption scheme, if n is large.

What constitutes "large"? It is quite practical to construct a product n = pq where the primes p and q have 200 digits each, so n is a 400-digit number. It is also practical to perform RSA encryption and decryption with such a large number n. The computations involved take fractions of a second on current computers.

On the other hand, the current record for the largest number not of a special form factored by a general-purpose factoring algorithm is 200 digits. This record was achieved by German researchers Bahr, Boehm, Franke, and Kleinjung in 2005, using a cluster of 80 computers running for over a year. Because the running times of the known algorithms are roughly exponential in the square root of the number of digits, factoring a 400-digit number is currently many orders of magnitude more difficult than factoring the record 200-digit number.

In these notes we will discuss Pollard's algorithm, a simple factoring algorithm based on the Chinese remainder theorem, which usually takes a number of steps roughly proportional to the square root of the smallest factor to be found. In particular, when the number to be factored is a product n = pq of two nearly equal primes, the number of steps required is typically proportional to the fourth root of n. Pollard's algorithm is thus much better than trial division, which takes √n steps. Using it on a current workstation computer, one can typically factor a number of 20 digits or so in a few seconds. The algorithm is also practical to use for hand computation, with the aid of a calculator, to factor numbers of about 6 digits.

Pollard's algorithm

Let N denote the number we want to factor.

We can first eliminate the possibility that N is prime using one of the standard primality tests, such as Miller's test.

We can also eliminate the case that N is a power of a prime by just checking, for each possible power, whether N = p^2, whether N = p^3, and so forth. This need only be done for a small number of possible powers (proportional to the logarithm of N); a short sketch of such a check appears below.
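
The notes do not spell out how to carry out this check, so here is one possible way, sketched in Python: for each exponent k up to log2(N), binary-search for an exact integer k-th root. The function prime_power_root and its details are illustrative choices, not part of the notes.

    from math import isqrt

    def prime_power_root(N):
        # Return b if N = b**k for some k >= 2 (so N is a perfect power),
        # and None otherwise.  Only exponents k up to log2(N) need be tried.
        k = 2
        while (1 << k) <= N:                  # i.e. k <= log2(N)
            lo, hi = 2, isqrt(N)              # for k >= 2 the k-th root is at most sqrt(N)
            while lo <= hi:                   # binary search for an integer k-th root
                mid = (lo + hi) // 2
                m = mid ** k
                if m == N:
                    return mid
                elif m < N:
                    lo = mid + 1
                else:
                    hi = mid - 1
            k += 1
        return None

If this returns some b, then N = b^k for some k ≥ 2, so either b is the prime whose power N is, or b is itself a proper factor of N; either way this case is disposed of.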

After these preliminary steps, we can now assume that N is composite and can be factored as a product of two relatively prime integers, N = pq. Without loss of generality we can and will assume that p is the smaller of the two factors.

The basic principle behind Pollard's algorithm (and also behind some of the more sophisticated algorithms) is that when we perform any sequence of arithmetic operations on remainders (mod N), the Chinese Remainder Theorem implies that this "secretly" has the effect of performing the same operations (mod p) and (mod q) separately, even though we do not yet know the factors p and q. Our plan will be to start with a random remainder x_0 (mod N) and from it compute a sequence of additional remainders x_k (mod N) in such a way that there is a high likelihood that after about √p steps, one of these remainders will, by chance, be divisible by p. At each step we will also compute gcd(x_k, N). As soon as x_k is divisible by p, then, unless x_k is zero, our gcd computation will discover the factor p of N.

The other key ingredient in Pollard's algorithm is the following. Suppose P is a finite set with p elements and f : P → P is a random function from P to itself. Starting with any element y_0 ∈ P, we can compute a sequence y_1 = f(y_0), y_2 = f(y_1), and so on. Since P is finite, this sequence must eventually repeat, with some y_j = y_k. In fact, it can be shown that for a randomly chosen function f, it is highly likely that the first k for which y_k repeats a prior value will be roughly proportional to √p (for now, we will take this fact for granted, but we may justify it later when we discuss probability theory).

An obvious but inefficient way to recognize when the sequence (y_i) repeats is to store all values and compare each new one to the previous ones. However, there is a cleverer way: along with the y_i's, construct another sequence z_0 = y_0, z_1 = f(f(z_0)) = y_2, z_2 = f(f(z_1)) = y_4, ..., which goes "two hops at a time" through our original sequence (y_i). At each step we compare y_k with z_k = y_{2k}. This works because once the sequence (y_i) has entered a loop, of length m let's say, we will have y_k = y_{2k} as soon as y_k lies inside the loop and m divides 2k − k = k.

In Pollard's algorithm, we assume that the function f(x) = x^2 + a, where a is a randomly chosen remainder, is a sufficiently random function that the sequence it generates is likely to repeat after about √p steps (it has not been proven that such a simple function must work well, but in practice people have observed that it seems to).

The algorithm has the following steps.

0. Eliminate the cases where N is prime or a power of a prime by checking for this separately.

1. Choose random remainders a and y_0 (mod N); set z_0 = y_0.

2. For i = 1, 2, ..., compute

      y_i = f(y_{i-1});
      z_i = f(f(z_{i-1}));
      x_i = y_i − z_i;
      d_i = gcd(x_i, N);

   where f(x) = x^2 + a. If x_i ≢ 0 (mod N) and d_i ≠ 1, then d_i is a non-trivial factor of N, so stop and return d_i. If x_i ≡ 0 (mod N), it means the sequence (y_i) has gone into a loop modulo N without first finding a factor of N [this might happen, although it is unlikely]. In that case, stop and return failure.

Let's notice a few things about the above algorithm. First of all, for clarity, I have labelled the values computed at each step with a subscript i, but at step i the algorithm only needs the values y_{i−1} and z_{i−1}. Therefore we need not store all the preceding values; instead we can recycle a single pair of variables y and z.

Another thing to notice is that the algorithm can fail. If this happens, we can just run it again with new random starting values. It is extremely unlikely to fail repeatedly when run again and again.

A final thing to notice is that we did not set any limit in the algorithm on how many steps i to take. It cannot run forever, since the y_i's must eventually repeat modulo N, but at worst it might run for N steps (and then fail, besides). A desirable improvement over the algorithm described above would be to set a limit in advance of something like 100·N^(1/4) on the number of steps to be computed. Most of the time, the algorithm should find a factor before reaching this limit.
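
The steps above translate almost directly into code. The following Python sketch uses the notation of the notes (y, z, x, d and f(x) = x^2 + a), recycles a single pair of variables y and z as the first remark suggests, and adds the step limit from the last remark as an optional parameter max_steps. It assumes step 0 has already been carried out; it is only a sketch of the method described above, not a tuned implementation.

    from math import gcd
    from random import randrange

    def pollard(N, max_steps=None):
        # One run of steps 1-2 of Pollard's algorithm.  Returns a
        # non-trivial factor of N, or None on failure (the sequence
        # looped modulo N first, or the optional step limit was hit).
        # Assumes step 0 is done: N is composite and not a prime power.
        a = randrange(1, N)          # random remainder a defining f(x) = x^2 + a
        y = z = randrange(0, N)      # y_0 = z_0 = a random starting remainder

        def f(x):
            return (x * x + a) % N

        i = 0
        while max_steps is None or i < max_steps:
            i += 1
            y = f(y)                 # y_i = f(y_{i-1}): one hop along the sequence
            z = f(f(z))              # z_i = f(f(z_{i-1})), so z_i = y_{2i}: two hops
            x = (y - z) % N          # x_i = y_i - z_i (mod N)
            d = gcd(x, N)            # d_i = gcd(x_i, N)
            if x == 0:               # the y_i's looped mod N before a factor appeared
                return None          # failure; rerun with new random values
            if d != 1:               # here 0 < x < N, so 1 < d < N: a non-trivial factor
                return d
        return None                  # step limit reached without finding a factor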


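
Finally, as a usage illustration of the remarks above, one can wrap the pollard routine sketched earlier in a loop that retries with fresh random values on failure and caps each run at roughly 100·N^(1/4) steps. The cap, the retry count, and the example number below are illustrative choices, not prescribed by the notes.

    from math import isqrt

    def factor_with_retries(N, tries=20):
        # Retry Pollard's algorithm with fresh random values on failure,
        # capping each run at roughly 100 * N**(1/4) steps.
        limit = 100 * isqrt(isqrt(N))
        for _ in range(tries):
            d = pollard(N, max_steps=limit)
            if d is not None:
                return d
        return None

    # Example: a 19-digit product of two well-known 10-digit primes.
    N = 1000000007 * 1000000009
    print(factor_with_retries(N))    # typically prints 1000000007 or 1000000009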