Factorization: Pollard's ρ\rho method

17 Dec 2021


§ Motivation

The RSA algorithm relies on the fact that finding the factorization of a number is very difficult. The value m=pqm = pq is given out as part of the public key in RSA; if one were able to compute p,qp, q quickly from mm, then they will be able to break the RSA algorithm

§ Pollard's ρ\rho Factorization Algorithm

Pollard's ρ\rho

Suppose we know mm is composite with unknown prime factors with size of roughly pp.

Then, let k=pk = \sqrt{p} and u0=1u_0 = 1.

We can compute ui+1=ui2+1(modm)u_{i+1} = u_i^2 + 1 \pmod{m} for all i[0k]i \in [0\dots k].

Then, starting with s=1s = 1 we compute

d=gcd(u2sus,m).d = \gcd(u_{2s}-u_s, m).

If d1d \neq 1, then dd is a non-trivial factor of mm. Otherwise, increment ss by 11 and repeat until a factor is found.

If we have exhausted our list of uiu_i's, i.e. 2s>k2s > k, then we can keep generating more values of uiu_i. The value of kk is a loose upper-bound on the problem.

This method relies on the idea that for a sufficiently large number of values uiu_i, one will likely have uiuj(modp)u_i \equiv u_j \pmod{p} with high probability that the values uj(modm)u_j \pmod{m} will be distinct.

There is no strict rule on how to choose values of uiu_i, however, it has been found that the aforementioned method works well in practice.

As for why the soft upper-bound of p\sqrt{p} works, we can refer to the following lemma.

Suppose 1km1 \leq k \leq m, and that the numbers u1,,uku_1, \dots, u_k are independently chosen from the set Nm\mathbb{N}_{\leq m}, then the probability that the values uku_k are distinct is

(11m)(12m)(1k1m).\left(1 - \frac{1}{m}\right)\left(1 - \frac{2}{m}\right)\dots\left(1 - \frac{k-1}{m}\right).

The number of all length kk sequences taken from nn values is s=mks = m^k. The number of sequences where no value is repeated is t=i=0k1(mi)t = \prod_{i=0}^{k-1}\left(m-i\right). Then dividing ss by tt yields the result we found.

If we let kk\to\infty, then we arrive at

Pr[uk distinct]ek2/m.\Pr[u_k \text{ distinct}] \sim e^{-k^2/m}.

Therefore, if we choose k>pk \gt \sqrt{p} and the values follow a uniformly random distribution, we can expect a collision in Zp\mathbb{Z}_p

As an aside, in the paper where Pollard gave this ρ\rho method, he also introduced his "kangaroo algorithm" for solving the discrete logarithm problem in any finite cyclic group. I just thought the name was amusing.