Motivation§

The RSA algorithm relies on the fact that finding the factorization of a number is very difficult. The value $m = pq$ is given out as part of the public key in RSA; if one were able to compute $p, q$ quickly from $m$, then they will be able to break the RSA algorithm


Pollard’s $\rho$ Factorization Algorithm§

Pollard’s $\rho$

Suppose we know $m$ is composite with unknown prime factors with size of roughly $p$.

Then, let $k = \sqrt{p}$ and $u_0 = 1$.

We can compute $u_{i+1} = u_i^2 + 1 \pmod{m}$ for all $i \in [0\dots k]$.

Then, starting with $s = 1$ we compute

$$ d = \gcd(u_{2s}-u_s, m). $$

If $d \neq 1$, then $d$ is a non-trivial factor of $m$. Otherwise, increment $s$ by $1$ and repeat until a factor is found.

If we have exhausted our list of $u_i$’s, i.e. $2s > k$, then we can keep generating more values of $u_i$. The value of $k$ is a loose upper-bound on the problem.

This method relies on the idea that for a sufficiently large number of values $u_i$, one will likely have $u_i \equiv u_j \pmod{p}$ with high probability that the values $u_j \pmod{m}$ will be distinct.

There is no strict rule on how to choose values of $u_i$, however, it has been found that the aforementioned method works well in practice.

As for why the soft upper-bound of $\sqrt{p}$ works, we can refer to the following lemma.

Lemma 18.1

Suppose $1 \leq k \leq m$, and that the numbers $u_1, \dots, u_k$ are independently chosen from the set $\mathbb{N}_{\leq m}$, then the probability that the values $u_k$ are distinct is

$$ \left(1 - \frac{1}{m}\right)\left(1 - \frac{2}{m}\right)\dots\left(1 - \frac{k-1}{m}\right). $$

The number of all length $k$ sequences taken from $n$ values is $s = m^k$. The number of sequences where no value is repeated is $t = \prod_{i=0}^{k-1}\left(m-i\right)$. Then dividing $s$ by $t$ yields the result we found.

If we let $k\to\infty$, then we arrive at

$$ \Pr[u_k \text{ distinct}] \sim e^{-k^2/m}. $$

Therefore, if we choose $k \gt \sqrt{p}$ and the values follow a uniformly random distribution, we can expect a collision in $\mathbb{Z}_p$

As an aside, in the paper where Pollard gave this $\rho$ method, he also introduced his “kangaroo algorithm” for solving the discrete logarithm problem in any finite cyclic group. I just thought the name was amusing.