Pollard’s rho algorithm

Pollard’s rho algorithm

Imagine that you are given a number ‘n’ and asked to find its factors. You know that this number is definitely not prime, so how do you find a divisor. You could go and try every single number that is smaller or equal to ‘n’ until you find a divisor. This is a very inefficient method to complete the task, and is impractical for very large composite numbers. Instead we have an algorithm that can do this in a much more efficient method – Pollard’s rho algorithm. In this article I will explain how this algorithm works, and more interestingly why it works.

First let us understand exactly what the algorithm does and what it can do. It takes in a composite number ‘n’ and it will return a single divisor of the number. The fact that it can only find one divisor may seem like a disadvantage at first, but we must understand that his means that this makes the algorithm particularly fast for large numbers with small factors, and we can find the complimentary divisor by using simple division. Of course the algorithm can also be used to find prime factors, but this will need some additional work, as we will have to keep the algorithm running until one of the divisors we have found is a prime number. One of the great achievements that Pollard’s rho algorithm is known for is the prime factorization of the eight Fermat number (2256 + 1), which resulted in:

1238926361552897*93461639715357977769163558199606896584051237541638188580280321.

The algorithm it not very complex in terms of it’s functionality, although the concepts behind why it works are complex and this is what I will explain a bit later on. Right now I would like to show an example of this algorithm in practice just so you get a feel for it:

Example: Let us assume that ‘n’ = 187 , and ‘d’ = 1. We shall also set ‘x’ = ‘y’ = 2, and ‘c’ = 1. This means that f(x) = x2 + 1 (mod 187)

Now let us begin the algorithm:

1. ‘x’ = f(x) = f(2) = 2x2 + 1 (mod 187) = 5

2. ‘y’ = f(f(y)) = f(f(2)) = f(5) = 5x5 + 1 (mod 187) = 26

3. ‘d’ = gcd(|5-26|, 187) = gcd(21, 187) = 1

4. Since ‘d’ = 1 we have to do another iteration

5. ‘x’ = f(x) = f(5) = 26

6. ‘y’ = f(f(y)) = f(f(26)) = 180

7. ‘d’ = gcd(|26=180|, 187) = gcd(154, 187) = 11

8. Since ‘d’ = 11 we have found a factor of 187.

We found that a factor of 187 is 11. To find the other factor we can simply divide 187 by 11, so 187/11 = 17. This means that we can find both factors by finding the smaller one.

Why it works:

This is where it gets really interesting. The proof behind this algorithm involves a lot of different concepts, and I will list them all below with a short explanation:

  1. The first concept is that of a greatest common divisor which is the biggest number that is a divisor of all the numbers that we are given

  1. Next is the fact that ‘x’ = ‘y’ (mod n) if, their absolute difference is a multiple of n, or each of them leaves the same remainder when divided by n. We say this by saying ‘x’ and ‘y’ and congruent modulo ‘n’

  2.  Third is the birthday paradox. This is a fairly well known paradox. It says that we only need 23 people so that the probability that at least 2 of them have a birthday have a birthday on the same ay exceeds 50%. To the right is one method to get this value 

 4. Last we have Floyd’s cycle finding algorithm. This basically says that if we have a tortoise an hare that start at the same point in a cycle, and the speed of the hare is twice that of the tortoise then they will definitely meet at some point.

Now we will discuss why the algorithm works. First let us consider the x terms that we get from our formula f(x) = x2 +c, as a sequence of numbers and then we can visualise it the sequence as dots, where the next dot is found by applying the function to the current dot. We get a peculiar pattern if we do this as shown below:

This is a random example and we have just plotted out all the possible values that ‘x’ or ‘y’ could take for this specific scenario. This figure is the same for all such diagrams made with the Pollard’s rho algorithm. As a side note this shape looks like rho, hence the algorithm is called Pollards rho algorithm.

We now have a sequence that we shall denote by {xi},
and now we shall make a new related sequence from this one where we mod every term by the factor ‘d’: {xi  mod d}, both sequences we know are going to repeat at some point. It is important to note that both sequences are pseudorandom sequences. So we can essentially treat the number like random numbers and sue the birthday paradox here that implies that the number of xi  terms before a repetition is O(√N) where N is the number of possible values. This can also be applied for our other sequence which we can see will repeat much earlier as ‘d’ < ‘n’, so there are less possible values for the mod sequence compared to the normal one. When we ae going through our mod sequence and find two such values ‘x’ = ‘y’ (mod d), then we know that the number |x- y| is a multiple of ‘d’, so form this we can conclude that we have found ‘d’. Hence our stop condition is gcd(|x-y|,n) != 1, as if it is not 1 then we can say that there is a repetition in the mod sequence, as ‘x’ and ‘y’ are congruent modulo ‘d’, so there difference is definitely a multiple of ‘d’. We can also say this will eventually happen using Floyd’s cycle finding algorithm, as ‘x’ only goes through the sequence on term at a time, whereas ‘y’ goes through the sequence two terms at a time, so its ‘speed’ is twice of x. So finally we can say that eventually the greatest common divisor is a divisor of n other than 1.

I hope that this was an interesting article on a useful algorithm, that you maybe did not know exist. Polland’s rho has a very extensive proof that I have tried my best to condense here, but I suggest that you go out and look at other sources to truly appreciate the ingenuity of this algorithm. Many of you may think that such algorithms are a waste of time compared to other more practical such as Bellman-Ford, or Dijkstra, however such algorithms provide an elegant bridge between maths and computer science, which help us solve problems that we could not otherwise solve such as the discrete algorithm problem, and such advancements will give us new ways to approach many problems in society that we have today that could possibly be solved through our increasing computational ability.