The Euclidean algorithm, the algorithm to compute the greatest common divisor, or , of two numbers, dates back to antiquity (obviously). We can use it to make a fast test for primality, at least up to some confidence—and that’s where the inclusion-exclusion principle comes in.

Let us begin by the Euclidean algorithm. Originally, the algorithm was done by successive subtraction (and becomes quickly tedious as numbers grow), but the modern version, at least the one that I know of, uses divisions and remainders (modulo), and computes the of two numbers and in (counting division as an operation), which is *very* fast.

In C/C++, a minimal implementation of Euclid’s algorithm looks like:

unsigned gcd(unsigned a, unsigned b) { while (b) { unsigned t=b; b=a % b; a=t; } return a; }

It takes a while to figure out how Euclid’s algorithm works if we don’t approach it from the right angle. Maybe the easiest way of understanding it is recursively. If we have , then we also have , because whatever unknown factor (possibly 1) divides both and must also divide .

How do we use that for primality testing? Using we can only test whether numbers and share factors. If , then and are composite and share a factor. If they don’t, , but we can’t conclude any is prime or composite. They can be both composite but not sharing factors, like and . They need further testing.

We can use this to devise a Las Vegas algorithm. A Las Vegas algorithm is an algorithm that either answers “Yes, for sure” (with a proof) to a question, or “I don’t know”. We want an algorithm that answers “Yes, it’s composite” correctly the most often as possible, and answers “I don’t know” as rarely as possible (because further testing is more expensive). The key to get such an algorithm is to choose a good to test all s.

But how? Well, has to be a number with the greatest possible number of prime factors. In order words, we must chose , and go as far as we can in the series of prime numbers as `unsigned` (or any other number) we can go. With a 32-bits wide `unsigned`, we have . Larger than this, it overflows (adding , we arrive at , larger than ).

With defined, we have a Las Vegas type primality/composite testing algorithm.

*

* *

What about the confidence in the algorithm given by ? That’s where the inclusion-exclusion principle comes in. If we have only , we clearly have only half of the integer that test correctly—the even ones—and the other half that test inconclusively. If we have , we do not have of the numbers that test correctly, but only . Why ? We counted the numbers that are evenly divided by , evenly divided by , but counted the numbers evenly divided by twice, we have to remove them once.

For three factors, we already have

.

The method generalizes for any number of factors, but it gets very messy very rapidly.

From the first few (, , ) we have that , , of the numbers tests rapidly as composite. But the last two seem to indicate that the number of numbers that tests correctly does not increase very much when we add more factors. Let’s see:

factors | Portion tested correctly |

1 | 0.5 |

2 | 0.666667 |

3 | 0.733333 |

4 | 0.771429 |

5 | 0.792208 |

6 | 0.808192 |

7 | 0.819475 |

8 | 0.828976 |

9 | 0.836412 |

Graphically:

And as we add more and more factors, the accuracy grows (but never reaches one, as to reach one, we should have all primes—impossible, since there’s an infinity of them).

*

* *

So a Las Vegas algorithm to test for primality using this method is bound by the number of primes factors one can manipulate. Using 32 bits integers, we can only get as far as 9 factors before overflowing the product. With 64 bits, we could go much further, but the question is really how does computing compares with just successively testing with , then , then … ?

Let’s find out next time!

Reblogged this on Khuram Ali.

[…] Last time we had a look at using the greatest common divisor and Euclid’s algorithm to devise a Las Vegas algorithm for primality testing. We also had a look at how the inclusion exclusion principle helps us determine the proportion of the numbers correctly tested. […]

[…] in previous installments, we looked at how to use the euclidean algorithm to devise a Las Vegas-type algorithm […]