Euclid, Primes numbers, and a Bit of Algorithm Analysis

Last time we had a look at using the greatest common divisor and Euclid’s algorithm to devise a Las Vegas algorithm for primality testing. We also had a look at how the inclusion exclusion principle helps us determine the proportion of the numbers correctly tested.

turbo-napkin

However, we finished by asking ourselves if the method is actually efficient compared to, say, simply testing small divisors, one by one. Let us now find out.

A simple way of testing primality against a list of prime numbers is
implemented as follows:

bool alternate_test(unsigned a,const unsigned test_numbers[])
 {
  for (int i=0;test_numbers[i];i++)
   if ((a % test_numbers[i])==0)
    return false; // indubitable
  return true; // probable.
 }

where test_numbers is a null-terminated list of numbers. The procedures tests for the first, and exits if it is divisible by that number. If not, it tests the next one; and so on until 0 is reached, in which case we answer “probably, I don’t know for sure.” (Digression. In the past, I would have insisted for the function to have a single exit point, and would have formulated the loop as dual conditional, with a boolean found or something, and return found; at the end. However, the above violates the single exit point rule, but it is much more readable and its intend is made clearer: you return true only by default, and return false as soon as a match is found.) It is readily verified that the above function returns the same answers as the GCD-based test.

How do they compare speed-wise? Let us repeat last time’s test, where we use the list of factors up to 23 (otherwise it will overflow on 32 bits), and test the complete range from 0 to 2^{32}-1. The processor is first set at “performance” policy where Speed-Stepping is disabled and the processor runs full blast at about 4.3GHz (with turbo boost). So when we used GCD with 2, we use alternate_test (“loop” in the plot below) with {2,0}, and when we use GCD with 2\times{3}\times{5}\times{7} we use the other with {2,3,5,7,0}. The timings are as follows:

performance

Wow.

OK, what’s going on?

Well, let’s analyze. The GCD test uses progressively larger numbers to test against: 2, 6, 30, 210, 2310, 30030, 510510, 9699690, 223092870. If we hypothesize that the CPU does short-cut math depending on the size of the operands, we may suppose that computing % 210 is much faster than computing % 510510. That could account partly for the increase of complexity; while, in contrast, the alternate method uses 23 as the largest test number. But that’s not the major effect, I surmise. The alternate test exits after one test 50% of the times (the even numbers). It exits at the second test one third of one half of the times (multiples of 3 but not 2). The expected complexity is

C=\displaystyle  1+  (1-\frac{1}{2})  \left(1+(1-\frac{1}{3})  \left(1+(1-\frac{1}{5})  \left(1+(1-\frac{1}{7})  \left(1+ \cdots\right)\right)\right)\right),

and this series yields rather small numbers for factors up to 23. (How did I get this? By unrolling how the alternate algorithm proceeds. The algorithm does one iteration in all cases, with one factor. With two factors, it checks the first, and there’s a chance that it doesn’t stop there because the test failed; it continues another iteration. The probability that it continues after testing 2 is (1-\frac{1}{2}), because half of the numbers caused the algorithm to stop at the first iteration. Of the (1-\frac{1}{2}) remaining, an iteration cost of at least 1 is incurred (testing the next factor) and there’s a probability (compound, that’s why it’s a product form) of (1-\frac{1}{3}) that 3 did not divide it; yielding yet another iteration with probability (1-\frac{1}{2})(1-\frac{1}{3}) incurring cost of at least 1 to test for five, with the probability of (1-\frac{1}{5}) that it continues. You are encouraged to verify it does yield the above expression.) For the GCD method, I expect something like 1+\frac{1}{2}\log_2 d, where d is the divisor (the complexity of the GCD of two numbers a and b is O(\log(\min(a,b))).

Instrumenting the code allows us to verify this hypothesis:

unsigned gcd(unsigned a, unsigned b, uint64_t & count)
 {
  count=0;
  while (b)
   {
    count++;
    unsigned t=b;
    b=a % b;
    a=t;
   }

  return a;
 }

////////////////////////////////////////
bool alternative_test(unsigned a,
                      const unsigned test_numbers[],
                      uint64_t & count)
 {
  count=0;
  for (int i=0;test_numbers[i];i++)
   {
    count++;
    if ((a % test_numbers[i])==0)
     return false;
   }
  return true;
 }

We run the code and gather information. The average number of iterations for each method is as follows:

Nb
factors
GCD Alt
1 1.5 1
2 2.16667 1.5
3 3.36667 1.83333
4 4.81429 2.1
5 6.71385 2.32857
6 8.6944 2.53636
7 10.9449 2.72817
8 13.2974 2.9087
9 15.8115 3.07972

If we plot the above, we get

iterations

with the predicted number of iterations of 1+\frac{1}{2}\log_2 d for CGD shown in dotted line.

*
* *

At first, testing many divisors at the same time using Euclid’s algorithm for a Las Vegas type algorithm for primality testing seemed like a good idea. It’s elegant. It’s very number-theoretic. However, a bit of algorithmic analysis shows that the number of divisions made by the second algorithm is much smaller; the series 1+(1-\frac{1}{2})(1+(1-\frac{1}{3})(1+\cdots diverges, but it grows much slower than 1+\frac{1}{2}\log_2 d (as the graph above shows).

2 Responses to Euclid, Primes numbers, and a Bit of Algorithm Analysis

  1. […] in previous installments, we looked at how to use the euclidean algorithm to devise a Las Vegas-type algorithm for primality […]

  2. […] at using the GCD for devising a Las Vegas-type algorithm to test for the primality of a number in a previous entry, we saw that actually testing the number against a list of primes (therefore assuming we have an […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: