Faster Collatz

Quite a while ago, I presented the Collatz conjecture and I was then interested in the graphical representation of the problem—and not really going anywhere with it.

In this entry, let us have a look at the implementation of the Collatz function.

The Collatz recurrence is given by

C(n)=\begin{cases}1 & n=1\\C(\frac{1}{2}n) & n~\mathrm{even}\\C(3n+1)&\mathrm{otherwise}\end{cases}

Let us interest ourselves in a small modification of the function that returns the number of iterations needed to reach one. The function becomes

C(n)=\begin{cases}0 & n=1\\1+C(\frac{1}{2}n) & n~\mathrm{even}\\1+C(3n+1)&\mathrm{otherwise}\end{cases}

which is readily translated to C/C++ as

uint64_t collatz_naif(uint64_t x)
 {
  uint64_t c=0;

  while (x!=1)
   {
    c++;
    if (x%2)
     x=3*x+1;
    else
     x/=2;
   }
  return c;
 }

(Well, we could rewrite the if as x=(x%2) ? 3*x+1 : x/2; but that would neither improve performance nor readability.) Speaking of which: how can we optimize this function?

The first thing is to have a look at what the function actually does. If it iterates over an odd number, it yields an even number (because 3x+1 is always even if x is odd). If it iterates over an even number it can yield either another even number or an odd number. Therefore we have that

odd → even,

but

even → even

or

even → odd.

So if an odd number always yield an even number; we might combine two steps into one if x is odd: x=(3x+1)/2 (and increment the counter twice). If x is even, we could divide repeatedly by 2 until we arrive at an odd number (and incrementing the counter once per iteration). We would get something like

uint64_t collatz_2steps(uint64_t x)
 {
  uint64_t c=0;

  while (x!=1)
   {
    if (x%2)
     {
      x=(3*x+1)/2;
      c+=2;
     }

    while (x%2==0)
     {
      x/=2;
      c++;
     }
   }
  return c;
 }

While we’re at it, if we can repeatedly divide by 2, why not iterate while we have an odd number?

uint64_t compressed_collatz(uint64_t x)
 {
  uint64_t c=0;

  while (x!=1)
   {
    while ((x&1)==1) x=(3*x+1)/2, c+=2;
    while ((x&1)==0) x/=2, c++;
   }
  return c;
 }

Of course, the above code assumes that the Collatz conjecture is actually true, which we think it is, but for which nobody ever came up with a definitive proof.

*
* *

Let us then compare each implementation quantitatively. The test is as follows: each version computes the number of iterations needed to reach one on the numbers from 1 to 10^9. Time is computed in microseconds, and each point on the graph represent 1 million calls. The results are shown:

(Click to embiggen)

Note the sharp drop at \approx 800\:000\:000 in the “naif” version. All simulations ran in parallel (at the same time, using Boost threads) and the two fastest finished before the naif version and this allowed the CPU to kick in overclock mode (going from 3.4GHz to somewhere above 4GHz), and thus it messed up a bit the results.

So, comparing the algorithms, we see that both the 2-steps and the compressed versions are huge improvements over the naif implementation, yielding speedups in the order of 1.3:1, which is not bad. The compressed version is slightly faster than the 2-steps version. I also have a (C++-callable) assembly language implementation, but it’s not really worth the trouble; it’s insignificantly faster than the compressed version.

        .file	"asm-collatz.cpp"
        .text
        .globl	_Z11asm_collatzm
        .type	_Z11asm_collatzm, @function

        .intel_syntax noprefix  # sets intel mode for all file

_Z11asm_collatzm:

        xor rax,rax
        mov rdx,1

.l_while:
        cmp rdi,rdx
        je  .l_exit

.l_inner:
        test edi,edx
        jz   .l_even

        ## here, must be odd
        lea rdi,[rdi+2*rdi+1]
        add rax,2
        shr rdi,1

        test edi,edx
        jnz  .l_inner

.l_even:
        shr rdi,1
        inc rax

        test edi,edx
        jz .l_even

        jmp .l_while


.l_exit:
        ret

        .size	_Z11asm_collatzm, .-_Z11asm_collatzm
        .ident	""
        .section	.note.GNU-stack,"",@progbits

*
* *

Solving the Collatz conjecture by brute-forcing the search for a counter-example is a bit foolhardy, methinks, especially that the lower bound for the first possible number creating a cycle (thus preventing the procedure from stopping) is thought to be incredibly large. Furthermore, only (comparatively small) numbers up to \approx2^{58} have been checked exhaustively.

One Response to Faster Collatz

  1. […] visited the Collatz conjecture a couple of times before. I’ve played with the problem a bit more to understand how attractors work in […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: