Quite a while ago, I presented the Collatz conjecture; back then I was mostly interested in the graphical representation of the problem, and not really going anywhere with it.

In this entry, let us have a look at the implementation of the Collatz function.

The Collatz recurrence is given by

$$C(x)=\begin{cases}\dfrac{x}{2}&\text{if }x\text{ is even,}\\[1ex]3x+1&\text{if }x\text{ is odd,}\end{cases}$$

Let us interest ourselves in a small modification of the function that returns the number of iterations needed to reach one. The function becomes

$$T(x)=\begin{cases}0&\text{if }x=1,\\1+T(x/2)&\text{if }x\text{ is even,}\\1+T(3x+1)&\text{if }x\text{ is odd,}\end{cases}$$

which is readily translated to C/C++ as

```cpp
uint64_t collatz_naif(uint64_t x)
{
    uint64_t c = 0;
    while (x != 1)
    {
        c++;
        if (x % 2) x = 3 * x + 1;
        else x /= 2;
    }
    return c;
}
```

(Well, we could rewrite the `if` as `x=(x%2) ? 3*x+1 : x/2;`, but that would improve neither performance nor readability.) Speaking of which: how can we optimize this function?

The first thing is to have a look at what the function actually does. If it iterates over an odd number, it yields an even number (because $3x+1$ is always even if $x$ is odd). If it iterates over an even number, it can yield either another even number or an odd number. Therefore we have that

odd → even,

but

even → even

or

even → odd.

So, since an odd number always yields an even number, we might combine two steps into one when $x$ is odd: $x\leftarrow(3x+1)/2$ (and increment the counter twice). If $x$ is even, we can divide repeatedly by 2 until we arrive at an odd number (incrementing the counter once per division). We would get something like

```cpp
uint64_t collatz_2steps(uint64_t x)
{
    uint64_t c = 0;
    while (x != 1)
    {
        if (x % 2)
        {
            x = (3 * x + 1) / 2;
            c += 2;
        }
        while (x % 2 == 0)
        {
            x /= 2;
            c++;
        }
    }
    return c;
}
```

While we’re at it, if we can repeatedly divide by 2 while the number is even, why not also iterate while we have an odd number?

```cpp
uint64_t compressed_collatz(uint64_t x)
{
    uint64_t c = 0;
    while (x != 1)
    {
        while ((x & 1) == 1)
        {
            x = (3 * x + 1) / 2;
            c += 2;
        }
        while ((x & 1) == 0)
        {
            x /= 2;
            c++;
        }
    }
    return c;
}
```

Of course, the above code assumes that the Collatz conjecture is *actually true*, which we think it is, but for which no one has yet come up with a definitive proof.

*

* *

Let us then compare each implementation quantitatively. The test is as follows: each version computes the number of iterations needed to reach one for each number in the test range. Time is measured in microseconds, and each point on the graph represents 1 million calls. The results are shown:

Note the sharp drop in the “naif” version. All simulations ran in parallel (at the same time, using Boost threads); the two fastest finished before the naive version, which allowed the CPU to kick into turbo mode (going from 3.4 GHz to somewhere above 4 GHz) and thus skewed the results a bit.

So, comparing the algorithms, we see that both the 2-steps and the compressed versions are big improvements over the naive implementation, yielding speedups on the order of 1.3:1, which is not bad. The compressed version is slightly faster than the 2-steps version. I also have a (C++-callable) assembly language implementation, but it’s not really worth the trouble; it’s only insignificantly faster than the compressed version.

```asm
        .file   "asm-collatz.cpp"
        .text
        .globl  _Z11asm_collatzm
        .type   _Z11asm_collatzm, @function
        .intel_syntax noprefix  # sets intel mode for all file

_Z11asm_collatzm:
        xor     rax,rax
        mov     rdx,1
.l_while:
        cmp     rdi,rdx
        je      .l_exit
.l_inner:
        test    edi,edx
        jz      .l_even
        ## here, must be odd
        lea     rdi,[rdi+2*rdi+1]
        add     rax,2
        shr     rdi,1
        test    edi,edx
        jnz     .l_inner
.l_even:
        shr     rdi,1
        inc     rax
        test    edi,edx
        jz      .l_even
        jmp     .l_while
.l_exit:
        ret
        .size   _Z11asm_collatzm, .-_Z11asm_collatzm
        .ident  ""
        .section .note.GNU-stack,"",@progbits
```

*

* *

Solving the Collatz conjecture by brute-forcing the search for a counter-example is a bit foolhardy, methinks, especially since the lower bound for the first possible number creating a cycle (thus preventing the procedure from stopping) is thought to be incredibly large. Furthermore, only comparatively small numbers have been checked exhaustively so far.
