## Pruning Collatz, somewhatz

I’ve visited the Collatz conjecture a couple of times before. I’ve since played with the problem a bit more to understand how its attractors work.

So what is the Collatz conjecture? The conjecture is that the function

$C(n)= \begin{cases} 1&\text{if }n=1\\ C(\frac{1}{2}n)&\text{if }n\text{ is even}\\ C(3n+1)&\text{otherwise} \end{cases}$

always reaches 1. The catch is, nobody has figured out how to prove this. It seems to be true, since for every $n$ ever tried the function eventually reached 1. While it is very difficult to show that it reaches 1 every time, it has been observed that there are chains, let’s call them attractors, such that when you reach a number within that chain, you go quickly to 1. One special case is the powers-of-two attractor: whenever you reach a power of two, it’s a downward spiral to 1. Can we somehow use this to speed up tests?

The first thing is to see how long the power-of-two chains are. A quick Mathematica program reveals, for the first 100M numbers, the following histogram of the first power of two reached:

So basically the trajectory most often reaches 16 first, then 1024, then 256. Of course, if it reaches 16384 first, it will run through all the powers of two until 1 is reached, but what’s interesting is that it reached 16384 first. Or 16. Especially 16, in fact.

Would pruning the search whenever a power of two is reached give a speed-up?

So before we start, we need an efficient method of testing whether or not a number is a power of two. Fortunately, it’s not much harder than checking whether a number is even:

```c
bool is_even(unsigned x)   { return (x & 1) == 0; }
bool is_power2(unsigned x) { return (x & (x - 1)) == 0; }
```


A naïve, non-instrumented version of the Collatz function could look something like this:

```c
unsigned collatz(unsigned x)
{
    if (x == 1)
        return 1;
    else if (is_even(x))
        return collatz(x / 2);
    else
        return collatz(3 * x + 1);
}
```


This version directly implements the equation above. Using the idea of pruning whenever a power of two is reached yields the following implementation:

```c
unsigned collatz_prune(unsigned x)
{
    if (x == 1)
        return 1; // done
    if (is_even(x))
    {
        if (is_power2(x))
            return 1; // done, attractor
        else
            return collatz_prune(x / 2);
    }
    else
        return collatz_prune(3 * x + 1);
}
```


But testing for a power of two looks expensive. Couldn’t we just prune the recursion when 16 is reached? 16 is a good choice, judging from the histogram of the previous experiment.

```c
unsigned collatz_prune_16(unsigned x)
{
    if ((x == 1) || (x == 16))
        return 1; // done, attractor 16
    if (is_even(x))
        return collatz_prune_16(x / 2);
    else
        return collatz_prune_16(3 * x + 1);
}
```


But we also observed in a previous post that an odd number is always transformed into an even number (that’s what $3x+1$ does), only to be divided again! So we can “compress” the sequence by computing $(3x+1)/2$ as the next step, merging two calls into one.

```c
unsigned collatz_prune_compressed(unsigned x)
{
    if (x == 1)
        return 1; // done
    if (is_even(x))
    {
        if (is_power2(x))
            return 1; // done, attractor
        else
            return collatz_prune_compressed(x / 2);
    }
    else
        return collatz_prune_compressed((3 * x + 1) / 2);
}

unsigned collatz_prune_16_compressed(unsigned x)
{
    if ((x == 1) || (x == 16))
        return 1; // done, attractor 16
    if (is_even(x))
        return collatz_prune_16_compressed(x / 2);
    else
        return collatz_prune_16_compressed((3 * x + 1) / 2);
}
```


*
* *

So, let’s see what works and what doesn’t. Gluing everything together lets us gather timings and speed-ups. For the first 100M numbers:

| Method | Time (s) | Speed-up |
|---|---:|---:|
| Naïve | 24.72 | 1.0 |
| Pruned | 21.20 | 1.17:1 |
| Pruned on 16 | 22.86 | 1.08:1 |
| Pruned, compressed | 21.37 | 1.16:1 |
| Pruned on 16, compressed | 19.25 | 1.28:1 |

So checking for all powers of two gives a better speed-up than only checking against 16, which is surprising. Why? Well, 16 accounts for the vast majority of first powers of two reached by the recursion, and testing against a constant should be faster than computing `(x&(x-1))==0`. But that’s not the case. This means that the other powers of two count for a lot more than first thought (a “fat tail” of sorts) and offset the cost of testing power-of-twoness.

“Compressing” the recurrence, however, seems to jump over a power of two once in a while, which somewhat reduces the speed-up. On the other hand, compression works well combined with just checking against 16, because now the fat tail is offset. That’s unexpected.

*
* *

This, of course, doesn’t shed much light on how to solve the Collatz conjecture, but it tells us that sometimes even a cursory examination of a function’s behavior can lead to an understanding that lets us modify the function to gain a speed-up. Here, the speed-up is not that impressive, about 30% rather than four-fold, but it is a speed-up nonetheless.

### 5 Responses to Pruning Collatz, somewhatz

1. Richard says:

Quick question: does your compiler perform tail-call optimisation, effectively transforming the recursion into a loop? If not, then there’s a big performance gain in doing so.

• I don’t think it does, except maybe for the very simplest cases. Recursion and tail-recursion are not a primary design pattern in languages like C and C++. It’d be great if they did. This being said, I should investigate that.

After investigation, G++ does something, but it’s not quite tail-recursion elimination. It generates code that calls the function back, but instead of actually calling it (and implementing recursion), it generates a hard jump to the beginning of the function. I guess it figured out that only the argument varies and that it can afford that optimization.
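For comparison, hand-eliminating the tail call gives the loop that such a jump effectively produces; a sketch for the naïve variant:

```c
/* collatz() with the tail calls eliminated by hand: each recursive call
   becomes an update to x, and the recursion becomes a plain loop. */
unsigned collatz_iterative(unsigned long long x)
{
    while (x != 1)
        x = (x % 2 == 0) ? x / 2 : 3 * x + 1;
    return 1;
}
```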

2. Eduard Gode says:

A more common way to optimize the calculation using known results is to build a table of results and use it for small numbers.
Splitting a number into a low-bits and a high-bits part, the low bits may be used to access a table of entries { i, f, d }, with i = number of iterations, f = factor, d = difference. The result is collatz( h*2^n + l ) = i + collatz( h*f + d ), with collatz(x) returning the number of iterations. This way, n iteration steps can be done in one single step. Best results on my system are n = 16 for calculating collatz(x) up to x = 4*10^9 with a Perl implementation.

```perl
use constant B_SHIFT => 18;
use constant B_MASK  => (1 << B_SHIFT) - 1;

my @bt;
my @bs;

sub collatz($) {
    my $x = shift;
    my $z = 0;
    use integer;
    while ( $x != 1 ) {
        my $b0 = $x & B_MASK;
        my $b1 = $x >> B_SHIFT;
        $_ = $b1 ? $bt[$b0] : $bs[$b0];
        my $bs;
        if ( !defined $_ ) {
            my ( $tz, $tb, $tf ) = ( 0, $b0, 1 );
            for ( my $i = 0; $i < B_SHIFT; $i++ ) {
                if ( $tb & 1 ) { $tz += 2; $tb = ($tb >> 1) * 3 + 2; $tf *= 3; }
                else           { $tz += 1; $tb = ($tb >> 1);                   }
                $bs = [ $tz, $tf, $tb ] if 1 == $tb && !defined $bs;
            }
            $bt[$b0] = [ $tz, $tf, $tb ];
            $bs[$b0] = $bs // $bt[$b0];
            $_ = $b1 ? $bt[$b0] : $bs[$b0];
        }

        # print "x = $x, b0 = $b0, b1 = $b1, $_->[0], $_->[1], $_->[2]\n" if !$b1;
        $z += $_->[0];
        $x  = $b1 * $_->[1] + $_->[2];
    }
    return $z;
}
```

• (I took the liberty of editing your post to include sourcecode language=”perl” so that your code would display correctly)

What is the magnitude of the speed up compared to a naïve implementation?