## The Large Hideous Collatzer

Mathematics is still full of surprises. The solution to simple to state problems may elude mathematicians for centuries. One example is the celebrated Fermat’s Last Theorem (stating that equations of the form $x^n+y^n=z^n$ have no integer-only solutions for $n > 2$) that was finally solved by Andrew Wiles with tools Fermat couldn’t possibly know nor understand1.

Another one of those problems is the Collatz Conjecture. Proposed by Lothar Collatz in 1937, the conjecture is that a simple recurrence function—that we will discuss in detail in just a moment—terminates for all natural numbers. This one, however, isn’t solved yet.

The Collatz recurrence is given by:

$C(n)=\begin{cases}1 & n=1\\C\left(\frac{1}{2}n\right) & n \mathrm{~is~even}\\C\left(3n+1\right) & \mathrm{otherwise}\end{cases}$

and the conjecture is that for any natural number $n$ you input, the recurrence eventually reaches 1 and terminates.

There are no known counter-examples, of course, and even if some people went to great extend to verify numerically the conjecture—some reaching numbers as large as $\sim 20\times{}2^{58}$ [1]—but a bunch of positive exemplars is merely an indication, not a proof, that the conjecture may indeed hold.

*
* *

It is interesting to look at the recurrence graphically. First, you write a Collatz sequence generator:

void collatz(int64_t n)
{
while ( (n!=1) && (c))
{
std::cout << n << "->";

if (n%2) // same as (n&1), the compiler knows that
n=3*n+1;  // if odd,
else
n/=2;     // else even

std::cout << n << std::endl;
}
}


You may want to deal with redundant edges, depending on your graphing software. For GraphViz, it is quite necessary to do so, because it will draw multiple arcs. Using GraphViz’s dot command, you can produce a DAG for the paths, in this case, for the first 20 numbers:

DAG for n=20

GraphViz automagically optimizes placement so that the graph does not self-intersect. Using the neato tool, the chains are placed in a radial pattern that maximizes distance between branches:

Neato for N=20

But the branching does not grow, even somewhat, regularly with increasing arguments. At $n=30$, we grow the following graph:

Neato for n=30

At $n=50$, already, GraphViz and GraphicsMagick suck up a lot of resources to produce the graph:

Neato for n=50

At $n=100$, it takes a while but yields:

Neato for n=100

At $n=500$, it still works, but it takes quite a while (and a lot of swapping):

Neato for n=500

There’s distinctly a fractal pattern to it. If I use Mathematica, on the other hand, I can create graphics for much larger $n$. In 2D, for $n=10000$, the “fractal” looks like this:

Mathematica Spring Electrical Optimization, n=10000

(Can you spot 1? It’s marked by a red Asterisk.) Because of Mathematica‘s rendering algorithm, the figure looks like a Diffusion-Limited Aggregation. In 3D, we get a similar display:

Mathematica Spring Electrical Optimization, 3D, n=10000

Where 1 is hidden somewhere in the largest cluster, on the left.

*
* *

Looking at the number of iterations the Collatz recurrence takes to reach 1 is also quite interesting. It seems regular:

Number of iterations to reach 1, n=10000

but it is not quite. While on the graphics it looks regular enough, it is in fact rather hard to guess beforehand the number of iterations it will take to reach 1. It’s, so to speak, almost quasi-random.

*
* *

The fractal-looking images drawn from the Collatz-sequences are visually interesting, but ultimately breaks the true structure of the problem. The placement of the nodes are merely an artifact of the rendering algorithms used, and not something intrinsic to the Collatz problem. Already using the confusing but more accurate linear embedding:

Linear Embedding, n=100

tells a lot more about the internal structure of the recurrence.

I have a growing interest for data visualization and I realize just how important choosing the right representation is, not only to transmit your ideas about the data but also to actually understand what is going on in the data. In the Collatz case (which I do not intend to solve anytime soon), displaying the recurrence chains that (all seems to) eventually lead to 1 is a fun exercise; it may show pretty fractal-like patterns, but it doesn’t help to understand the structure of the problem because the very way it is graphed destroys that structure!

I think the same kind of reflection must occur when we present something, whether experimental or theoretical, so that our graphs not only do not destroy important information about the results but also manage to amplify the structure so that something that is not immediately self-evident becomes conspicuous.

More thinking is needed on this topic, I guess.

1 Fermat was a smart guy, but also quite the smart ass. He spent his life taunting his mathematician friends with puzzles and riddles, sometimes claiming he had a solution when he had none. The last theorem is the most amusing example of his smartassery. In 1637, in the margin of a book on by Diophantus, the Arithmetica he wrote that

Cubum autem in duos cubos, aut quadratoquadratum in duos quadratoquadratos, et generaliter nullam in infinitum ultra quadratum potestatem in duos eiusdem nominis fas est dividere cuius rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non caperet.

which can be translated by something like I know of a quite fantastic proof that it’s not possible to split a cube into the sum of two cubes, and same for fourth powers, and for any higher power. But this margin is too narrow for me to write it down. Of course, we now know that Fermat must have been bluffing, or maybe that he thought, mistakenly, that he had a real proof. Doubts lasted for centuries, but I think that Wiles’ work disproves that he had a complete solution.

References

[1] Tomás Oliveira e Silva — Maximum Excursion and Stopping Time Record Holders for the 3x+1 Problem: Computational Results — Mathematics of Computation, v68 n⁰ 225 (1999) p.371–384

### 5 Responses to The Large Hideous Collatzer

1. Nice post, I ran across it when WordPress in its infinite wisdom linked to it from my Collatz post.

Could you elaborate on Linear Embedding, I don’t understand how to read the graph.

• Steven Pigeon says:

The x-axis is the natural order of numbers, from 0 to n; arcs are trajectories from n to 3n+1 or n/2, depending on n. I couldn’t get Mathematica to order them properly; so the x-axis is a bit mixed-up (probably tries to minimize total arc length or something).

• That’s what I figured but it didn’t make sense to me that 1 is about 2/5 the way between 0 and 100….

it still doesn’t…

2. [...] a while ago, I presented the Collatz conjecture and I was then interested in the graphical representation of [...]

3. just me says:

Has anyone tried plotting the lines y=2x, y=3x+1, y=4x, etc, then following orbits? Example: start at x=11, vertical to y=34 on y=3x+1, horizontal to right at x=17 and y=2x, vertical from x=17 to y=3x+1 (=52=4times13), then horizontal left to x=13 and y=4x, then down to y= 3x+1 (=40=8times5), then left again to x=5 and y=8x, etc

You can see orbits that way, also you can see that an orbit does not backtrack on itself for 3x+1 (but of course I don’t know how to prove it). If you try the same for 3x-1 or 5x+1, you can see cycles.