Mathematics is still full of surprises. The solution to simple to state problems may elude mathematicians for centuries. One example is the celebrated Fermat’s Last Theorem (stating that equations of the form have no integer-only solutions for ) that was finally solved by Andrew Wiles with tools Fermat couldn’t possibly know nor understand^{1}.

Another one of those problems is the Collatz Conjecture. Proposed by Lothar Collatz in 1937, the conjecture is that a simple recurrence function—that we will discuss in detail in just a moment—terminates for all natural numbers. This one, however, isn’t solved yet.

The Collatz recurrence is given by:

and the conjecture is that for any natural number you input, the recurrence eventually reaches 1 and terminates.

There are no known counter-examples, of course, and even if some people went to great extend to verify numerically the conjecture—some reaching numbers as large as [1]—but a bunch of positive exemplars is merely an indication, not a proof, that the conjecture may indeed hold.

*

* *

It is interesting to look at the recurrence graphically. First, you write a Collatz sequence generator:

void collatz(int64_t n) { while ( (n!=1) && (c)) { std::cout << n << "->"; if (n%2) // same as (n&1), the compiler knows that n=3*n+1; // if odd, else n/=2; // else even std::cout << n << std::endl; } }

You may want to deal with redundant edges, depending on your graphing software. For GraphViz, it is quite necessary to do so, because it will draw multiple arcs. Using GraphViz’s `dot` command, you can produce a DAG for the paths, in this case, for the first 20 numbers:

GraphViz automagically optimizes placement so that the graph does not self-intersect. Using the `neato` tool, the chains are placed in a radial pattern that maximizes distance between branches:

But the branching does not grow, even somewhat, regularly with increasing arguments. At , we grow the following graph:

At , already, GraphViz and GraphicsMagick suck up a lot of resources to produce the graph:

At , it takes a while but yields:

At , it still works, but it takes quite a while (and a lot of swapping):

There’s distinctly a fractal pattern to it. If I use *Mathematica*, on the other hand, I can create graphics for much larger . In 2D, for , the “fractal” looks like this:

(Can you spot 1? It’s marked by a red Asterisk.) Because of *Mathematica*‘s rendering algorithm, the figure looks like a Diffusion-Limited Aggregation. In 3D, we get a similar display:

Where 1 is hidden somewhere in the largest cluster, on the left.

*

* *

Looking at the number of iterations the Collatz recurrence takes to reach 1 is also quite interesting. It seems regular:

but it is not quite. While on the graphics it looks regular enough, it is in fact rather hard to guess beforehand the number of iterations it will take to reach 1. It’s, so to speak, almost quasi-random.

*

* *

The fractal-looking images drawn from the Collatz-sequences are visually interesting, but ultimately breaks the true structure of the problem. The placement of the nodes are merely an artifact of the rendering algorithms used, and not something intrinsic to the Collatz problem. Already using the confusing but more accurate linear embedding:

tells a lot more about the internal structure of the recurrence.

I have a growing interest for data visualization and I realize just how important choosing the right representation is, not only to transmit your ideas about the data but also to actually understand what is going on in the data. In the Collatz case (which I do not intend to solve anytime soon), displaying the recurrence chains that (all seems to) eventually lead to 1 is a fun exercise; it may show pretty fractal-like patterns, but it doesn’t help to understand the structure of the problem because the very way it is graphed *destroys* that structure!

I think the same kind of reflection must occur when we present something, whether experimental or theoretical, so that our graphs not only do not destroy important information about the results but also manage to amplify the structure so that something that is not immediately self-evident becomes conspicuous.

More thinking is needed on this topic, I guess.

^{1} Fermat was a smart guy, but also quite the smart ass. He spent his life taunting his mathematician friends with puzzles and riddles, sometimes claiming he had a solution when he had none. The last theorem is the most amusing example of his smartassery. In 1637, in the margin of a book on by Diophantus, the *Arithmetica* he wrote that

Cubum autem in duos cubos, aut quadratoquadratum in duos quadratoquadratos, et generaliter nullam in infinitum ultra quadratum potestatem in duos eiusdem nominis fas est dividere cuius rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non caperet.

which can be translated by something like *I know of a quite fantastic proof that it’s not possible to split a cube into the sum of two cubes, and same for fourth powers, and for any higher power. But this margin is too narrow for me to write it down*. Of course, we now know that Fermat *must* have been bluffing, or maybe that he thought, mistakenly, that he had a real proof. Doubts lasted for centuries, but I think that Wiles’ work disproves that he had a complete solution.

**References**

[1] Tomás Oliveira e Silva — *Maximum Excursion and Stopping Time Record Holders for the 3x+1 Problem: Computational Results* — Mathematics of Computation, v68 n⁰ 225 (1999) p.371–384

Nice post, I ran across it when WordPress in its infinite wisdom linked to it from my Collatz post.

Could you elaborate on Linear Embedding, I don’t understand how to read the graph.

The x-axis is the natural order of numbers, from 0 to n; arcs are trajectories from n to 3n+1 or n/2, depending on n. I couldn’t get Mathematica to order them properly; so the x-axis is a bit mixed-up (probably tries to minimize total arc length or something).

That’s what I figured but it didn’t make sense to me that 1 is about 2/5 the way between 0 and 100….

it still doesn’t…

[…] a while ago, I presented the Collatz conjecture and I was then interested in the graphical representation of […]

Has anyone tried plotting the lines y=2x, y=3x+1, y=4x, etc, then following orbits? Example: start at x=11, vertical to y=34 on y=3x+1, horizontal to right at x=17 and y=2x, vertical from x=17 to y=3x+1 (=52=4times13), then horizontal left to x=13 and y=4x, then down to y= 3x+1 (=40=8times5), then left again to x=5 and y=8x, etc

You can see orbits that way, also you can see that an orbit does not backtrack on itself for 3x+1 (but of course I don’t know how to prove it). If you try the same for 3x-1 or 5x+1, you can see cycles.

[…] these limitations, Graphviz is a really great tool I have used before and will use […]

[…] visited the Collatz conjecture a couple of times before. I’ve played with the problem a bit more to understand how attractors work in the Collatz […]