Last week, we had a look at how g++ handles tail recursion. Turns out it does a great job. One of the examples used for testing the compiler was the factorial function, n! = 1 × 2 × ⋯ × n.

We didn’t point it out, but the factorial function in last week’s example computed the factorial modulo the machine-sized (unsigned) integer. But what if we want the best possible estimate of the true value?

The factorial function overflows a 32-bit unsigned integer as soon as n = 13, which is fairly small. On 64-bit unsigned integers, it overflows at n = 21. We should probably go on to use arbitrary-precision integers, but we could also just use floating point, at least getting the general magnitude right. Let’s use doubles, so that all chances are on our side.

Let’s compare two versions: one that implements the factorial using products, and another that implements it using Stirling’s series.

The factorial using products:

double float_factorial(int n)
{
    double s = 1;
    for (int i = 1; i <= n; i++)
        s *= i;
    return s;
}

…is just what you’d expect. Stirling’s series is an exact asymptotic series for n!. The dominant term that starts the series is

√(2πn) (n/e)ⁿ.

Using this term alone, the series becomes an approximation, and, alas! not a very good one at that. We must add more terms:

n! ≈ √(2πn) (n/e)ⁿ ( 1 + 1/(12n) + 1/(288n²) − 139/(51840n³) − 571/(2488320n⁴) + 163879/(209018880n⁵) + ⋯ ).

The numerators in this series are given by Sloane’s A001163 and the denominators by A001164. You can add as many terms as you want to get more precision. I decided I’d go up to 10:

#include <cmath>

double factorial(int n, int k)
{
    double s = 1;
    double nn = n;
    if (k >= 2)  {          s += 1 / (12 * nn); }
    if (k >= 3)  { nn *= n; s += 1 / (288 * nn); }
    if (k >= 4)  { nn *= n; s -= 139 / (51840 * nn); }
    if (k >= 5)  { nn *= n; s -= 571 / (2488320 * nn); }
    if (k >= 6)  { nn *= n; s += 163879 / (209018880 * nn); }
    if (k >= 7)  { nn *= n; s += 5246819 / (75246796800 * nn); }
    if (k >= 8)  { nn *= n; s -= 534703531 / (902961561600 * nn); }
    if (k >= 9)  { nn *= n; s -= 4483131259 / (86684309913600 * nn); }
    if (k >= 10) { nn *= n; s += 432261921612371 / (514904800886784000 * nn); }

    return std::sqrt(2 * M_PI * n) * std::pow(n / M_E, n) * s;
}

*

* *

To compare, I used a version written in Mathematica that does all the calculations symbolically until the very end, where the result is converted to a double. This is as good as it gets. I also used the built-in factorial to obtain the exact, integer, value.

The product-based factorial does best when n isn’t too large. It gives an exact result as long as n! fits exactly in a double, but beyond that rounding errors creep in, and the result is underestimated. For large n, it’s off by an error of the same order of magnitude as the Stirling-based version’s. It’s not very good.

The errors of the two Stirling-based versions (C++, Mathematica) compare quite closely. In both cases, I started with absolute errors: the exact value minus the C++ version’s value, and the exact value minus the Mathematica version’s value. When you plot the ratios of the absolute errors as a function of n and k, the number of terms in the series, you get a graph like this one:

This hints (but does not demonstrate conclusively) that rounding isn’t the main problem. The truncation of the series is.

*

* *

The main numerical problem with the factorial is its growth. It rapidly exceeds machine-size integers and floats, and the resulting truncation introduces very large errors. One possible work-around is to keep the factorial as a symbolic structure as long as possible, allowing exact operations (in particular in conjunction with other factorials, as is necessary for the computation of binomial coefficients). The most practical approach, from an implementation point of view, is probably just to use arbitrary-sized integers. Or, even better, to avoid the factorial altogether.