## ln n!

In the course of analyzing an algorithm, I used the simplifying hypothesis that the cost function is

$\displaystyle c(n)\approx\sum_{i_1}^n \lg i=\lg n!$.

That expression is cumbersome but we can get a really good simplified function to use as a proxy. Let’s see how.

First, let’s start with Stirling’s approximation (which would be in fact due to de Moivre, according to Dutka):

$\displaystyle n!=\sqrt{2\pi n}\left(\frac{n}{e}\right)^e\left(1+\frac{1}{12n}+\frac{1}{288n^2}-\frac{139}{51840n^3}+\cdots\right)$

The terms of the expansion are given in A001163 and A001164 (in the On-Line Encyclopedia of Integer Sequences), but we needn’t worry too much about them, we’ll show that they can kind of be neglected anyway.

Let’s take logs on both sides:

$\displaystyle \ln n! =\ln \sqrt{2\pi n} + n \ln \left(\frac{n}{e}\right) + \ln \left(1+\frac{1}{12n}+\cdots\right)$

$\displaystyle = \ln \sqrt{2\pi} + \frac{1}{2}\ln n+n\left(\ln n- \ln e\right) +\ln \left(1+\frac{1}{12n}+\cdots\right)$

$\displaystyle = \ln \left(n+\frac{1}{2}\right)\left(\ln n - 1\right)+\frac{1}{2}+ln \left(1+\frac{1}{12n}+\cdots\right)$.

Now, the term $\displaystyle ln \left(1+\frac{1}{12n}+\cdots\right)$ is almost zero as soon as $n$ is “kinda” large, and so can be neglected. We must also remark that the truncation has only an additive effect in $\ln n!$, while it would have had a multiplicative effect in the original $n!$. Therefore, we will lose precision, but only additively. So no biggie. Also, let’s reorder terms a bit:

$\displaystyle \ln n!\approx \left(n+\frac{1}{2}\right)\left(\ln n -1\right)+\ln \sqrt{2\pi}+\frac{1}{2}$.

Now, $\ln \sqrt{2\pi}+\frac{1}{2}=1.4189\ldots\approx\sqrt{2}$, with an error of about $0.00472\ldots$, it’s not that bad of an approximation. Finally, we can write:

$\displaystyle \ln n! \approx \left(n+\frac{1}{2}\right)\left(\ln n -1\right)+\sqrt{2}$.

*
* *

So, how good is it? Let’s have a look. The first 20 show that the estimation undershoots the real value by about less than 1% as soon as $n>10$. The three columns are $n$, $\ln n!$ and the approximation.

And the difference diminishes as $n$ grows.

*
* *

Therefore the truncation isn’t all that bad. In fact, in an algorithm analysis, we might even chop it down to $\ln n!\approx(n+\frac{1}{2})(\ln n -1)+\frac{1}{2}$ without being too wrong. At that point, we can rewrite it as $\ln n!\approx (n+\frac{1}{2})\ln n - n)$, which is probably the form easiest to manipulate.

References

Jacques Dutka — The Early History of the Factorial Function — Archive for History of Exact Sciences, vol. 45 no. 3 (sept. 1993) p. 225–249