ln n!

In the course of analyzing an algorithm, I used the simplifying hypothesis that the cost function is

\displaystyle c(n)\approx\sum_{i_1}^n \lg i=\lg n!.

That expression is cumbersome but we can get a really good simplified function to use as a proxy. Let’s see how.

First, let’s start with Stirling’s approximation (which would be in fact due to de Moivre, according to Dutka):

\displaystyle n!=\sqrt{2\pi n}\left(\frac{n}{e}\right)^e\left(1+\frac{1}{12n}+\frac{1}{288n^2}-\frac{139}{51840n^3}+\cdots\right)

The terms of the expansion are given in A001163 and A001164 (in the On-Line Encyclopedia of Integer Sequences), but we needn’t worry too much about them, we’ll show that they can kind of be neglected anyway.

Let’s take logs on both sides:

\displaystyle \ln n! =\ln \sqrt{2\pi n} + n \ln \left(\frac{n}{e}\right) + \ln \left(1+\frac{1}{12n}+\cdots\right)

\displaystyle = \ln \sqrt{2\pi} + \frac{1}{2}\ln n+n\left(\ln n- \ln e\right) +\ln \left(1+\frac{1}{12n}+\cdots\right)

\displaystyle = \ln \left(n+\frac{1}{2}\right)\left(\ln n - 1\right)+\frac{1}{2}+ln \left(1+\frac{1}{12n}+\cdots\right).

Now, the term \displaystyle ln \left(1+\frac{1}{12n}+\cdots\right) is almost zero as soon as n is “kinda” large, and so can be neglected. We must also remark that the truncation has only an additive effect in \ln n!, while it would have had a multiplicative effect in the original n!. Therefore, we will lose precision, but only additively. So no biggie. Also, let’s reorder terms a bit:

\displaystyle \ln n!\approx \left(n+\frac{1}{2}\right)\left(\ln n -1\right)+\ln \sqrt{2\pi}+\frac{1}{2}.

Now, \ln \sqrt{2\pi}+\frac{1}{2}=1.4189\ldots\approx\sqrt{2}, with an error of about 0.00472\ldots, it’s not that bad of an approximation. Finally, we can write:

\displaystyle \ln n! \approx \left(n+\frac{1}{2}\right)\left(\ln n -1\right)+\sqrt{2}.

*
* *

So, how good is it? Let’s have a look. The first 20 show that the estimation undershoots the real value by about less than 1% as soon as n>10. The three columns are n, \ln n! and the approximation.

first-20

And the difference diminishes as n grows.

first-200

first-2000

*
* *

Therefore the truncation isn’t all that bad. In fact, in an algorithm analysis, we might even chop it down to \ln n!\approx(n+\frac{1}{2})(\ln n -1)+\frac{1}{2} without being too wrong. At that point, we can rewrite it as \ln n!\approx (n+\frac{1}{2})\ln n - n), which is probably the form easiest to manipulate.

References

Jacques Dutka — The Early History of the Factorial Function — Archive for History of Exact Sciences, vol. 45 no. 3 (sept. 1993) p. 225–249

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: