Computing Binomials (Part I)

We often start thinking about something, make some progress, and eventually realize it’s not going anywhere, or, anyway, the results aren’t very phrasmotic—certainly not what you hoped for. Well, this week’s post is one of those experiments.

So I was wondering, how can I compute the binomial coefficient, $\binom{n}{k}$, and minimize the size of the integers involved?

Let’s first define the binomial coefficient. In its simplest form:

$\displaystyle \binom{n}{k}=\frac{n!}{k!(n-k)!}$

where $n!$ denotes the factorial:

$n!=n \cdot (n+1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1$

If we proceed directly with the formulation, we compute $n!$, a potentially huge number (it is $O(2^{n \log_2 n})$ !), then $k!$, $(n-k)!$, and finally perform the division. Computing $n!$ directly performs $n$ multiplications with increasing large integers. We can assume that operations on machine-size integers have constant-time cost, but we can’t do that with large integers. Even with a fast multiplication algorithm, say Karatsuba’s or Toom-Cook, that cost grows rapidly. So I figured that since some terms cancel out, we might be able to reduce the number of multiplies and the maximum size of the integers involved.

Let’s have a look at what this thing actually computes.

$\displaystyle\binom{n}{k}=\frac{n!}{k!(n-k)!}=\frac{1 \cdot 2 \cdot 3 \cdots (n-1) \cdot n}{1 \cdot 2 \cdot 3 \cdots (k-1) \cdot k \cdot 1 \cdot 2 \cdots (n-k-1)\cdot (n-k)}$

which becomes, after canceling a few terms:

$\displaystyle =\frac{(k+1)\cdot(k+2)\cdots(n-1)\cdot(n)}{1 \cdot 2 \cdots (n-k-1) \cdot (n-k)}$.

Where the numerator and the denominator have the same number of terms, $n-k$. We can rewrite the above as:

$\displaystyle = \prod_{i=1}^{n-k} \frac{i+k}{i} =\prod_{i=1}^{n-k}\left(1+\frac{k}{i}\right)$

or, if we insist in using integers only:

$\displaystyle \frac{\prod_{i=1}^{n-k}i+k}{\prod_{i=1}^{n-k} i}$.

(We can always have less than $\frac{1}{2}n$ terms because we can use the identity $\binom{n}{m}=\binom{n}{n-m}$ to keep $n-k$ in the product formula as small as possible!)

*
* *

So the number of multiplications passes from $n+(n-k)+k =2n$ (regardless of $k$) to $\min(k,n-k)$, which is a huge improvement. Further, the largest number computed goes from $n!$ to $n!/\min(k,n-k)!$, an improvement as well.

If we take the formulas

$\displaystyle \binom{n}{k} = \prod_{i=1}^{n-k} \frac{i+k}{i} = \prod_{i=1}^{n-k}\left(1+\frac{k}{i}\right)$

to compute an approximation to $\binom{n}{k}$ using just floats, how does it compare to the native identity? Certainly using smaller numbers also implies smaller errors? Well, I do not know yet.

To be continued…