## Fractional Bits (Part I)

Some time ago, I discussed Huffman codes, how they were generated, and how you could (de)code information with it. I also said that they were optimal under various conditions, one of which (that I may or may not have mentioned) is that you have an integer number of bits.

Coding with an non-integer number of bits is counter-intuitive, but it is entirely possible to do so. There are in fact many ways to do so, but let’s start easy and ignore the frequency of occurrence of symbols for now.

Huffman codes construct a prefix-tree where leaves are symbols, and where the path between the root of the tree and a leaf is the code for that leaf, with, say, a zero each time you go down left and a one each time you go down right. This post explains it in detail, so I won’t repeat it here (but you’re encouraged to read it). The length of the code is proportional to $-\lg P(X=x)$, that is, it depends (non-linearly) on the probability of that symbol $x$ being observed given its random source $X$.

Often, we can ignore the probability of observation because we do not have a good idea of what it is (say, because we have a very limited number of observation) and we just assume that all symbols occur with equal probability. That’s very rarely the case in reality, but sometimes it makes life a lot easier. We therefore want to use $-\lg \frac{1}{n}$ bits, that is, $-\lg 1 + \lg n=\lg n$ bits to encode the value. That’s swell when $n\sim{}2^k$, that is, $n$ is a power of two, and $\lg n=k$. We therefore use exactly $k$ bits to encode the values, and problem’s solved.

But what if $n \not\sim 2^k$? One possibility is to use phase-in codes, that provides an average code length rather close to $\lg n$, even when $n\sim{}2^k+b$, that is, even if it’s not a power of two.

Phase-in codes are nice, but while their average length is close to $\lg n$, they sometimes fall somewhat further than what we would need. For example, if $n=3$, the average code length is $\frac{5}{3}=1.666$, while $\lg 3=1.58496\ldots$, with a difference smaller than $0.08607\ldots$, but that’s still 5% off.

Let’s pretend that it’s way too inefficient for our needs. Turns out that we can do something to essentially share the coding error between successive symbols and get a better coding efficiency. And the good thing, is that it’s simple. Really.

If we use an integer number of bits to code three distinct values, we have the intuition that $\lceil \lg 3 \rceil = 2$ bits are required, but we can get very close, in fact arbitrarily close to using $\lg 3$ per value. Here’s how. And let’s start simple, so you know where it’s from. Using an integer number of bits, you would write something like:

uint16_t encode_intbits(const int values[])
{
uint16_t encoded=0;
for (int i=0;i<8;i++)
{
encoded<<=2;
encoded|=values[i];
}
return encoded;
}

void decode_intbits(uint16_t encoded, int values[])
{
for (int i=0;i<8;i++)
{
values[7-i]=(encoded & 0x3);
encoded>>=2;
}
}


Which encodes 8 3-valued data onto 16 bits, using 16 bits rather than $5 \lg 3 \approx 12.6797$ bits, which is wasting about 20% of the bits. Now, what if rather than shifting by 2 bits, we shifted by $\lg 3$ bits? Well, why, yes it’s possible!

Let us rewrite the previous two functions as:

uint8_t encode_fractbits(const int values[])
{
uint8_t encoded=0;
for (int i=0;i<5;i++)
{
encoded*=3; // shifts by lg 3 bits!
encoded+=values[i];
}
return encoded;
}

void decode_fractbits(uint8_t encoded, int values[])
{
for (int i=0;i<5;i++)
{
values[4-i]=(encoded % 3); // masks by lg 3 bits
encoded/=3; // shifts by lg 3 bits
}
}


Why only five? because $3^5 \leqslant 2^8=256$, or more plainly: we can shift by $\lg 3$ only five times before exceeding 256. The inefficiency of that scheme is now about 1%, since we use $5 \lg 3$ bits out of 8, which is $\approx 7.9248$.

*
* *

The scheme generalizes to any number of values, of course. For example, on a byte, all (non-negative) composite numbers smaller than 256 represent a possible encoding configuration. The (thankfully) now defunct web safe palette of 216 colors is one such example. In this palette, you have 3 three 6-valued items to encode (one for red, one for green, and one for blue), that is, the color number (or code) is given by $36r+6g+b$, and it’s exactly the same kind of scheme we’ve been using before. One could observe that $6\times{}7\times{}6=252$ would have been a better palette, giving slightly more importance to the green, while still fitting on a byte.

### 4 Responses to Fractional Bits (Part I)

1. […] a previous post, we started discussing the subject of encoding values using fractional bits. This is quite […]

2. […] long time ago, we discussed the problem of using fractions of bits to encode, jointly, values without wasting […]

3. […] Coding, I began to think of the shortcomings of fuffman coding to be related to the problem of fractional bit-packing. That is, suppose you have 240 possible values for a symbol, and needed to encode this into bits, […]