Phase-in Codes

02/09/2008

Once in a while, we have to encode a series of values to memory or file, of which we know either very little about, distribution-wise, or are very close to being drawn from an uniform random variable: basically, they’re all equally likely to be drawn. This precludes the use of Huffman (or like) codes because the distribution is flat and Huffman code will afford no real compression despite the added complexity.

We do not want to use a fixed number of bits either because the number of possible values might be very far from the next power of two. For example say I have 11 different values. I could encode them on 4 bits each, but I would waste more than half a bit by doing so. Half a bit wasted for every code. Fortunately, there’s an easy, fast encoding that can help us achieve high efficiency: phase-in codes.

Read the rest of this entry »