8-bit Audio Companding

February 7, 2017

Computationally inexpensive sound compression is always difficult, at least if you want some quality. One could think, for example, that taking the 8 most significant bits of 16 bits will give us 2:1 (lossy) compression but without too much loss. However, cutting the 8 least significant bits leads to noticeable hissing. However, we do not have to compress linearly, we can apply some transformation, say, vaguely exponential to reconstruct the sound.


That’s the idea behind μ-law encoding, or “logarithmic companding”. Instead of quantizing uniformly, we have large (original) values widely spaced but small (original) value, the assumption being that the signal variation is small when the amplitude is small and large when the amplitude is great. ITU standard G.711 proposes the following table:

Read the rest of this entry »

Fractional Bits (Part III)

October 18, 2016

A long time ago, we discussed the problem of using fractions of bits to encode, jointly, values without wasting much bits. But we considered then only special cases, but what if we need to shift precisely by, say, half a bit?

But what does shifting by half a bit even means?

Read the rest of this entry »

Of Colorspaces and Image Compression (Part III)

November 11, 2014

Last week, we continued on color spaces, and we showed—although handwavingly—that we’re not very good at seeing color differences in hue and saturation, and that we’re only good at seeing difference in brightness. In this last part, we’ll have a look at how JPEG exploits that for image compression.


Read the rest of this entry »

Of Colorspaces and Image Compression (Part II)

November 4, 2014

Last week we discussed briefly how we can represent color using either RGB or a colorspace that’s more easily amenable to decimation by a psychovisual model.


This week, we’ll have a look at just how tolerant the human visual system is to color (hue/saturation) variations.

Read the rest of this entry »

Of Colorspaces and Image Compression (Part I)

October 28, 2014

The art of lossy compression consists almost entirely in choosing a representation for your data in which you can easily figure out what data to destroy. That is, if you’re compressing sound, you must transform the waveform into a domain where you can apply a psychoacoustic model that will guide decimation, that is, the deletion of part of the information, part of the information that won’t be heard missing. If you’re rather compressing images—that will be our topic today—you must have a good psychovisual model that will help you choose what information to destroy in your image


The model can be very sophisticated—and therefore computationally expensive—or merely a good approximation of what’s going on in the observers’ eyes (and mind).

But let’s start with the beginning: representing color

Read the rest of this entry »

The Zero Frequency Problem (Part I)

September 23, 2014

In many occasions, we have to estimate a probability distribution from observations because we do not know the probability and we do not have enough a priori knowledge to infer it. An example is text analysis. We may want to know the probability distribution of letters conditional to the few preceding letters. Say, what is the probability of having ‘a’ after ‘prob’? To know the probability, we start by estimating frequencies using a large set of observations—say, the whole Project Gutenberg ASCII-coded English corpus. So we scan the corpus, count frequencies, and observe that, despite the size of the corpus, some letter combinations weren’t observed. Should the corresponding probability be zero?

Read the rest of this entry »

Of Sets and bitmaps

April 8, 2014

When we represent sets, we have many options. We can use a language-specific primitive, like std::set<T> (which is likely list- or tree-like in its detail), or use a bitmap that marks, for each element (and therefore assumes that there is an universal set that contains all elements) whether or not it is included in the set. Bitmaps are simple to implement (especially when one uses something like std::vector<bool>) but need an amount of memory proportional to the universal set, not to the actual subset you’re trying to encode.


We can also use lists, or interval lists. But which one is the most efficient? Under what conditions? Let’s have a look.

Read the rest of this entry »