The 1 bit = 6 dB Rule of Thumb, Revisited.

March 28, 2017

Almost ten years ago I wrote an entry about the “1 bit = 6 dB” rule of thumb. This rule states that each bit you add to a signal adds 6 dB to its signal-to-noise ratio.

The first derivation I gave then focused on the noise, whose maximal amplitude is proportional to the amplitude represented by the last bit of the (encoded) signal. Let’s now derive the rule from the most significant bit of the signal down to its least significant.
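As a refresher, the usual back-of-the-envelope version of the rule (a sketch of the standard argument, not necessarily the derivation in the full entry) goes as follows: with n bits, the ratio between the largest representable amplitude and the smallest quantization step is 2^n, which, expressed in decibels, is

20\log_{10}(2^n) = 20\,n\log_{10} 2 \approx 6.02\,n \textrm{ dB},

so each extra bit buys about 6 dB.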

Read the rest of this entry »

New Block Order

March 21, 2017

The idea of reorganizing data before compression isn’t new. Almost twenty-five years ago, Burrows and Wheeler proposed a block-sorting transform that reorders data to help compression. The idea is to try to create repetitions that can be exploited by a second compression engine.

But the Burrows-Wheeler transform isn’t the only possible one. There are a few other techniques to generate (reversible) permutations of the input.
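For illustration, here is a deliberately naive Python sketch of the forward Burrows-Wheeler transform: sort all rotations of the input and keep the last column. The sentinel character and the quadratic memory use are simplifications for clarity, not what a real implementation (or the original paper) does.

def bwt(s, sentinel="\0"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    s = s + sentinel                                   # unique end marker makes the transform reversible
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

print(bwt("banana"))   # 'annb\x00aa': identical characters end up clustered together

That clustering of identical symbols is exactly what a second stage (move-to-front plus an entropy coder, say) can exploit.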

Read the rest of this entry »

r4nd0m pa$$w0rd

March 14, 2017

Let’s take it easy this week. How about we generate random passwords? That should be fun, right?
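A minimal sketch of the obvious approach in Python, using the standard secrets module (the alphabet and the length are arbitrary choices here, not necessarily what the full entry settles on):

import secrets
import string

def random_password(length=16,
                    alphabet=string.ascii_letters + string.digits + string.punctuation):
    """Draw each character uniformly at random from a cryptographically secure source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(random_password())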


Read the rest of this entry »

Much Ado About Nothing

March 7, 2017

A rather long time ago, I wrote a blog entry on branchless equivalents of simple functions such as sex, abs, min, max. The Sign EXtension instruction propagates the sign bit into the upper bits, and is typically used in the promotion of, say, a 16-bit signed value into a 32-bit variable.

But this time, I needed something a bit different: I only wanted the sign-extended part. Could I do much better than last time? Turns out, the compiler has a mind of its own.
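To make the distinction concrete, here is a small Python sketch of both operations. Python integers are unbounded, so the width has to be made explicit; this illustrates the idea only, not the compiler output discussed in the entry.

def sign_extend(value, bits=16):
    """Classic branchless sign extension of a `bits`-wide value."""
    mask = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ mask) - mask

def sign_mask(value, bits=16):
    """Only the 'sign-extended part': 0 for non-negative values, -1 (all ones) otherwise."""
    return -((value >> (bits - 1)) & 1)

print(sign_extend(0xFFFE))   # -2
print(sign_mask(0xFFFE))     # -1
print(sign_mask(0x0002))     # 0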

Read the rest of this entry »

8-bit Audio Companding (part II)

February 28, 2017

A few weeks back, I presented a heuristic for audio companding, making the vague assumption that the values (the sound samples) are somewhat exponentially distributed. But is that the case?


Let’s then find out the distribution of the samples. As before, I will use the Toréador file and a new one, Jean Michel Jarre’s Electronica 1: Time Machine (the whole CD). The two are very different. One is classical music, the other electronic music. One is adjusted in loudness so that we can hear the very quiet notes as well as the very loud ones; the other is adjusted mostly for loudness, for maximum effect.

So I ran both through a sampler. For display as well as histogram smoothing purposes, I down-sampled both channels from 16 to 8 bits (therefore from 0 to 255). In the following plots, green is the left channel and (dashed) red the right. Toréador shows the following distribution:
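(As an aside, the down-sampling and histogramming can be done along these lines in Python with NumPy and SciPy; the file name is a placeholder and this is only a sketch, not the code used to produce the plots.)

import numpy as np
from scipy.io import wavfile

rate, samples = wavfile.read("toreador.wav")            # hypothetical 16-bit stereo WAV
eight_bit = (samples.astype(np.int32) + 32768) >> 8     # signed 16-bit -> 0..255

left_hist  = np.bincount(eight_bit[:, 0], minlength=256)
right_hist = np.bincount(eight_bit[:, 1], minlength=256)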


or, in log-plot,


Turns out, the samples are Laplace-distributed. Indeed, fitting a mean \mu=127 and a scale parameter \beta\approx 7.4 agrees with the plot (the ideal Laplacian is drawn in solid blue):
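For reference, the Laplace density and the maximum-likelihood fit of its scale (simply the mean absolute deviation from \mu) are

f(x)=\frac{1}{2\beta}e^{-|x-\mu|/\beta}, \qquad \hat{\beta}=\frac{1}{n}\sum_{i=1}^{n}|x_i-\mu|,

which is one straightforward way, though not necessarily the one used here, of arriving at a value like \beta\approx 7.4.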


Now, what about the other file? Let’s see the plots:


and in log-plot,


and with the best-fit Laplacian superimposed:


Now, to fit a Laplacian, the best parameters seem to be \mu=127 and \beta\approx 27. While the fit is pretty good on most of the values, it kind of sucks at the edges. That’s the effect of dynamic range compression, a technique used to limit a signal’s dynamic range, often in a non-uniform way (the signal values near or beyond the target maximum get squished the most). This explains the “ears” seen in the log-plot, which are also visible in the linear plot.

* * *

Making the hypothesis that the samples are Laplace-distributed will allow us to devise an efficient quantization scheme for both the bin limits and the reconstruction values. In S-law, if we remember, the reconstruction value used is the value at the center of the interval. But if the distribution isn’t uniform over that interval, the most representative value isn’t its center: it’s the value that minimizes the expected squared error. Since the expressions for the moments of a Laplace-distributed random variable aren’t unwieldy, we should arrive at a very good, and parametric, quantization scheme for the signal.
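Concretely, for a quantization bin [a,b) and sample density f, the reconstruction value that minimizes the expected squared error is the bin’s centroid,

\hat{x} = E[X \mid a \le X < b] = \frac{\int_a^b x\,f(x)\,dx}{\int_a^b f(x)\,dx},

and with f the Laplace density, both integrals have closed forms in \mu and \beta, which is what makes a fully parametric scheme practical.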

Cleaning Scans

February 21, 2017

Scanning documents or books without expensive hardware and commercial software can be tricky. This week, I give you the script I use to clean up a scanned image (and eventually assemble many of them into a single PDF document).
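My script is behind the link, but a very rough sketch of that kind of cleanup could look like this in Python with Pillow: grayscale, stretch the contrast, threshold away the paper’s background, then bundle the pages into a PDF. The file names, threshold, and exact steps here are placeholders, not the actual script.

from PIL import Image, ImageOps

def clean(path, threshold=180):
    """Grayscale, stretch the contrast, then binarize to remove the paper's background."""
    img = ImageOps.autocontrast(ImageOps.grayscale(Image.open(path)))
    return img.point(lambda p: 255 if p > threshold else 0)

pages = [clean(f) for f in ["scan-001.png", "scan-002.png"]]
pages[0].save("book.pdf", save_all=True, append_images=pages[1:])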


Read the rest of this entry »

Choosing Random Files

February 14, 2017

This week, something short. To run tests, I needed a selection of WAV files. Fortunately for me, I’ve got literally thousands of FLAC files lying around on my computer (yes, I listen to music when I code). So I wrote a simple script that randomly chooses a number of files from a directory tree (and not just a single directory) and transcodes them from FLAC to WAV. Also very fortunately for me, Bash and the various GNU/Linux utilities make writing such a script rather easy.
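The script itself uses Bash and the usual GNU/Linux tools; as a rough Python rendering of the same idea (picking files anywhere under a tree and decoding them with the flac command-line tool, with placeholder paths and count):

import random
import subprocess
from pathlib import Path

def pick_and_transcode(root, count=10, out_dir="wav"):
    """Pick `count` FLAC files at random from the whole tree and decode them to WAV."""
    files = list(Path(root).rglob("*.flac"))
    Path(out_dir).mkdir(exist_ok=True)
    for f in random.sample(files, min(count, len(files))):
        out = Path(out_dir) / (f.stem + ".wav")
        subprocess.run(["flac", "-d", "-f", "-o", str(out), str(f)], check=True)

pick_and_transcode("/path/to/music", count=5)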


Read the rest of this entry »