8-bit Audio Companding (part II)

February 28, 2017

A few weeks back, I presented a heuristic for audio companding, making the vague assumption that the distribution of values, the sound samples, is somewhat exponential. But is that the case?


Let’s then find out the distribution of the samples. As before, I will use the Toréador file, and a new one, Jean Michel Jarre’s Electronica 1: Time Machine (the whole CD). The two are very different. One is classical music, the other electronic music. One is adjusted in loudness so that we can hear the very quiet notes as well as the very loud ones; the other is adjusted mostly for loudness, for maximum effect.

So I ran both through a sampler. For display as well as histogram smoothing purposes, I down-sampled both channels from 16 to 8 bits (therefore from 0 to 255). In the following plots, green is the left channel and (dashed) red the right. Toréador shows the following distribution:

[Figure: Toréador, sample distribution]

or, in log-plot,

[Figure: Toréador, sample distribution (log-plot)]
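For the curious, here is roughly how such a histogram can be computed. A minimal sketch in Python, assuming a 16-bit stereo WAV file (the file name is made up):

    import numpy as np
    from scipy.io import wavfile

    # Read a 16-bit stereo WAV; data has shape (n_samples, 2).
    rate, data = wavfile.read("toreador.wav")  # hypothetical file name

    # Down-sample from 16 bits (-32768..32767) to 8 bits (0..255).
    samples8 = (data.astype(np.int32) >> 8) + 128

    # One 256-bin histogram per channel, as relative frequencies.
    left = np.bincount(samples8[:, 0], minlength=256) / len(samples8)
    right = np.bincount(samples8[:, 1], minlength=256) / len(samples8)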

Turns out, the samples are Laplace-distributed, that is, they follow the density f(x)=\frac{1}{2\beta}e^{-|x-\mu|/\beta}. Indeed, fitting a mean \mu=127 and a parameter \beta\approx 7.4 agrees with the plot (the ideal Laplacian is drawn in solid blue):

[Figure: Toréador, sample distribution (log-plot) with fitted Laplacian]
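The fit itself is simple: the maximum-likelihood estimate of \mu is the sample median, and the estimate of \beta is the mean absolute deviation around it. A minimal sketch, reusing samples8 from the sketch above:

    # Maximum-likelihood Laplace fit: mu is the median of the
    # samples, beta the mean absolute deviation around it.
    def fit_laplace(samples):
        mu = np.median(samples)
        beta = np.mean(np.abs(samples - mu))
        return mu, beta

    mu, beta = fit_laplace(samples8.ravel())  # both channels pooled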

Now, what about the other file? Let’s see the plots:

[Figure: Time Machine, sample distribution]

and in log-plot,

[Figure: Time Machine, sample distribution (log-plot)]

and with the best-fit Laplacian superimposed:

[Figure: Time Machine, sample distribution (log-plot) with fitted Laplacian]

Now, to fit a Laplacian, the best parameters seem to be \mu=127 and \beta\approx 27. While the fit is pretty good on most of the values, it kind of sucks at the edges. That’s the effect of dynamic range compression, a technique used to limit a signal’s dynamic range, often in a non-uniform way (the signal values near or beyond the target maximum get squished the most). This explains the “ears” seen in the log-plot, also visible in the (not log-)plot.
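To see where the “ears” come from, consider a simple hard-knee compressor (the threshold and ratio below are made-up values, not those used in the actual mastering):

    import numpy as np

    # Hard-knee compressor on samples normalized to [-1, 1]: above
    # the threshold, the slope drops by `ratio`, so a wide range of
    # loud inputs maps to a narrow range of outputs.
    def compress(x, threshold=0.5, ratio=4.0):
        mag = np.abs(x)
        over = np.maximum(mag - threshold, 0.0)
        return np.sign(x) * (np.minimum(mag, threshold) + over / ratio)

With these values, inputs spanning (0.5, 1] land in (0.5, 0.625]: the probability mass of the loud samples piles up in a narrow band which, once make-up gain rescales the signal to full range, sits at the edges of the histogram.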

*
* *

Making the hypothesis that the samples are Laplace-distributed will allow us to devise an efficient quantization scheme, for both the limits of the bins and the reconstruction values. In S-law, if we remember, the reconstructed value is the value at the center of the interval. But if the distribution is not uniform over this interval, the most representative value isn’t its center: it’s the value that minimizes the expected squared error, namely the conditional expectation of the signal within the interval. Since the expressions for the moments of a Laplace-distributed random variable aren’t unwieldy, we should arrive at a very good, and parametric, quantization scheme for the signal.
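As a preview, the optimal reconstruction value for a bin [a,b) is its centroid under the Laplace density, E[X\,|\,a\le X<b]. A minimal numerical sketch (a closed form exists, but quadrature keeps it short):

    import numpy as np

    # Laplace density with location mu and scale beta.
    def laplace_pdf(x, mu, beta):
        return np.exp(-np.abs(x - mu) / beta) / (2.0 * beta)

    # Centroid E[X | a <= X < b]: the reconstruction value that
    # minimizes the expected squared error within the bin.
    def centroid(a, b, mu, beta, n=100001):
        x = np.linspace(a, b, n)
        w = laplace_pdf(x, mu, beta)
        return np.sum(x * w) / np.sum(w)

With \mu=127 and \beta=7.4, the bin [140,160) is reconstructed at about 146, noticeably below its center of 150, because the density decays quickly across the bin.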


The Zero Frequency Problem (Part I)

September 23, 2014

On many occasions, we have to estimate a probability distribution from observations because we do not know the probabilities and do not have enough a priori knowledge to infer them. An example is text analysis. We may want to know the probability distribution of letters conditional on the few preceding letters. Say, what is the probability of having ‘a’ after ‘prob’? To estimate that probability, we start by measuring frequencies over a large set of observations, say, the whole Project Gutenberg ASCII-coded English corpus. So we scan the corpus, count frequencies, and observe that, despite the size of the corpus, some letter combinations weren’t observed at all. Should the corresponding probability be zero?
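Counting such conditional frequencies is simple enough. A minimal sketch, assuming a plain-text corpus file (the file name and the context length are arbitrary):

    from collections import Counter, defaultdict

    CONTEXT_LEN = 4  # arbitrary: condition on the 4 preceding letters

    # Count how often each letter follows each 4-letter context.
    counts = defaultdict(Counter)
    with open("corpus.txt", encoding="ascii", errors="ignore") as f:
        text = f.read()
    for i in range(CONTEXT_LEN, len(text)):
        counts[text[i - CONTEXT_LEN:i]][text[i]] += 1

    # Relative-frequency estimate of P('a' | 'prob'). For rare
    # combinations, the count, and thus the estimate, can be zero
    # even in a very large corpus.
    after = counts["prob"]
    p_a = after["a"] / max(sum(after.values()), 1)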
