In a previous post I discussed lossless audio encoding and presented a Bash script using flac to rip and process CD audio files. I also claimed, without much quantitative evidence, that the psycho-acoustic model used in an MP3 encoder will dominate encoding as bit-rate increases. In this post, I present some evidence.
The first step is to convert the 4461 .flac files from the previous post to different MP3 bit-rates. For this simple experiment, I picked 128 and 160 kbps, and their doubles, 256 and 320 kbps, as target bit-rates. I used --vbr-new and -q 0, which force lame to use its best algorithms for psycho-acoustic modeling and bit allocation. That is fairer to Flac, for which I also used the “best” settings.
I used lame version 3.98.4 and flac version 1.2.1.
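As a rough illustration of this conversion step, here is a minimal sketch of how one file might be decoded and re-encoded. The helper names `mp3_name` and `flac_to_mp3` are hypothetical, and passing the target bit-rate as lame's maximum VBR bit-rate (`-B`) is an assumption on my part; the actual script may set the target differently.

```bash
#!/usr/bin/env bash
# Sketch only: decode one FLAC file and re-encode it as MP3 with lame.

# Hypothetical helper: derive the output name from the input name.
mp3_name() {
    printf '%s\n' "${1%.flac}.mp3"
}

# Hypothetical helper: flac decodes to stdout, lame reads the WAV from stdin.
# Assumption: the target bit-rate caps the VBR stream through -B.
flac_to_mp3() {
    local input=$1
    local target_kbps=$2
    flac --decode --stdout --silent "$input" \
        | lame --vbr-new -q 0 -B "$target_kbps" - "$(mp3_name "$input")"
}
```

Calling `flac_to_mp3 song.flac 320` would then write `song.mp3` next to the original, provided both flac and lame are installed.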
Let us look at the distribution of resulting bit-rates:
Flac is very dispersed, being unconstrained in terms of bit-rate. It varies between 180 kbps and 1140 kbps (with one freak outlier at 1800 kbps). The various MP3 settings gather round their target bit-rates, much lower than Flac.
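The per-file bit-rates can be estimated from the file size and the duration. The sketch below is one way to do it, not necessarily how the numbers above were obtained: the function names are hypothetical, and using the whole file size means tags and embedded pictures are counted as audio, so the result slightly overestimates the true rate.

```bash
#!/usr/bin/env bash
# Sketch only: average bit-rate of a file, in kbps.

# kbps = (bytes * 8) / seconds / 1000, with seconds = samples / sample_rate.
bitrate_kbps() {
    awk -v bytes="$1" -v samples="$2" -v rate="$3" \
        'BEGIN { printf "%.2f\n", (bytes * 8) / (samples / rate) / 1000 }'
}

# Hypothetical wrapper for a FLAC file: metaflac reports the number of
# samples and the sample rate, wc gives the size in bytes.
flac_bitrate_kbps() {
    local file=$1
    bitrate_kbps "$(wc -c < "$file")" \
        "$(metaflac --show-total-samples "$file")" \
        "$(metaflac --show-sample-rate "$file")"
}
```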
And now let’s examine the compression from the different MP3 settings.
In the low bit-rate regime, we see that the encoder tries to max out the bit-rate. At 128 kbps, the average bit-rate is 124 kbps, and no file exceeds the target rate of 128 kbps. Something similar happens at 160 kbps, which averages 145 kbps. Then, as the maximum bit-rate increases further, the average barely moves, rising only to 153.06 and 153.67 kbps, but the bit-rates are now distributed symmetrically around the average (something that could not happen at 160 kbps).
|Setting|Average bit-rate (kbps)|
|MP3 128 kbps|123.79|
|MP3 160 kbps|145.27|
|MP3 256 kbps|153.06|
|MP3 320 kbps|153.67|
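Averages like these are easy to compute once there is one bit-rate per line for each setting. A minimal sketch, with hypothetical function names and assuming the input is plain numbers in kbps:

```bash
#!/usr/bin/env bash
# Sketch only: summarize a list of bit-rates read from stdin, one per line.

# Arithmetic mean, printed with two decimals (prints nothing on empty input).
mean_kbps() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.2f\n", sum / n }'
}

# Largest value, useful to check whether any file exceeds the target rate.
max_kbps() {
    awk 'NR == 1 || $1 > max { max = $1 } END { if (NR > 0) printf "%.2f\n", max }'
}
```

Comparing the output of `max_kbps` against the target is a quick way to check a claim such as "no file exceeds 128 kbps".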
So the results seem to confirm the hypothesis that the psycho-acoustic model dominates compression as bit-rate increases.
It also means that if you’re going to use 160 kbps, you might as well use 320 kbps with maximum quality options enabled. Of course, some of the files will be bigger, but the average file size won’t be significantly larger; moreover, you leave the compressor free to use all psycho-acoustic features to yield the best possible sound quality (given that it’s MP3 lossy coding, that is) without trying to quantize too hard.
The complete flac-to-mp3 script can be got here. It depends on flac, metaflac, and lame. It is somewhat multi-threaded: it takes two arguments, the target bit-rate (for example, 320 for 320 kbps) and the number of threads, that is, the number of songs it can process simultaneously. It starts a batch of threads and waits for all of them to terminate before launching a new batch. It’s crude, but Bash doesn’t help much with multi-threading and managing sub-shells.
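The batch strategy can be sketched as follows. This is a simplified illustration rather than the script itself: `run_in_batches` is a hypothetical name, and the "threads" are really background processes started with `&`.

```bash
#!/usr/bin/env bash
# Sketch only: run a worker on each item, at most batch_size at a time,
# waiting for the whole batch to finish before starting the next one.
run_in_batches() {
    local batch_size=$1
    local worker=$2
    shift 2
    local running=0
    local item
    for item in "$@"; do
        "$worker" "$item" &          # launch the worker in the background
        running=$((running + 1))
        if [ "$running" -ge "$batch_size" ]; then
            wait                     # block until every job in the batch exits
            running=0
        fi
    done
    wait                             # wait for the last, possibly partial, batch
}
```

The cost of this simplicity is that one long song holds up the whole batch: the other slots sit idle until it finishes.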