Scanning documents or books without expensive hardware and commercial software can be tricky. This week, I give you the script I use to clean up a scanned image (and eventually assemble many of them into a single PDF document).
This week, something short. To run tests, I needed a selection of WAV files. Fortunately for me, I’ve got literally thousands of FLAC files lying around on my computer—yes, I listen to music when I code. So I wrote a simple script that randomly chooses a number of file from a directory tree (and not a single directory) and transcode them from FLAC to WAV. Also very fortunately for me, Bash and the various GNU/Linux utilities make writing a script for this rather easy.
Computationally inexpensive sound compression is always difficult, at least if you want some quality. One could think, for example, that taking the 8 most significant bits of 16 bits will give us 2:1 (lossy) compression but without too much loss. However, cutting the 8 least significant bits leads to noticeable hissing. However, we do not have to compress linearly, we can apply some transformation, say, vaguely exponential to reconstruct the sound.
That’s the idea behind μ-law encoding, or “logarithmic companding”. Instead of quantizing uniformly, we have large (original) values widely spaced but small (original) value, the assumption being that the signal variation is small when the amplitude is small and large when the amplitude is great. ITU standard G.711 proposes the following table:
So for an experiment I ended up needing conversions between 8 bits and 16 bits samples. To upscale an 8 bit sample to 16 bits, it is not enough to simply shift it by 8 bits (or multiply it by 256, same difference) because the largest value you get isn’t 65535 but merely 65280. Fortunately, stretching correctly from 8 bit to 16 bit isn’t too difficult, even quite straightforward.
While flipping the pages of a “Win this interview” book—just being curious, not looking for a new job—I saw this seemingly simple question: how would you compute the sum of a series of floats contained in a array? The book proceeded with the simple, obvious answer. But… is it that obvious?
Something that used to bug me—used to, because I am so accustomed to work around it that I rarely notice the problem—is that in neither C nor C++ you can use strings (const char * or std::string) in switch/case statement. Indeed, the switch/case statement works only on integral values (an enum, an integral type such as char and int, or an object type with implicit cast to an integral type). But strings aren’t of integral types!
In pure C, we’re pretty much done for. The C preprocessor is too weak to help us built compile-time expression out of strings (or, more exactly, const char *), and there’sn’t much else in the language to help us. However, things are a bit different in C++.
More often than I’d like, simple tasks turn out to be not that simple. For example, displaying (beautifully) a binary tree for debugging purpose in a terminal. Of course, one could still use lots of parentheses, but that does not lend itself to a very fast assessment of the tree. We could, however, use box drawing characters, as in DOS’s goode olde tymes, since they’re now part of the Unicode standard.