Quite a while ago, I presented my own simple sed-, grep-, and Bash-based make-depend script. It was somewhat effective, except that it saw neither include-only dependencies nor quote-includes. If you changed a header that affected a lot of files, those files would not be rebuilt, because the make rules were incomplete. Let’s fix this.
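The heart of such a script is pulling the #include lines out of each source file, in both their angle-bracket and quote forms. Here’s a minimal sketch of that extraction; the file layout and the rule format are illustrative assumptions, not the original script (and a real version would also recurse into the headers themselves to catch include-only dependencies):

    #!/usr/bin/env bash
    # For each source file, emit a make rule listing every header it
    # includes, whether written as <header.h> or "header.h".
    for src in *.c *.cpp; do
        [ -e "$src" ] || continue
        deps=$(grep -E '^[[:space:]]*#[[:space:]]*include' "$src" |
               sed -E 's/.*["<]([^">]+)[">].*/\1/' |
               tr '\n' ' ')
        echo "${src%.*}.o: $src $deps"
    done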
Getting good text data for language-model training isn’t as easy as it sounds. First, you have to find a large corpus. Second, you must clean it up!
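To give a taste of the cleanup step, here’s a minimal sketch; the specific filters (strip markup, keep printable characters, squeeze whitespace) are illustrative assumptions, not the post’s actual pipeline:

    #!/usr/bin/env bash
    # Crude corpus cleanup: strip HTML-ish markup, keep only printable
    # characters, squeeze runs of whitespace, drop empty lines.
    sed -E 's/<[^>]*>//g' "$1" |
        tr -cd '[:print:]\n' |
        tr -s ' \t' ' ' |
        sed '/^[[:space:]]*$/d'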
Let’s take it easy this week. How about we generate random passwords? That should be fun, right?
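As a preview, the core idea fits in one pipeline; the character set and the length of 16 are arbitrary choices for illustration, not necessarily what the post settles on:

    # Draw 16 password characters uniformly from a printable set,
    # using the kernel's entropy pool as the randomness source.
    tr -cd 'A-Za-z0-9!@#$%&*+-' < /dev/urandom | head -c 16; echo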
Scanning documents or books without expensive hardware and commercial software can be tricky. This week, I give you the script I use to clean up a scanned image (and eventually assemble many of them into a single PDF document).
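To give the flavor of it, here’s a minimal sketch using ImageMagick; the normalization step and the 60% threshold are placeholder choices, not the script’s actual settings:

    #!/usr/bin/env bash
    # Clean one scanned page: convert to grayscale, stretch the
    # contrast, then threshold to crisp black-on-white.
    convert "$1" -colorspace Gray -normalize -threshold 60% "clean-$1"

    # Once every page is cleaned, bundle them into a single PDF.
    convert clean-*.png book.pdf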
This week, something short. To run tests, I needed a selection of WAV files. Fortunately for me, I’ve got literally thousands of FLAC files lying around on my computer (yes, I listen to music when I code). So I wrote a simple script that randomly chooses a number of files from a directory tree (and not just a single directory) and transcodes them from FLAC to WAV. Also fortunately for me, Bash and the various GNU/Linux utilities make writing such a script rather easy.
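Here’s a minimal sketch of that script, built on find, shuf, and the flac decoder; the music directory and the count of ten are placeholders:

    #!/usr/bin/env bash
    # Pick 10 FLAC files at random from the whole directory tree and
    # decode each one to a WAV file in the current directory.
    find ~/music -name '*.flac' -print0 |
        shuf -z -n 10 |
        while IFS= read -r -d '' f; do
            flac -d -o "$(basename "${f%.flac}").wav" "$f"
        done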
When I write papers or other things, I tend to create separate bib files, so that I don’t end up with a giant, unsearchable, unmaintainable blob. Moreover, topics tend to be transient, and the bibliography may or may not be interesting in a few years’ time, so, if unused, it can safely sleep in a directory with the paper it’s attached to.
But once in a while, I need one of those old references, and since they’re scattered just about everywhere… it may take a while to find them again. Unless you have a script. Scripts are nice.
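The script needn’t be much more than a recursive search; here’s a minimal sketch, where the ~/papers root is an assumption about where things end up scattered:

    #!/usr/bin/env bash
    # List every .bib file, anywhere under ~/papers, that mentions the
    # given author, title fragment, or BibTeX key.
    find ~/papers -name '*.bib' -print0 |
        xargs -0 grep -l -i -- "$1"

Called as, say, ./findbib.sh knuth, it prints the files worth opening.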
Optimizing web content is always complicated. On one hand, you want your users to have the best possible user experience, but on the other hand, you don’t really want to spend much bandwidth delivering the bits.
This week, let’s have a look at how we can optimize images for perceptual quality while minimizing bandwidth. We could proceed by guesswork, fiddling with the parameters until it kind of looks OK. Or we can take five minutes to write a script that searches the parameter space for the best solution given a constraint, say, perceptual quality.
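The search itself can be a dumb linear scan over the quality parameter. In this sketch, ImageMagick compare’s PSNR stands in for whatever perceptual metric we’d actually want, and the 40 dB target is an arbitrary placeholder:

    #!/usr/bin/env bash
    # Find the lowest JPEG quality that still meets a distortion
    # constraint, scanning the quality parameter from high to low.
    src=$1
    target=40   # minimum acceptable PSNR, in dB
    best=95
    for q in $(seq 95 -5 20); do
        convert "$src" -quality "$q" /tmp/candidate.jpg
        psnr=$(compare -metric PSNR "$src" /tmp/candidate.jpg null: 2>&1)
        awk -v p="$psnr" -v t="$target" 'BEGIN { exit !(p >= t) }' || break
        best=$q
    done
    echo "lowest acceptable quality: $best"

Bisection would be faster, but with only a handful of quality levels to try, the linear scan is plenty.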
In a few other entries, I’ve toyed with GPS, either getting or parsing the data with Bash, or assessing and using it. However, when we use GPS, we assume that precision varies by brand and model (some will have greater precision than others), but our intuition tells us that two GPS devices of the same brand and model should perform identically. That’s what we’re used to with, or at least expect from, computers. But what about USB GPS devices?
So I got two instances of the same brand and model of GPS. Let’s call them GPS-1 and GPS-2. Do they perform similarly?
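The raw material for the comparison is just the stream of NMEA sentences from each receiver. Here’s a capture sketch; the device paths, the baud rate, and the $GPGGA filter are all assumptions about the setup:

    #!/usr/bin/env bash
    # Log position fixes ($GPGGA sentences) from both receivers at the
    # same time, for later side-by-side comparison.
    for dev in /dev/ttyUSB0 /dev/ttyUSB1; do
        stty -F "$dev" 4800 raw
    done
    grep --line-buffered '^\$GPGGA' /dev/ttyUSB0 > gps-1.log &
    grep --line-buffered '^\$GPGGA' /dev/ttyUSB1 > gps-2.log &
    wait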