Choosing Random Files

This week, something short. To run tests, I needed a selection of WAV files. Fortunately for me, I’ve got literally thousands of FLAC files lying around on my computer (yes, I listen to music when I code). So I wrote a simple script that randomly chooses a number of files from a directory tree (not just a single directory) and transcodes them from FLAC to WAV. Also very fortunately for me, Bash and the various GNU/Linux utilities make writing a script for this rather easy.

There are two main components to the script: 1) finding the files, which is readily done using the dreadfully capricious command find (which, methinks, is one of the four horsemen of the scriptocalypse alongside sed, awk and bc), and 2) shuffling them with sort, which has the oxymoronic option --random-sort that, as the name implies, randomizes the order of the input lines.
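To see the shuffling half on its own, here is a minimal sketch. The function name pick_random and the sample words are made up for illustration, and it assumes GNU sort. One caveat worth knowing: --random-sort orders lines by a random hash of their content, so identical lines end up next to each other. That is harmless here, since find prints each path once, but shuf -n would be the tool to reach for if the input could contain duplicates.

```shell
#!/usr/bin/env bash

# pick_random N: read lines on stdin, print N of them in random order.
# (pick_random is a hypothetical helper name, not part of the script.)
pick_random() {
    local n=$1
    sort --random-sort | head -n "$n"
}

# Five sample lines in, three randomly chosen lines out.
printf '%s\n' alpha beta gamma delta epsilon | pick_random 3
```

Running it twice should, most of the time, print a different trio.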

On this particular occasion find’s usage is minimal: a root directory under which the files are to be found and a mask, in this case *flac. To deal with spaces in file-names, find’s output is passed via a pipe to sort --random-sort, whose output is piped to head (with an argument) to extract the first n files. This output is in turn piped to a subshell that reads file-names line by line and applies deflaculation.
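The find stage can be tried safely on a scratch tree before pointing it at a real music collection. The sketch below is only an illustration: list_matches and the file names are invented, and -type f is an addition that restricts the match to regular files, which the script itself does not do.

```shell
#!/usr/bin/env bash

# list_matches DIR SUFFIX: print every regular file under DIR whose name
# ends in SUFFIX, compared case-insensitively (hypothetical helper).
list_matches() {
    find "$1" -type f -iname "*$2"
}

# Build a throwaway tree with nested directories and spaces in the names.
root=$(mktemp -d)
mkdir -p "$root/album one/disc 2"
touch "$root/track.flac" \
      "$root/album one/Grand Prix.FLAC" \
      "$root/album one/disc 2/intro.flac" \
      "$root/album one/cover.jpg"

# Prints the three FLAC paths (in no particular order), skipping cover.jpg.
list_matches "$root" flac

rm -rf "$root"
```

Because the mask is matched with -iname, Grand Prix.FLAC is found even though its extension is in upper case.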

#!/usr/bin/env bash

location=$1

if [ "$location" != "" ]
then
    nb=${2:-20}
    pattern=${3:-flac}

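    # Quote every expansion so paths containing spaces survive word splitting.
    find "$location" -iname "*$pattern" \
        | sort --random-sort \
        | head -n "$nb" \
        | \
        (
            # IFS= and -r keep leading blanks and backslashes in each name intact.
            while IFS= read -r filename
            do
                bn=$(basename "$filename")
                nw=${bn%.*}.wav
                echo "-> $nw"
                flac --silent --decode "$filename" -o "$nw"
            done
        )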

else
    echo must provide location/directory 1>&2
    exit 1
fi

Here, the only real trick is to pipe, line by line, file-names to a subshell. This is, maybe, the simplest way to deal with a filename such as “DJ Champion — N⁰1 – 013 – Grand Prix.flac”.
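The line-by-line reading can be checked in isolation. In this sketch, show_names and the two sample names are made up; the brackets in the output are only there to make the boundaries of each name visible. It uses IFS= and read -r, which stop read from trimming blanks or interpreting backslashes.

```shell
#!/usr/bin/env bash

# show_names: read one name per line from stdin and print it in brackets,
# so that any accidental splitting on spaces would be obvious.
show_names() {
    while IFS= read -r filename
    do
        printf '[%s]\n' "$filename"
    done
}

printf '%s\n' 'DJ Champion - 013 - Grand Prix.flac' 'plain.flac' | show_names
# prints:
# [DJ Champion - 013 - Grand Prix.flac]
# [plain.flac]
```

Each name arrives whole, spaces included. The one case this approach does not cover is a file-name that itself contains a newline, which would be read as two names.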

There’s also a Bash parameter expansion, ${bn%.*}, that strips the file’s basename of its extension.
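A quick way to convince yourself of what the expansion does is to wrap it in a small function. The name to_wav and the sample paths below are hypothetical. The single % removes the shortest suffix matching .*, so only the last extension goes away.

```shell
#!/usr/bin/env bash

# to_wav PATH: print the WAV file-name the script would produce for PATH.
to_wav() {
    local bn
    bn=$(basename -- "$1")          # drop the directory part
    printf '%s\n' "${bn%.*}.wav"    # drop the last extension, append .wav
}

to_wav '/music/DJ Champion/01 Grand Prix.flac'   # prints: 01 Grand Prix.wav
to_wav 'live.2004.flac'                          # prints: live.2004.wav
```

With a double %%, the expansion would remove the longest match instead, turning live.2004.flac into live, which is rarely what you want for an extension.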

You might also have noticed the unusual patterns nb=${2:-20} and pattern=${3:-flac}, which provide default values if the second and third script arguments ($2 and $3) aren’t provided.
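The defaults are easy to test without touching any files, since positional parameters behave the same way inside a function as they do in a script. The function show_args and the /music path are placeholders for this sketch.

```shell
#!/usr/bin/env bash

# show_args LOCATION [NB] [PATTERN]: print the values the script would use.
show_args() {
    local nb=${2:-20}          # 20 unless a second argument is given
    local pattern=${3:-flac}   # flac unless a third argument is given
    printf '%s %s\n' "$nb" "$pattern"
}

show_args /music            # prints: 20 flac
show_args /music 5          # prints: 5 flac
show_args /music "" ogg     # prints: 20 ogg
```

The last call shows a detail of the :- form: it substitutes the default when the argument is empty as well as when it is missing.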
