Optimizing JPEG for bandwidth

Optimizing web content is always a compromise: on one hand, you want your users to have the best possible experience, but on the other hand, you don’t want to spend more bandwidth than necessary delivering the bits.


This week, let’s have a look at how we can optimize images for perceptual quality while minimizing bandwidth. We could proceed by guesswork, fiddling the parameters until it kind of looks OK, or we can take five minutes to write a script that searches the parameter space for the best solution under a constraint, say, a minimum perceptual quality.

There are many ways of measuring image quality, or degradation, relative to an original image. There is SSIM, the structural similarity index, which is basically a (local) correlation coefficient between the modified image and the original, and there’s the time-honored, but somewhat flawed, PSNR, the peak signal-to-noise ratio, measured in decibels. Let’s use the PSNR for convenience, as SSIM isn’t widely supported yet, and in particular, not supported by ImageMagick.
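Concretely, the PSNR is just the mean squared error between the two images, rescaled logarithmically against the peak pixel value (255 for 8-bit images). A minimal sketch of the formula, assuming the MSE is already known (`compare -metric psnr` computes all of this for you; the `psnr` helper below is hypothetical, for illustration only):

```shell
#!/usr/bin/env bash

# PSNR = 10*log10(peak^2 / MSE), for 8-bit images (peak value 255).
# Hypothetical helper, not part of ImageMagick.
psnr () {
    awk -v mse="$1" 'BEGIN { printf "%.2f\n", 10 * log(255*255 / mse) / log(10) }'
}

psnr 20.0    # a modest error: about 35dB
psnr 65025   # MSE equal to 255^2: 0dB
```

An MSE of 20 (less than 5 gray levels of error per pixel, on average) already lands in the “decent” range we’ll see below.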

So let us state the problem a bit more formally. We want to send an image (as part of a web page, say) using the fewest bits possible, while ensuring at least a given quality. That is, we minimize the number of bits subject to the constraint that the quality mustn’t fall below a certain threshold. Sounds simple enough.
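The constraint in miniature: among candidate encodings sorted by size, pick the smallest one whose quality clears the threshold. With toy numbers (made up for illustration, only loosely echoing the measurements further down):

```shell
#!/usr/bin/env bash

# Candidates as "size-in-KB quality-in-dB" pairs, sorted by size;
# the numbers are made up for illustration.
threshold=35
smallest=$(printf '%s\n' "80 30.1" "137 33.5" "205 35.1" "286 35.5" |
    awk -v t="$threshold" '$2 >= t { print $1; exit }')
echo "${smallest}KB"
```

Since quality grows with file size, the first candidate that clears the threshold is the cheapest one, which is what makes binary search applicable later on.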

The first thing is to understand how image files behave with the chosen image standard, say, JPEG. JPEG is old, but is still the standard file format for the delivery of “photographic” images. JPEG achieves compression by moving to a color space that concentrates the information in the Y (or luma) component, smoothing or subsampling the color components, then further destroying information that “shouldn’t be visible”. So, let’s concentrate on only one parameter, “quality” (which ranges from 0 to 100, even if it’s not a percentage, merely a heuristic value): it controls how much information is destroyed, and therefore the file size.
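The 2x2 chroma subsampling alone accounts for a good chunk of the compression: both color components are kept at a quarter of their resolution. A back-of-the-envelope computation (raw YCbCr bytes only; this ignores JPEG’s transform and entropy coding entirely):

```shell
#!/usr/bin/env bash

# Raw bytes for a width x height image in YCbCr, one byte per sample.
w=1920; h=1080
full=$(( w * h * 3 ))                   # Y, Cb, Cr all at full resolution
sub=$((  w * h + 2 * (w/2) * (h/2) ))   # Y full; Cb, Cr halved in each dimension
echo "full: $full bytes, 2x2 subsampled: $sub bytes"
```

Halving each chroma dimension cuts the raw data in half before any actual coding happens.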

Let’s start with an image:


And set different target file sizes: 10KB, 20KB, etc., and see how it affects quality. This script generates the images with the desired file sizes:

#!/usr/bin/env bash

file_name=$1
file_size=$(( $(stat --format "%s" "$file_name") / 1024 ))  # size in KB

for ((target_size=10; target_size<file_size; target_size+=10))
do
    convert \
        "$file_name" \
        -define jpeg:extent=${target_size}kb \
        -sampling-factor 2x2 \
        "${file_name%.*}-$(printf "%04d" ${target_size}).jpg"
done

ImageMagick proceeds by (rather coarse) bisection to find parameters that will make the file as close as possible to target_size. We will come back to this later: it’s coarse and clearly doesn’t explore the parameter space very thoroughly.

This script gathers quality information:

#!/usr/bin/env bash

file_name=$1
file_size=$(( $(stat --format "%s" "$file_name") / 1024 ))  # size in KB

for ((target_size=10; target_size<file_size; target_size+=10))
do
    dest="${file_name%.*}-$(printf "%04d" ${target_size}).jpg"

    echo -n $(printf "%04d" $target_size) ' '
    echo -n $(compare \
        -metric psnr \
        "$file_name" \
        "$dest" \
        bidon.png 2>&1)   # compare prints the metric on stderr
    echo ' ' $(( $(stat --format "%s" "$dest") / 1024 ))
done

Let us now run both scripts on the target image.

* *

Compiling the qualities into a graph:


The pink rectangle shows the target file sizes where ImageMagick bails out and fails: it doesn’t know how to make files that small, and gives back the original image. Well, that’s inconvenient, but that’s an ImageMagick problem. Let’s ignore those. Let’s see what happens as file size grows.

We don’t see quality shooting up exponentially. Rather, the growth looks logarithmic: after a certain point, letting the file grow larger doesn’t buy you much more quality. As a rule of thumb, image quality is decent for a PSNR above 35dB or so, 40dB is good, and “essentially lossless” is 45dB and above. So if we want decent image quality, we can target a file with a quality of 35dB, or 40dB if we must. Running the first script until the quality is met would be wasteful, as we can certainly proceed by binary search.

#!/usr/bin/env bash

file_name=$1
target_quality=$2

file_size=$(stat --format "%s" "$file_name")

echo "size:"$file_size

target_size=$(( file_size / 2 ))
step=$(( file_size / 4 ))

while [ $step -gt 1024 ]
do
    echo -n "target:"$target_size "step:"$step ' '

    convert \
        "$file_name" \
        -define jpeg:extent=${target_size} \
        -sampling-factor 2x2 \
        temp.jpg

    this_quality=$(compare \
        -metric psnr \
        "$file_name" \
        temp.jpg \
        bidon.png 2>&1)

    this_size=$(stat --format "%s" temp.jpg)
    echo "qual:"$this_quality "size:"$this_size

    # compare returns a float, so the comparison goes through bc
    if [ $(echo "$this_quality < $target_quality" | bc -l) -eq 1 ]
    then
        target_size=$(( target_size + step ))
    else
        target_size=$(( target_size - step ))
    fi
    step=$(( step / 2 ))
done

echo -n $this_size/$file_size=
echo "scale=4; $this_size/$file_size.00" | bc -l

A typical output would look something like:

> bisect.sh IMG_2403.jpg 35
target:285041 step:142520  qual:35.4668 size:286383
target:142521 step:71260  qual:33.5261 size:137555
target:213781 step:35630  qual:34.9561 size:205245
target:249411 step:17815  qual:35.1179 size:220920
target:231596 step:8907  qual:35.1179 size:220920
target:222689 step:4453  qual:35.1179 size:220920
target:218236 step:2226  qual:35.1179 size:220920
target:216010 step:1113  qual:34.9561 size:205245

We mentioned before that ImageMagick didn’t seem to explore the parameters quite exhaustively to meet the target. The above output shows what actually happens: we set a target file size (shown as target: in the above) and we observe an effective file size (shown as size:). We see that for many different target sizes we get just the same file size, the same quality… the same file.
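A plausible explanation (an assumption about ImageMagick’s internals, not something verified in its source): the encoder can only pick an integer quality from 0 to 100, so file size is a step function of that single knob, and many distinct byte targets collapse onto the same quality level, hence the same file. A toy model of the collapse:

```shell
#!/usr/bin/env bash

# Toy model: pretend one quality level is "worth" 5000 bytes, and count
# how many distinct outputs a range of byte targets actually produces.
distinct=$(seq 200000 1000 230000 |
    awk '{ print int($1 / 5000) }' |   # made-up mapping: target -> quality level
    sort -u | wc -l)
echo "31 targets, $distinct distinct outputs"
```

Thirty-one targets, a handful of files: exactly the plateaus visible in the bisection trace above.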

* *

Let’s have a look at what it looks like. Here’s the 30dB image:


We see artifacts: blocks, weird colors. 30dB doesn’t seem to be enough for this image. What about a bit more, say, 35dB?


The image went from 80KB to 205KB, but the improvement is clear: it is much better looking. What if we ask for 40dB?


The file size jumped to almost 500KB, without much visual improvement. Worse, ImageMagick didn’t quite manage to produce a 40dB file, falling just shy of 37dB. That’s not much smaller than the 570KB original.

* *

Despite ImageMagick’s evident limitations, we can still tweak the image file size so that it, more or less, matches our target visual quality. For some images, the target will be higher, say 40dB, but often, it seems that good savings are achieved. On another image, originally 600KB, the 380KB version is visually indistinguishable from the original. It seems like a small saving, but cutting more or less 40% of the bandwidth may still translate into real savings (as providers have you pay for every bit of it).
