If you’re doing image processing, you’ve probably had to transform your image from one color space to another. In video coding, for example, the RGB image is transformed into YPrPb or YCrCb so that most of the visually relevant information is packed into the Y component which is essentially brightness. Subsampling the chroma bands (Cr and Cb) provides additional means for compression with little perceptual quality loss. While the human eye is very good at detecting brightness variation, it’s not very good at detecting subtle changes in chroma, either saturation or hue.
The thing is that very often, there are color space transformation matrices found in text book but they’re not, due to rounding (and other possible errors), always exactly inverses of each other. This week, I will discuss how we can use projections onto convex sets (POCS) to make sure that reduced precision matrices are exactly (within a given precision) reversible.
So, you open a textbook on image processing and you read about color spaces. Some—but not all—color spaces are linear transform from and to RGB, the color space that represent colors using three components, one of Red, one of Green, and one of Blue. A linear transform is expressible as a matrix-vector product, where the matrix is the color transform itself and the vector the RGB (or the other color space) pixel. So you have
with being the color transform, your pixel in RGB color space, and the resulting color in the other color space. Naturally,
transforms the pixel from the other color space back into RGB color space.
When you find a YCrCb transformation—let us use it as an example as it is most common—in a textbook, and you input it in Mathematica, you’ll find that the exact inverse doesn’t match the inverse given in the textbook, because the inverse in the textbook is only a rounded-down version of the inverse. Indeed:
So if we just truncate or round the precision on the inverse, we find that we loose precision, and the product , which means that the color transform is not reversible. But it turns out that we can compute two matrices and that are inverses of each other, making the color transform quite reversible (minus machine-precision).
One possible method is to use POCS, for projections onto convex sets. While POCS is quite abstract, we can have a look at the animated image opening the post. There are two sets, represented by the circles A and B, and a point laying in A that is the original transform matrix. The gray dashed line between A and B determines the minimal distance between the two sets, where a point in A and a point in B will be the closest; that is, the two matrices that are the closest to being each other’s inverse. We seek those two points. We start with the point in A and project (convexly) it onto B, resulting in a point in B. We take that new point and project it (still convexly) onto A, resulting in a point in A. We repeat the procedure until we reach the dashed gray line.
In our case, the two sets are sets of 3×3 matrices, the original “point” the RGB to YCrCb matrix, and the convex projection is the operation of inverting a matrix and truncating the precision. In our case, we will truncate the precision to a desired accuracy, say, three digits after the decimal point, which is the accuracy of the original matrix. The combination of inversion and truncation forms a convex projection and so we can use POCS. Applying POCS, we repeat inversion+truncation until both points in A and B converge (or we reach a maximum number of iterations).
So Let us look at a Mathematica implementation of the POCS algorithm:
It takes as parameters one of the matrices, a target precision and a maximum number of iterations. The algorithm operates as we just described, repeating inversion and truncating precision (that’s what Round[...,p] does, it rounds to precision p) until both matrices are exact inverses of each other.
POCS converges extremely fast, even more so as the projections are “more convex” or that the spaces are close together. Let us consider examples. Let say we want to find matrix pairs that are inverses of each other but with one matrix as close as possible to the original RGB to YCrCb transform.
With , the algorithm terminates in three iterations:
With , the algorithm also terminates in three iterations:
With , the algorithm loops only once:
In the finest case, the algorithm finds that one of the two transforms remains the same, but “shuffles” precision in the second so that they are very close (up to the wanted precision) of being inverses of one another.
All right, granted, some textbook do give transform pairs to 6 decimals (Wikipedia does, at any rate), but that’s not really the most important point here. POCS is a powerful algorithm that can be applied to a wide variety of optimization problems. In our particular example, we had to optimize two transforms (minimize the difference of their product to the identity) under the constraint of reduced precision in each of the matrices’ coefficients. The matrix inverse and precision truncation form a convex projection (or, more exactly, forms convex sets under that projection) and the algorithm can operate.
We can think of other situations were this algorithm could be applied; limited precision inverse pairs is only one class of such problems. Let us give another example, this time from data compression. Suppose you’re compressing sound and want to optimize the transform quantized coefficients so that the reconstructed piece of sound obeys certain characteristics, whichever they may be. So POCS becomes this: you first transform the sound to frequency domain and quantize the coefficients. You perform inverse transform to obtain an approximation of the original sound wave. You adjust the reconstructed sound wave to have your wanted characteristics (whichever they may be) and compute the transform to frequency domain and quantize again. After a few iterations, the coefficients will converge and you will have the best quantized frequency domain coefficients given the transform you’re using and the sound characteristics you’re imposing on the sound wave.