Let’s say we want to mix three channels onto two because the communication device has only two available channels, but we still want to emulate a three-channel link. If we can afford coding, it’s not a problem: we can build our own protocol and add any number of channels using a structured data stream. But what if we cannot control the channel coding at all? On CDs, for example, there’s no such coding: there are two channels encoded in PCM, and a standard CD player wouldn’t understand the sound if it were encoded otherwise.
The solution is to mix the three channels in a quasi-reversible way, and in such a way that the two mixed channels can be listened to without much interference. One possible way to mix in the third channel is to use a phase-dependent encoding; early “quadraphonic” audio systems did something quite similar. You can also use a plain time-domain “mixing matrix” to mix the three channels onto two. Quite arbitrarily, let us choose the $2 \times 3$ matrix $M$:
and we compute the mixing of the three original channels $x_1$, $x_2$, $x_3$ onto the mixed channels $y_1$ and $y_2$ as:

$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = M \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$
Which is a standard matrix-vector multiplication, so no big deal. At the other end, you want to decode $y$ into the best approximation of $x$ you can, but lo! the mixing matrix isn’t square, it’s singular, so you cannot use the straight matrix inverse to recover $x$. So let $M$ be our matrix. In general, if $M$ is non-invertible, we can use the normal equations to solve for the best-effort inverse of $M$. Consider that $a$ and $b$ are vectors (unrelated to our channels):

$Ma = b$

$M^T M a = M^T b$

$a = (M^T M)^{-1} M^T b$
and the $(M^T M)^{-1} M^T$ part is the Moore-Penrose pseudo-inverse of the matrix $M$, denoted $M^+$. However, this method only works if $(M^T M)^{-1}$ exists. Unfortunately for us, $M^T M$ is still singular, thus non-invertible: it is a $3 \times 3$ matrix of rank at most 2.
Ach! Großes malheur! Well, no, not that much. If $M M^T$ is non-singular, therefore invertible, we can use a variant of the pseudo-inverse that’s the “right” pseudo-inverse of $M$, so that instead of $M^+ M = I$, we’ll have $M M^+ = I$. Indeed, let the right pseudo-inverse be:

$M^+ = M^T (M M^T)^{-1}$
We can show rather quickly that it is indeed a right inverse of $M$:

$M M^+ = M M^T (M M^T)^{-1} = I$
And we’re done. ■
So, computing the right pseudo-inverse $M^+$ of our mixing matrix, we can recover a good approximation of the original vector:

$\tilde{x} = M^+ y \approx x$
Let us see what such a channel mixing might look like. I have written a small C++ program that reads a picture, produces a two-channel image (mapped onto the red-green plane for display purposes), and recovers the best-effort inverse $\tilde{x}$, where the tilde denotes approximation. So, using the above equations: replacing $x$ by each pixel’s RGB value, getting $y = Mx$, then, using the pseudo-inverse, $\tilde{x} = M^+ y$.
Let me demonstrate with this otter picture (sorry, don’t know the source):
Using the mixing matrix $M$, we get a two-channel image:
Which is, of course, rendered using the red-green plane for display purposes. That plane isn’t really a color plane; it’s a projection of the RGB space onto an arbitrary plane. Using the pseudo-inverse $M^+$, we recover a good approximation of the original image:
Which, of course, isn’t quite the original because, despite $M^+$ being the best-effort inverse of $M$, the original transform is a surjective mapping of a 3D space onto a 2D plane: some information is irremediably lost.
So why are pseudo-inverses so interesting? In general, they’re useful as soon as you have to invert a singular matrix to solve a problem. A square matrix can be singular by having a determinant of zero, resulting from insufficient rank, for example. It can also be that the matrix is rectangular. Rectangular matrices have exact inverses in some rare cases, but in general we must rely on a best-effort inverse. Rectangular matrices also appear in the context of least-squares regression.
In the case of simple linear regression, we have direct formulae to find the slope and intercept of the line passing through a cloud of points that minimizes the squared error. If we want to fit a more complex function, the algebraic method of computing partial derivatives and isolating the sought variables one by one becomes increasingly painful with each added variable. Using the matrix notation of the problem, together with the pseudo-inverse, provides a “one size fits all” solution to all regression problems where the fitted function is a linear combination of functions of the input values.
You can choose between the left and the right pseudo-inverse in order to minimize computation time. Indeed, if $M$ is $m \times n$, then $M^T M$ is $n \times n$ while $M M^T$ is $m \times m$. As the matrix inverse is a rather expensive computation, on the order of $O(n^3)$ for an $n \times n$ matrix, it may be quite a bit faster to invert the smaller of the two pseudo-inverse kernels!
Ah, also, before you ask: I’m sure you were wondering about some of the operations used in the demonstrations. The only thing one must really know is that matrix multiplication is associative but not commutative. That is, $(AB)C = A(BC)$, but in general $AB \neq BA$. Also, you need the following rules:

$(AB)^T = B^T A^T$

$(AB)^{-1} = B^{-1} A^{-1}$ (when both inverses exist)

$(A^T)^{-1} = (A^{-1})^T$