Choosing the Right Pseudoinverse

On a number of previous occasions, I have used the pseudoinverse of a matrix to solve systems of equations, and to do other things such as channel mixing. However, the demonstration I gave before isn’t entirely correct. Let’s now see why it’s important to distinguish between a left and a right pseudoinverse.


There are basically three possible cases for the matrix A in the equation system

Ax=y:

  • The matrix A is square and non-singular (its determinant is different from zero). In this case, the equation system is solved by

    x=A^{-1}y,

    because the inverse of A exists.

  • The matrix A is “tall” (more rows than columns) and its columns are linearly independent. In this case, we will need the left pseudoinverse. If A is tall (and its columns linearly independent), then the matrix A^TA is square and non-singular, so it can be inverted. Indeed:

    Ax=y,

    A^TAx=A^Ty,

    (A^TA)^{-1}A^TAx=(A^TA)^{-1}A^Ty,

    Ix=(A^TA)^{-1}A^Ty,

    x^*=(A^TA)^{-1}A^Ty,

    where x^* is the best x, the one that minimizes \|Ax-y\|^2.

  • The matrix A is “wide” (more columns than rows) and its rows are linearly independent. This will call for the right pseudoinverse. If A is wide (and its rows linearly independent), then the matrix AA^T is square and non-singular. It can be inverted. Let’s see:

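    Ax=y.

    This system has infinitely many solutions, so we look for one of the form x=A^Tz, which gives

    AA^Tz=y,

    z=(AA^T)^{-1}y,

    x^*=A^Tz=A^T(AA^T)^{-1}y,

    where x^* is the solution of Ax=y that minimizes \|x\|^2.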
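The tall and wide cases can be checked numerically. Here is a minimal sketch in Python with NumPy (my choice for illustration, not the tool used in the post), assuming random Gaussian matrices, whose columns or rows are linearly independent with probability one. The names A_tall, A_wide, left_pinv and right_pinv are made up for the example. Each check should print True.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall case: 5 equations, 3 unknowns.
A_tall = rng.standard_normal((5, 3))
y_tall = rng.standard_normal(5)
left_pinv = np.linalg.inv(A_tall.T @ A_tall) @ A_tall.T    # (A^T A)^{-1} A^T
x_tall = left_pinv @ y_tall                                # least-squares solution
x_lstsq = np.linalg.lstsq(A_tall, y_tall, rcond=None)[0]   # reference solver

# Wide case: 3 equations, 5 unknowns.
A_wide = rng.standard_normal((3, 5))
y_wide = rng.standard_normal(3)
right_pinv = A_wide.T @ np.linalg.inv(A_wide @ A_wide.T)   # A^T (A A^T)^{-1}
x_wide = right_pinv @ y_wide                               # minimum-norm solution

# The left pseudoinverse reproduces the least-squares solution.
print(np.allclose(x_tall, x_lstsq))
# The right pseudoinverse gives an exact solution of the wide system.
print(np.allclose(A_wide @ x_wide, y_wide))
# Both closed forms agree with NumPy's general (SVD-based) pseudoinverse.
print(np.allclose(left_pinv, np.linalg.pinv(A_tall)))
print(np.allclose(right_pinv, np.linalg.pinv(A_wide)))
```

In practice, explicitly inverting A^TA or AA^T is rarely the best numerical choice. np.linalg.lstsq or np.linalg.pinv is usually preferable, and the closed forms are used here only to mirror the derivation.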

By convention, the pseudoinverse of a matrix A is denoted A^+. Here, left and right do not refer to the side of the vector on which we find the pseudoinverse, but to the side of the matrix A on which we find it. As you know, the matrix product is not commutative, that is, in general we have AB\neq{}BA. When the matrix A is square and non-singular, the normal inverse and the right and left pseudoinverses coincide. We have AA^{-1}=A^{-1}A=AA^+=A^+A=I. Otherwise, only one of the two products is the identity: if A is tall, we have A^+A=I, and if A is wide, we have AA^+=I.
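To make the one-sidedness concrete, here is a small sketch, again assuming Python with NumPy and a hand-picked 3×2 matrix chosen purely for illustration. For the tall matrix, only A^+A is the identity. For its transpose, which is wide, only AA^+ is.

```python
import numpy as np

# A tall 3x2 matrix with linearly independent columns (illustrative values).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A_plus = np.linalg.pinv(A)            # 2x3, acts as a left inverse

print(np.allclose(A_plus @ A, np.eye(2)))   # True: A^+ A = I (2x2)
print(np.allclose(A @ A_plus, np.eye(3)))   # False: A A^+ is only a rank-2 projector

# The transpose is wide (2x3) with linearly independent rows.
B = A.T
B_plus = np.linalg.pinv(B)            # 3x2, acts as a right inverse

print(np.allclose(B @ B_plus, np.eye(2)))   # True: B B^+ = I (2x2)
print(np.allclose(B_plus @ B, np.eye(3)))   # False: B^+ B is only a rank-2 projector
```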

*
* *

So what prompted me to revisit the pseudoinverse like this? Well, I was looking into old notes about quadraphonic sound, and realized that since the “compression” matrix (the one that maps four channels onto two) is “wide”, the derivation used for the channel mixing experiment (the otter story) doesn’t quite work. In fact, it doesn’t work at all in this case! I foolishly relied on Mathematica, which found the correct right pseudoinverse, and used that pseudoinverse. The derivation for the left pseudoinverse was correct, but I needed the right pseudoinverse. Nemo est perfectus.
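The failure is easy to reproduce. Here is a sketch assuming Python with NumPy and a hypothetical 2×4 downmix matrix M. Its coefficients are invented for illustration and are not taken from any actual quadraphonic encoding. For a wide matrix, M^TM is 4×4 but only has rank 2, so the left-pseudoinverse formula has nothing to invert, while MM^T is 2×2 and invertible.

```python
import numpy as np

# Hypothetical 4-channel to 2-channel mixing matrix (made-up coefficients).
M = np.array([[1.0, 0.0, 0.7, 0.0],
              [0.0, 1.0, 0.0, 0.7]])

# Left formula: M^T M is 4x4 but rank-deficient, hence singular.
gram_left = M.T @ M
print(np.linalg.matrix_rank(gram_left))      # 2, not 4

# Right formula: M M^T is 2x2 and non-singular.
right_pinv = M.T @ np.linalg.inv(M @ M.T)    # M^T (M M^T)^{-1}

print(np.allclose(M @ right_pinv, np.eye(2)))           # True: M M^+ = I
print(np.allclose(right_pinv, np.linalg.pinv(M)))       # True: matches the general pseudoinverse
```

A general-purpose pseudoinverse routine (np.linalg.pinv here) returns the right pseudoinverse for such a matrix without being asked, which is how a wrong derivation can go unnoticed.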
