Choosing the Right Pseudoinverse

On a number of previous occasions, I have used the pseudoinverse of a matrix to solve systems of equations, and to do other things such as channel mixing. However, the demonstration I gave before isn’t entirely correct. Let’s see now why it’s important to distinguish between a left and a right pseudoinverse.

There are basically three possible cases for the matrix $A$ in the equation system

$Ax=y$ :

The matrix $A$ is square and non-singular (its determinant is different from zero). In this case, the equation system is solved by
$x=A^{-1}y$ ,

because the inverse of $A$ exists.
The matrix $A$ is “tall” (more rows than columns) and its columns are linearly independent. In this case, we will need the left pseudoinverse. If $A$ is tall (and its columns linearly independent), then the matrix $A^TA$ is square and non-singular, so it can be inverted. Indeed:
$Ax=y$ ,

$A^TAx=A^Ty$ ,

$(A^TA)^{-1}A^TAx=(A^TA)^{-1}A^Ty$ ,

$Ix=(A^TA)^{-1}A^Ty$ ,

$x^*=(A^TA)^{-1}A^Ty$ ,

where $x^*$ is the best $x$ , the one that minimizes $\|Ax-y\|^2$ .
The matrix $A$ is “wide” (more columns than rows) and its rows are linearly independent. This will call for the right pseudoinverse. If $A$ is wide (and its rows linearly independent), then the matrix $AA^T$ is square and non-singular. It can be inverted. Let’s see:
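$AA^T(AA^T)^{-1}=I$ ,

$AA^T(AA^T)^{-1}y=y$ ,

$A\left(A^T(AA^T)^{-1}y\right)=y$ ,

so $Ax=y$ is satisfied by

$x^*=A^T(AA^T)^{-1}y$ ,

where $x^*$ is the best $x$ in a different sense: among the infinitely many solutions of $Ax=y$ , it is the one with the smallest norm $\|x\|$ .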
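To make the three cases concrete, here is a minimal NumPy sketch. The matrices and vectors are small made-up examples, and the helper names `left_pinv` and `right_pinv` are mine, not part of any library. It only illustrates the formulas above under the stated full-rank assumptions.

```python
import numpy as np

def left_pinv(A):
    """(A^T A)^{-1} A^T, assuming A is tall with linearly independent columns."""
    return np.linalg.inv(A.T @ A) @ A.T

def right_pinv(A):
    """A^T (A A^T)^{-1}, assuming A is wide with linearly independent rows."""
    return A.T @ np.linalg.inv(A @ A.T)

# Case 1: square and non-singular, the ordinary inverse does the job.
A_square = np.array([[2.0, 1.0],
                     [1.0, 3.0]])
y_square = np.array([3.0, 5.0])
x_square = np.linalg.solve(A_square, y_square)

# Case 2: tall (3 rows, 2 columns). There is no exact solution in general,
# so the left pseudoinverse gives the least-squares one.
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
y_tall = np.array([1.0, 2.0, 4.0])
x_tall = left_pinv(A_tall) @ y_tall
x_lstsq = np.linalg.lstsq(A_tall, y_tall, rcond=None)[0]  # reference answer

# Case 3: wide (2 rows, 3 columns). There are infinitely many exact
# solutions, and the right pseudoinverse picks the one of smallest norm.
A_wide = A_tall.T
y_wide = np.array([1.0, 2.0])
x_wide = right_pinv(A_wide) @ y_wide
x_other = np.array([1.0, 2.0, 0.0])  # another exact solution, but longer

print("square:", x_square)
print("tall:  ", x_tall, "vs lstsq:", x_lstsq)
print("wide:  ", x_wide, "norm", np.linalg.norm(x_wide),
      "vs other norm", np.linalg.norm(x_other))
```

In practice, `np.linalg.pinv` (which goes through the singular value decomposition) should agree with both formulas whenever the full-rank assumption holds, and it is usually a safer choice numerically than forming $A^TA$ or $AA^T$ explicitly.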

The convention is to denote the pseudoinverse of a matrix $A$ as $A^+$ . Here, left and right do not refer to the side of the vector on which the pseudoinverse appears, but to the side of the matrix $A$ on which it acts as an inverse. As you know, the matrix product is not commutative, that is, in general we have $AB\neq{}BA$ . When the matrix $A$ is square and non-singular, the normal inverse and the right and left pseudoinverses coincide. We have $AA^{-1}=A^{-1}A=AA^+=A^+A=I$ . Otherwise, depending on whether $A$ is tall or wide, we have either $A^+A=I$ (left pseudoinverse) or $AA^+=I$ (right pseudoinverse), but not both.
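The one-sidedness is easy to check numerically. The sketch below uses a made-up tall matrix and its transpose, with `np.linalg.pinv` computing $A^+$ in both cases. It is a sanity check, not a proof.

```python
import numpy as np

# A made-up tall matrix with linearly independent columns.
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
P_tall = np.linalg.pinv(A_tall)         # 2x3, acts as a left inverse

left_ok_tall = np.allclose(P_tall @ A_tall, np.eye(2))    # A^+ A = I
right_ok_tall = np.allclose(A_tall @ P_tall, np.eye(3))   # A A^+ = I ?

# Its transpose is wide with linearly independent rows.
A_wide = A_tall.T
P_wide = np.linalg.pinv(A_wide)         # 3x2, acts as a right inverse

left_ok_wide = np.allclose(P_wide @ A_wide, np.eye(3))    # A^+ A = I ?
right_ok_wide = np.allclose(A_wide @ P_wide, np.eye(2))   # A A^+ = I

print("tall: A+A = I?", left_ok_tall, " AA+ = I?", right_ok_tall)
print("wide: A+A = I?", left_ok_wide, " AA+ = I?", right_ok_wide)
```

The product on the "wrong" side is still square, but it only has rank 2 here, so it cannot be the $3\times 3$ identity.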

*
* *

So what prompted me to revisit the pseudoinverse like this? Well, I was looking into old notes about quadraphonic sound, and realized that since the “compression” matrix (the one that maps four channels onto two) is “wide”, the derivation used for the channel mixing experiment (the otter story) doesn’t quite work. In fact, it doesn’t work at all in this case, because $A^TA$ is singular when $A$ is wide! I foolishly relied on Mathematica, which found the correct right pseudoinverse, and used that pseudoinverse. The derivation for the left pseudoinverse was correct, but I needed the right pseudoinverse. Nemo est perfectus.
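To see the failure with a four-to-two mix, here is a small sketch. The matrix `M` is purely hypothetical: it has the right shape for a "compression" matrix, but its coefficients are invented for the example and are not the ones from my notes.

```python
import numpy as np

# Hypothetical 2x4 "compression" matrix: four input channels, two outputs.
# The coefficients are invented for illustration only.
M = np.array([[1.0, 0.0, 0.5, 0.0],
              [0.0, 1.0, 0.0, 0.5]])

# The left-pseudoinverse formula needs (M^T M)^{-1}, but M^T M is 4x4
# with rank at most 2, so it is singular and cannot be inverted.
rank_MtM = np.linalg.matrix_rank(M.T @ M)

# The right-pseudoinverse formula needs (M M^T)^{-1}, and M M^T is 2x2
# with full rank, so it works.
rank_MMt = np.linalg.matrix_rank(M @ M.T)
M_plus = M.T @ np.linalg.inv(M @ M.T)   # 4x2 right pseudoinverse

print("rank of M^T M:", rank_MtM, "out of 4")
print("rank of M M^T:", rank_MMt, "out of 2")
print("M M+ = I?", np.allclose(M @ M_plus, np.eye(2)))
print("M+ M = I?", np.allclose(M_plus @ M, np.eye(4)))
```

The last line comes out false, as it must: going from four channels to two loses information, so the "decompression" matrix $M^+$ can only give back the smallest-norm set of four channels compatible with the two that were kept.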

This entry was posted on Tuesday, January 17th, 2017 at 14:24 pm and is filed under algorithms, Mathematics.

Harder, Better, Faster, Stronger