Orthogonality makes a cameo in inverse matrices
Say $A$ and $B$ are $n \times n$ square matrices and $AB = I$. In other words, $A$ is $B^{-1}$ and $B$ is $A^{-1}$.
Here’s a question for you, from one of Boyd’s linear dynamical systems lectures: what’s the inner product of the $i$th row of $A$, $\tilde{a}_i$, and the $j$th column of $B$, $b_j$? And what does the value of this inner product say about $A$ and $B$?
Here’s the answer: $\tilde{a}_i \cdot b_j$ is $I_{i,j}$, so it’s $0$ when $i \neq j$, and $1$ when $i = j$. In other words, the $i$th row of $A$ is orthogonal to the $j$th column of $B$ unless $i = j$. As Boyd notes, this fact follows naturally from the inner product interpretation of the matrix multiplication $AB = I$. Yet it seems surprising. Can we get some intuition for why orthogonality shows up here?
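If you want to see the fact numerically, here's a minimal NumPy sketch (the random $4 \times 4$ matrix and the tolerance check are just illustrative choices): each row of $A$ dotted with each column of $B = A^{-1}$ gives $1$ on the diagonal and $0$ off it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # a random matrix is almost surely invertible
B = np.linalg.inv(A)              # so AB = I

# Inner product of the i-th row of A with the j-th column of B.
for i in range(n):
    for j in range(n):
        ip = A[i, :] @ B[:, j]
        # Should be ~1 when i == j and ~0 otherwise.
        assert np.isclose(ip, 1.0 if i == j else 0.0)

print(np.round(A @ B, 10))  # the identity matrix, up to floating-point error
```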
Turns out, yeah, absolutely. This orthogonality stuff is wonderful intuition for how inverse matrices retrieve coordinates.
First, note that $A$ is invertible, so its columns form a basis of $\mathbb{R}^n$.1 So, we can express any $n$-dimensional vector $v$ as a linear combination of $A$’s columns:
\[v = c_1 a_1 + \cdots + c_n a_n,\]where $a_i$ are the basis vectors and $c_i$ are the coordinates of $v$ in this basis.
Equivalently, we start with a column of coordinates $\begin{bmatrix} c_1 & \cdots & c_n \end{bmatrix}^t$ and apply $A$ to it to get $v$. We’d like $A^{-1}v$ to give us back $\begin{bmatrix} c_1 & \cdots & c_n \end{bmatrix}^t$.
This is exactly what $A^{-1}$ does: it “picks out” the coordinates $c_i$. To find $c_i$ (the $i$th coordinate of $v$ in the basis formed by the columns of $A$), take the dot product of the $i$th row of $A^{-1}$ with $v$. Why does this work? Because the dot products of the $i$th row of $A^{-1}$ with $a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n$ are all zero (by the opening fact of this post, applied to $A^{-1}A = I$!). Only the dot product with $a_i$ “survives” to give $1$, which multiplies $a_i$’s coefficient to recover $c_i$. Repeat for every row of $A^{-1}$, and you get back exactly the column you started with: $\begin{bmatrix} c_1 & \cdots & c_n \end{bmatrix}^t$.
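To spell out the cancellation, write $r_i$ for the $i$th row of $A^{-1}$ (a label used just for this line). Then
\[r_i \cdot v = r_i \cdot (c_1 a_1 + \cdots + c_n a_n) = \sum_{j=1}^{n} c_j \, (r_i \cdot a_j) = \sum_{j=1}^{n} c_j \, I_{i,j} = c_i.\]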
The orthogonality condition between the rows of $A^{-1}$ and the columns of $A$ ensures that each component $c_i$ gets mapped back exactly where it came from, with no “cross-talk” between different components.
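And here's the retrieval story as a minimal NumPy sketch (again with a made-up random basis and made-up coordinates): build $v$ from coordinates $c$ in the basis given by $A$'s columns, then check that each row of $A^{-1}$ picks out exactly the matching coordinate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))   # columns = basis vectors a_1, ..., a_n
A_inv = np.linalg.inv(A)

c = rng.standard_normal(n)        # coordinates c_1, ..., c_n
v = A @ c                         # v = c_1 a_1 + ... + c_n a_n

# Row i of A^{-1} dotted with v recovers c_i -- no cross-talk.
for i in range(n):
    assert np.isclose(A_inv[i, :] @ v, c[i])

# Doing all rows at once is just A^{-1} v, which returns the whole coordinate column.
assert np.allclose(A_inv @ v, c)
```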
1. Why? Because for $A$ to have an inverse transformation (one that exactly undoes its effect on a vector), $A$ must be both one-to-one, i.e., different inputs map to different outputs, and onto, i.e., every vector in the output space can be reached. Both are necessary for an inverse to exist: if $A$ is not one-to-one, we can’t tell which input produced a given output; if $A$ is not onto, some outputs can’t be reached by any input. Now, why does the fact that the columns of $A$ form a basis ensure that $A$ is both one-to-one and onto? Because if the columns are spanning, $A$ is onto: any output can be reached by some linear combination of the columns. If the columns are linearly independent, each output is reached by at most one linear combination of the columns, so $A$ is one-to-one. If the columns are both spanning and independent, we have ourselves a basis! ↩