Null, row spaces, transpose, fundamental theorem of linear algebra

This is an exciting article, so pay close attention.

In the last article, we introduced the column space, and we are led to wonder about the vectors that are mapped to the zero vector (let's call the set of these vectors the "null space"). Based on some of the transformations we've seen, we might wonder if the null space is essentially the subspace perpendicular to the column space, i.e. the set of all vectors perpendicular to every vector in the column space (we call this the "orthogonal complement" of the column space).

However, this is demonstrably wrong. Suppose one rotates the co-ordinate system before collapsing it onto a lower dimension. The "collapsing" matrix, the singular matrix, may be $\left[ {\begin{array}{*{20}{c}}1&0\\0&0\end{array}} \right]$, and the rotation matrix may be $\left[ {\begin{array}{*{20}{c}}0&{ - 1}\\1&0\end{array}} \right]$. Then the composite transformation is $\left[ {\begin{array}{*{20}{c}}1&0\\0&0\end{array}} \right]\left[ {\begin{array}{*{20}{c}}0&{ - 1}\\1&0\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}0&{ - 1}\\0&0\end{array}} \right]$. This matrix sends $(x, y)$ to $(-y, 0)$, so its column space is the $x$-axis, but its null space (the vectors with $y = 0$) is also the $x$-axis, which is certainly not perpendicular to the column space.

In general in $\mathbb{R}^2$, if the "simple collapse" transformation looks like this:

$$\left[ {\begin{array}{*{20}{c}}{{{\cos }^2}\phi }&{\cos \phi \sin \phi }\\{\cos \phi \sin \phi }&{{{\sin }^2}\phi }\end{array}} \right]$$
(Show that the non-rotating collapses can be written in this way), and the rotation matrix looks like this:

$$\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta }\\{\sin \theta }&{\cos \theta }\end{array}} \right]$$
Then their product

$$\left[ {\begin{array}{*{20}{c}}{{{\cos }^2}\phi }&{\cos \phi \sin \phi }\\{\cos \phi \sin \phi }&{{{\sin }^2}\phi }\end{array}} \right]\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta }\\{\sin \theta }&{\cos \theta }\end{array}} \right] \\= \left[ {\begin{array}{*{20}{c}}{\cos \phi \cos (\phi  - \theta )}&{\cos \phi \sin (\phi  - \theta )}\\{\sin \phi \cos (\phi  - \theta )}&{\sin \phi \sin (\phi  - \theta )}\end{array}} \right]$$
Which is also a singular matrix with the same rank and column space as the original, but its null space is a rotated version of the original, and therefore no longer perpendicular to the column space.
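
The claim above can be checked numerically. The following is a minimal sketch in plain Python, assuming the rotation is the counter-clockwise one, $\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta }\\{\sin \theta }&{\cos \theta }\end{array}} \right]$; the helper names (`matmul`, `collapse`, `rotation`) and the test angles are made up for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def collapse(phi):
    """The "simple collapse": projection onto the line at angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c * c, c * s], [c * s, s * s]]

def rotation(theta):
    """Counter-clockwise rotation by theta (an assumption of this sketch)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

phi, theta = 0.7, 0.4  # arbitrary test angles
M = matmul(collapse(phi), rotation(theta))

# Closed form of the product, as stated in the text.
d = phi - theta
expected = [[math.cos(phi) * math.cos(d), math.cos(phi) * math.sin(d)],
            [math.sin(phi) * math.cos(d), math.sin(phi) * math.sin(d)]]
formula_ok = all(math.isclose(M[i][j], expected[i][j], abs_tol=1e-12)
                 for i in range(2) for j in range(2))

# A vector perpendicular to the direction (cos d, sin d) is sent to zero.
null_vec = [-math.sin(d), math.cos(d)]
image = [sum(M[i][k] * null_vec[k] for k in range(2)) for i in range(2)]

# The column space is still the line at angle phi; the null vector is
# not perpendicular to it (their dot product works out to sin(theta)).
col_dir = [math.cos(phi), math.sin(phi)]
overlap = sum(n * c for n, c in zip(null_vec, col_dir))

print(formula_ok, image, overlap)
```

The dot product between the null direction and the column direction only vanishes when $\sin\theta = 0$, i.e. when there is effectively no rotation.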

We make the following observations:
  1. The rotation has introduced an "asymmetry" in the components of the matrix -- the matrix is no longer symmetric across the principal diagonal, the rows and columns are no longer the same. In our original terminology before we introduced matrices, the "contribution of y to x" is no longer the same as "the contribution of x to y".
  2. The span of the column vectors is not dependent on $\theta$ (proving again that the column space is not the space orthogonal to the null space), but the span of the row vectors is -- we'll call this the row space. The row space used to be the same as the column space, but is now rotated clockwise by an angle of $\theta$. Since the space is rotated counter-clockwise by $\theta$ before being collapsed, this means the null space is $\theta$ clockwise to what its position would have been without the rotation. 
Hence, the null space is perpendicular to the row space, at least if the pre-collapse transformation is a rotation. It would be a very educated guess to say that this applies generally, that the null space of a transformation is always perpendicular to the row space.

Indeed, this is the case, and is pretty easy to show:

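$$\left[ {\begin{array}{*{20}{c}}{{r_1}}\\ \vdots \\{{r_n}}\end{array}} \right]\vec x = 0\,\,\, \Rightarrow \,\,\,{r_i} \cdot \vec x = 0\,\,\,\,\,\,\forall i \in [1,n] \cap \mathbb{Z}$$
Each entry of the product on the left is the dot product of a row $r_i$ with $\vec x$, so $\vec x$ is perpendicular to every row, and hence to every linear combination of the rows, i.e. to the entire row space.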
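
As a small numerical illustration of this argument, here is a sketch in plain Python for a $2 \times 3$ matrix. The example matrix is arbitrary, and using the cross product of the two rows to obtain a null-space vector is a shortcut that only works in three dimensions; it is an assumption of this sketch, not a general method.

```python
def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product in R^3: a vector perpendicular to both u and v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

rows = [[1, 2, 3],
        [4, 5, 6]]          # the rows r_1, r_2 of an example matrix

x = cross(rows[0], rows[1])  # a vector perpendicular to both rows
Ax = [dot(r, x) for r in rows]  # the matrix applied to x, entry by entry

# Any linear combination of the rows is also perpendicular to x.
combo = [2 * a - 7 * b for a, b in zip(rows[0], rows[1])]

print(x, Ax, dot(combo, x))  # [-3, 6, -3] [0, 0] 0
```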
More interestingly, however, we made a pretty important observation about the symmetry of a matrix: asymmetry in the matrix seems to be a measure of how "rotation-ish" a matrix is. The reason this makes sense is that while the values on the principal diagonal talk about how much each component is scaled in the resulting image, the off-diagonal elements talk about how much contribution there is to one component of a vector from the component in another direction, something that happens during rotations.

If the contribution from x to y is exactly the same as the contribution from y to x, then the xy-plane isn't really being rotated, but the basis vectors are rather being pulled closer to/apart from each other. On the other hand, when they are different, the "orientation" of the axes changes, and an actual rotation is induced.

The most "rotation-ish" effect is created when the values of the matrix are the negative of their reflection across the principal diagonal, because it means the basis vectors are being rotated by the same angle in the same direction.

Quick exercise: plot how such a transformation looks in $\mathbb{R}^2$. You'll notice that the rotation isn't as "perfect" as you might've hoped -- the vectors change length, etc., and what is the same for all the basis vectors is their displacement perpendicular to their original directions, not the angles they turn through.

These changes in length are a result of scaling, i.e. the values along the principal diagonal.

Well, not exactly, because the values along the principal diagonal scale the part of each basis vector in its original direction, not the overall length of the basis vector. This is why, once the principal diagonal is set to zero, (i) the resulting matrix not only has the scaling eliminated, but also has its rotations at right angles, (ii) the resulting matrix is still not just a bunch of pure rotations, as its rotations are scaled, and (iii) some otherwise-antisymmetric matrices with a non-zero principal diagonal, like $\left[ {\begin{array}{*{20}{c}}{\cos \theta }&{ - \sin \theta }\\{\sin \theta }&{\cos \theta }\end{array}} \right]$, can still be rotations by some angle other than a right angle. Highly recommended reading: my answer to "Intuition behind the Speciality of Symmetric Matrices".

Therefore, it is useful to extract from the matrix the purely anti-symmetric part, with a zero principal diagonal:

$$\left[ {\begin{array}{*{20}{c}}0&{{b_{12}}}& \ldots &{{b_{1n}}}\\{ - {b_{12}}}&0& \ldots &{{b_{2n}}}\\ \vdots & \vdots & \ddots & \vdots \\{ - {b_{1n}}}&{ - {b_{2n}}}& \ldots &0\end{array}} \right]$$
This matrix is called a skew-symmetric matrix or an anti-symmetric matrix, and in $\mathbb{R}^n$ is essentially a combination of scaled rotations around different axes.
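
One way to see the "right-angle" nature of such a matrix is the following short derivation. The notation is introduced here for illustration: $B$ is the matrix above, $B_{ij}$ is its entry in row $i$ and column $j$ (so $B_{ji} = -B_{ij}$ and $B_{ii} = 0$), and $\vec x$ is an arbitrary vector.

$$\vec x \cdot B\vec x = \sum\limits_{i,j} {{B_{ij}}{x_i}{x_j}} = \sum\limits_i {{B_{ii}}x_i^2} + \sum\limits_{i < j} {({B_{ij}} + {B_{ji}}){x_i}{x_j}} = 0$$
So every vector is displaced perpendicular to itself, which is what one would expect of a (possibly scaled) quarter-turn.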

Meanwhile, one may define a symmetric matrix in the following way:

$$\left[ {\begin{array}{*{20}{c}}b_{11}&{{b_{12}}}& \ldots &{{b_{1n}}}\\{{b_{12}}}&b_{22}& \ldots &{{b_{2n}}}\\ \vdots & \vdots & \ddots & \vdots \\{{b_{1n}}}&{{b_{2n}}}& \ldots &b_{nn}\end{array}} \right]$$
These matrices essentially scale and skew vectors -- the principal diagonal components do the scaling in each direction, and the other components do the skewing, i.e. tilting the basis vectors towards each other.

One may wonder if it would have been better to simply deal with diagonal matrices instead of symmetric ones. However, as we will see, it turns out that any matrix can be written as the sum of a symmetric matrix and an anti-symmetric matrix. It is also worth noting that we will eventually see that a symmetric matrix is a generalisation of a scaling matrix, where the scaling is done in some arbitrary directions that need not be the basis vectors. These vectors are called "eigenvectors".

Looking at the definitions of symmetric and antisymmetric matrices, it is clear that any matrix may be written as the sum of a symmetric matrix and an antisymmetric one. Specifically, one may write:

$$A = \underbrace {\frac{1}{2}(A + {A^T})}_{\scriptstyle{\rm{symmetric }}\atop\scriptstyle{\rm{part}}} + \underbrace {\frac{1}{2}(A - {A^T})}_{\scriptstyle{\rm{antisymmetric }}\atop\scriptstyle{\rm{part}}}$$
Where $A^T$ is the "transpose" of $A$, referring to the matrix formed by flipping $A$'s entries across the principal diagonal. E.g. the row space of $A$ is the column space of $A^T$.
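
The decomposition can be sketched in a few lines of plain Python. The function names and the example matrix below are made up for illustration; matrices are represented as lists of rows.

```python
def transpose(a):
    """Flip a matrix (list of rows) across its principal diagonal."""
    return [list(col) for col in zip(*a)]

def symmetric_part(a):
    """(A + A^T) / 2"""
    at = transpose(a)
    n = len(a)
    return [[(a[i][j] + at[i][j]) / 2 for j in range(n)] for i in range(n)]

def antisymmetric_part(a):
    """(A - A^T) / 2"""
    at = transpose(a)
    n = len(a)
    return [[(a[i][j] - at[i][j]) / 2 for j in range(n)] for i in range(n)]

A = [[1, 2],
     [4, 3]]

S = symmetric_part(A)       # [[1.0, 3.0], [3.0, 3.0]]
K = antisymmetric_part(A)   # [[0.0, -1.0], [1.0, 0.0]]

# Adding the two parts back together recovers A.
recombined = [[S[i][j] + K[i][j] for j in range(2)] for i in range(2)]

print(S, K, recombined)
```

In this example the antisymmetric part happens to be exactly the quarter-turn matrix, while the symmetric part carries all of the scaling and skewing.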

Does this all remind you of something?
  • Any entity can be written as the sum of two "parts" of a specific nature, which look curiously like the exponential form of the cosine and sine functions
  • These two parts represent scaling and scaled $\pi/2$ rotation respectively.
It should remind you of the Cartesian form of a complex number.

A symmetric matrix is "like" a real number, and an anti-symmetric matrix is "like" an imaginary number. A complex number can be thought of both as an object to be transformed (like a vector) and as a transformation itself (like a matrix).
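
One standard way to make this analogy concrete, sketched here with notation that is not in the original article ($I$ for the identity matrix and $J$ for the quarter-turn matrix): multiplying by the complex number $a + bi$ acts on the plane exactly as the matrix

$$\left[ {\begin{array}{*{20}{c}}a&{ - b}\\b&a\end{array}} \right] = a\underbrace {\left[ {\begin{array}{*{20}{c}}1&0\\0&1\end{array}} \right]}_I + b\underbrace {\left[ {\begin{array}{*{20}{c}}0&{ - 1}\\1&0\end{array}} \right]}_J$$
Here the symmetric part $aI$ is a pure scaling, the antisymmetric part $bJ$ is a scaled right-angle rotation, and $J^2 = -I$ mirrors $i^2 = -1$.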


However, it is important to highlight the differences: (i) while a real number only scales things in their own direction (like a multiple of $I$), a symmetric matrix can skew things, which is equivalent to scaling across a different set of axes. Therefore while complex numbers can encode spiral transformations, matrices encode all linear transformations. (ii) complex numbers transform on the complex plane, which is a two dimensional plane like $\mathbb{R}^2$, while linear transformations can operate in any number of dimensions, and not necessarily even just in $\mathbb{R}^n$. An example of a consequence of this would be that an anti-symmetric matrix can encode a series of right-angle rotations with corresponding scalings in dimensions greater than three, but an imaginary number can only correspond to a rotation around an axis pointing out of the plane.

It is fair to say that linear algebra generalises complex numbers with matrices. For some examples of correspondence between specific complex numbers, see Introduction to symmetry, section "symmetry on the complex plane". We will formalise this whole idea of matrices being "like real numbers" and "like imaginary numbers" when we do eigenvalues and eigenbases -- in fact, real antisymmetric matrices have purely imaginary (or zero) eigenvalues.

The complex conjugate, similarly, is generalised to the transpose. Explain how. (Hint: $A^T$ does precisely the opposite of $A$'s rotation-ish, i.e. antisymmetric, actions while preserving its symmetric actions. What does this tell you about matrices whose transpose and inverse are equal? What does it tell you about matrices that are equal to their transpose?)


Watch the above Khan Academy video.
Come up with an intuitive explanation for why the row space solution is the shortest.


The following three results:

  1. Row space is perpendicular to the null space
  2. Row rank equals column rank
  3. Rank plus nullity equals dimension (Rank-nullity theorem)
Together imply what is called the fundamental theorem of linear algebra -- that the row space is the orthogonal complement of the null space. Why? To be the orthogonal complement, you need to be orthogonal (implied by 1), and your dimensions need to add up to the total dimension of the space. The latter is implied by 3, but our intuition applies perhaps better to the dimension of the column space, and 2 implies that this has the same dimension as the row space, and we're done.

Among these three results, this article provides fairly solid intuition for the first and the third. The second is really assumed throughout this article, but if you want really solid intuition for it, read on.

Consider an $m$ by $n$ matrix $A$ with linearly independent rows, i.e. its row rank is $m$ (we can always bring a matrix to this form, followed by a bunch of zero rows at the end, via row operations). Then the nullity of $A$, i.e. the dimension of the solution space of $Ax=0$, is $n-m$, as there are $m$ independent equations in $n$ variables. Thus we have that row rank plus nullity equals the dimension. But we know from our "collapsing intuition" that column rank plus nullity equals the dimension. Hence row rank equals column rank, and we're done.
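
The bookkeeping in this argument can be tried out on a concrete matrix. The sketch below, in plain Python with exact fractions, counts independent rows by Gaussian elimination; the function names and the example matrix are made up for illustration. Counting the independent rows of the transpose is the same as counting the independent columns of the original.

```python
from fractions import Fraction

def row_rank(matrix):
    """Count independent rows via Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a row at or below position r with a non-zero entry here.
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[1, 2, 3, 4],
     [2, 4, 6, 8],   # twice the first row, so it adds nothing to the rank
     [1, 0, 1, 0]]

rank_rows = row_rank(A)             # independent rows of A
rank_cols = row_rank(transpose(A))  # independent columns of A
nullity = len(A[0]) - rank_rows     # free variables: n minus the pivot count

print(rank_rows, rank_cols, nullity)  # 2 2 2
```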

An obvious consequence of rank-nullity is that, for a linear map from a finite-dimensional space to itself (i.e. a square matrix), injectivity is equivalent to surjectivity. However, this is not true for infinite-dimensional spaces. Why not? You might find it's related to Hilbert's hotel.



It's worth noting that while we've pretended that having the column space equal to the row space is equivalent to being symmetric, this is not really true. There's a broader set of matrices for which the column and row spaces are equal, called "range-symmetric" or EP matrices.

This article is extended in SVD, polar decomposition, normal matrices; a re-look at transposes and FTLA.
