Matrix decompositions

Some decompositions

The algorithms for the QR and LU decompositions are fairly self-explanatory from their definition.
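
For concreteness, here is a minimal sketch of both in Python/NumPy -- textbook versions (classical Gram–Schmidt for QR, Gaussian elimination without pivoting for LU), not what a production library would actually do:

```python
import numpy as np

def qr_gram_schmidt(A):
    """QR via classical Gram-Schmidt: orthonormalize the columns of A.
    Assumes the columns are linearly independent."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float)
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # component along earlier basis vector
            v -= R[i, j] * Q[:, i]        # remove it
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def lu_no_pivot(A):
    """LU via Gaussian elimination, no pivoting (assumes nonzero pivots)."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]   # multiplier used to clear U[i, k]
            U[i, :] -= L[i, k] * U[k, :]
    return L, U
```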

The Cholesky decomposition can be computed for a nonnegative-definite matrix through a simultaneous LU decomposition from both sides. Starting with a matrix in the block form:

\[
A = \begin{bmatrix} a_1 & v^T \\ v & A_2 \end{bmatrix}
\]
We perform an LU decomposition first on the columns, from the left, then on the rows, from the right:

\[
\begin{bmatrix} a_1 & v^T \\ v & A_2 \end{bmatrix}
= \begin{bmatrix} \sqrt{a_1} & 0 \\ v/\sqrt{a_1} & I \end{bmatrix}
\begin{bmatrix} \sqrt{a_1} & v^T/\sqrt{a_1} \\ 0 & A_2 - vv^T/a_1 \end{bmatrix}
= \begin{bmatrix} \sqrt{a_1} & 0 \\ v/\sqrt{a_1} & I \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & A_2 - vv^T/a_1 \end{bmatrix}
\begin{bmatrix} \sqrt{a_1} & v^T/\sqrt{a_1} \\ 0 & I \end{bmatrix}
\]
The process can then be continued on the Schur complement $A_2 - vv^T/a_1$ to give an LU decomposition in which L and U are transposes of each other -- that is, the Cholesky decomposition.

(Why does the matrix have to be nonnegative-definite?)
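
Here is a rough sketch of the resulting algorithm in Python/NumPy, peeling off one row and column at a time and iterating on the Schur complement $A_2 - vv^T/a_1$; it assumes strictly positive pivots so the divisions and square roots go through:

```python
import numpy as np

def cholesky_block(A):
    """Cholesky via the block step above: at each step, the first column of L
    is (sqrt(a1), v/sqrt(a1)), and we recurse on the Schur complement.
    Returns lower-triangular L with A = L @ L.T; assumes A is symmetric
    positive-definite (a nonpositive pivot means it is not)."""
    A = A.astype(float).copy()
    n = A.shape[0]
    L = np.zeros((n, n))
    for k in range(n):
        a1 = A[k, k]
        if a1 <= 0:
            raise ValueError("matrix is not positive-definite")
        v = A[k+1:, k]
        L[k, k] = np.sqrt(a1)
        L[k+1:, k] = v / np.sqrt(a1)
        # replace the trailing block with the Schur complement A2 - v v^T / a1
        A[k+1:, k+1:] -= np.outer(v, v) / a1
    return L
```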

Eigenvalue algorithm

A lot of factorizations are fundamentally related to the "eigen"-stuff: this includes the change-of-basis factorizations -- diagonalization, SVD, Schur -- and also the polar decomposition. Eigenstuff is fundamentally analytic (concretely: eigenvalues are roots of the characteristic polynomial, and there is no general closed form for the roots of a polynomial of degree five or more), so terminating algorithms won't work; we need an asymptotic, iterative algorithm.

A simple such algorithm to calculate eigenvalues and eigenvectors is power iteration. It is related to the flows of systems of linear differential equations, in which the eigenspace with the largest eigenvalue provides the only stable equilibrium of the system. Unless you choose an initial vector $v$ that happens to have no component in the principal eigenspace of $A$, the direction of $A^n v$ approaches that of the projection of $v$ onto the principal eigenspace.
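
A sketch of power iteration in Python/NumPy (the iteration count and the Rayleigh-quotient readout of the eigenvalue are just simple choices, not anything canonical):

```python
import numpy as np

def power_iteration(A, num_iters=1000):
    """Repeatedly apply A and renormalize; the direction converges to the
    principal eigenvector (largest |eigenvalue|), assuming the start vector
    has a nonzero component along it."""
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(num_iters):
        v = A @ v
        v /= np.linalg.norm(v)
    eigenvalue = v @ A @ v   # Rayleigh quotient of the converged direction
    return eigenvalue, v
```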

So that gives you the largest eigenvalue. There are many ways to get the remaining eigenvalues -- one way I thought of was to subtract off the part of $A$'s action corresponding to the stretching of the primary eigenvector, but this is incredibly inefficient and numerically unstable.

An actually sensible approach is as follows: consider the matrix $(A - \mu I)^{-1}$, which has the same eigenspaces as $A$. Its largest eigenvalue (in magnitude) corresponds to the eigenvalue of $A$ closest to $\mu$, so power iteration on it recovers that eigenpair. By varying $\mu$ across the real line, one can discover all the (real) eigenvalues of $A$.
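
A sketch of this ("shifted inverse iteration"), again in NumPy; a real implementation would factorize $A - \mu I$ once and reuse the factorization, rather than calling a dense solver every step:

```python
import numpy as np

def inverse_iteration(A, mu, num_iters=100):
    """Power iteration on (A - mu I)^{-1}: converges to the eigenvector
    whose eigenvalue is closest to the shift mu."""
    n = A.shape[0]
    v = np.random.default_rng(0).standard_normal(n)
    v /= np.linalg.norm(v)
    B = A - mu * np.eye(n)
    for _ in range(num_iters):
        v = np.linalg.solve(B, v)   # apply (A - mu I)^{-1} without forming it
        v /= np.linalg.norm(v)
    eigenvalue = v @ A @ v          # Rayleigh quotient w.r.t. the original A
    return eigenvalue, v
```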

Schur algorithm

You can use the eigenvalue algorithm to calculate the Schur decomposition directly via its definition -- however, there is a more efficient method.

A triangular matrix is invariant under conjugation by the $Q$ of its own QR decomposition (because that $Q$ is the identity matrix). So, similarly to power iteration, one may iteratively QR-factorize $A$ and replace it with $RQ = Q^T A Q$, which is similar to $A$. This is called the QR algorithm.
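
A sketch of the unshifted QR algorithm; the accumulated product of the $Q$'s is the change of basis, and the iterate itself tends towards the (quasi-)triangular factor:

```python
import numpy as np

def qr_algorithm(A, num_iters=500):
    """Unshifted QR algorithm: A_{k+1} = R_k Q_k where A_k = Q_k R_k.
    Each step is a similarity transform (R Q = Q^T (Q R) Q), so the
    eigenvalues are preserved; A_k tends towards (quasi-)triangular form."""
    A = A.astype(float).copy()
    Q_total = np.eye(A.shape[0])
    for _ in range(num_iters):
        Q, R = np.linalg.qr(A)
        A = R @ Q
        Q_total = Q_total @ Q   # accumulate the change of basis
    return Q_total, A           # A approximates the triangular Schur factor
```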

Apparently this is expensive in its crude form, so you instead first bring the matrix into "upper Hessenberg form" (upper-triangular plus one subdiagonal), which makes each QR step much cheaper. The reduction to upper Hessenberg form is based on Householder reflections: you reflect the part of each column below the subdiagonal, so that the corresponding reflections on the rows (required because the Schur decomposition is a change of basis, so every operation must be a similarity transformation) do not touch the column of interest and wreck all the zeros you have already created. It's not very interesting, but a sketch follows.
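
For what it's worth, a sketch of the Hessenberg reduction in NumPy; the reflectors act only on rows and columns $k+1$ onwards, which is exactly why they leave the freshly created zeros in column $k$ alone:

```python
import numpy as np

def householder_hessenberg(A):
    """Reduce A to upper Hessenberg form H = Q^T A Q using Householder
    reflections on the part of each column below the subdiagonal."""
    A = A.astype(float).copy()
    n = A.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = A[k+1:, k]
        e = np.zeros_like(x)
        e[0] = np.linalg.norm(x) * (-1 if x[0] < 0 else 1)  # sign avoids cancellation
        u = x + e
        norm_u = np.linalg.norm(u)
        if norm_u == 0:
            continue                 # column already has the required zeros
        u /= norm_u
        # reflect rows k+1.. (from the left) and columns k+1.. (from the right)
        A[k+1:, :] -= 2 * np.outer(u, u @ A[k+1:, :])
        A[:, k+1:] -= 2 * np.outer(A[:, k+1:] @ u, u)
        Q[:, k+1:] -= 2 * np.outer(Q[:, k+1:] @ u, u)   # accumulate the basis change
    return Q, A   # A is now upper Hessenberg, equal to Q^T @ (original A) @ Q
```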

Least-squares

Suppose you wanted to solve $Ax \approx b$, i.e. find the $x$ that minimizes $\lVert Ax - b\rVert$. This is relevant when $A$ is not surjective.

The idea is that $\lVert Ax - b\rVert$ is minimized when $Ax$ is the projection $b_A$ of $b$ onto the image of $A$. One can equivalently write $A^T(Ax - b) = 0$, as $Ax - b$ is then perpendicular to the column space of $A$. Thus it suffices to solve $A^T A x = A^T b$.

If $A$ has full column rank (this does not mean surjective), we actually get a unique $x$, given by $x = (A^T A)^{-1} A^T b$, and $A^+ = (A^T A)^{-1} A^T$ is called the "Moore-Penrose pseudoinverse" of $A$.

An alternative, more efficient algorithm is to use the QR factorization (the thin one, with the non-square $Q$) to construct the projector onto the image as $QQ^T$ (which works, as the columns of $Q$ definitionally provide an orthonormal basis for the image, and $QQ^T$ satisfies the defining property of a Hermitian projector). Then one can solve $QRx = QQ^T b$, or equivalently $Rx = Q^T b$. In the full-rank case, $R^{-1}Q^T$ is the Moore-Penrose pseudoinverse.
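
Both routes in code (a sketch; NumPy's np.linalg.qr returns the thin/reduced factorization by default, which is the non-square-$Q$ version meant above, and the tiny overdetermined system at the end is just an illustration):

```python
import numpy as np

def least_squares_normal(A, b):
    """Solve the normal equations A^T A x = A^T b (assumes full column rank)."""
    return np.linalg.solve(A.T @ A, A.T @ b)

def least_squares_qr(A, b):
    """Thin QR: A = Q R with orthonormal columns in Q, then solve R x = Q^T b."""
    Q, R = np.linalg.qr(A)          # 'reduced' QR by default
    return np.linalg.solve(R, Q.T @ b)

# Example: an overdetermined 3x2 system; both methods give the same minimizer.
A = np.array([[1., 1.], [1., 2.], [1., 3.]])
b = np.array([1., 2., 2.])
print(least_squares_normal(A, b), least_squares_qr(A, b))
```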

To add:
  • SVD algorithm
  • Algorithm complexity for each thing
