The Winding Number: lie bracket


Lie group homomorphisms

Because a Lie group is fundamentally a group that is also a manifold, we'd like to define a Lie group homomorphism as a map that is both a group homomorphism and smooth. For this, though, we need to define what it means to differentiate a group homomorphism.

Recall that the general notion of a derivative is the idea of "how does the map work locally"? Letting a general function $f:G\to H$ map a curve $\gamma(t)$, it should be easy to see that $\gamma'(t)$ transforms as $(f\circ\gamma)'(t)$ (make sure that this makes sense -- think in terms of the chain rule, or write it out in limit form, or just in terms of the image of the curve).

Consequently this leads to the differential $df:dG\to dH$ (where $dG$ is the Lie Algebra of $G$) defined as $df(\gamma'(0))=(f\circ\gamma)'(0)$. Some short exercises:

Confirm that this is equivalent to saying that $df(X)$ is the directional derivative of $f$ in the $X$ direction.
Differentiate $f(xyx^{-1}y^{-1})$ with respect to $x$ in the $X$ direction at $x=1$ (hint: this is a direct application of the definition of the differential in reverse).
Convince yourself that any derivative operator commutes with $df$, i.e. $D(df(X))=df(D(X))$.

It should be intuitively clear that if $f$ is a homomorphism, its local effect should be to act as a homomorphism of the Lie algebra as it should preserve all local structure. We can easily show that:

Since $df$ is a derivative of $f$, its value must be a linear map (like the Jacobian). This applies to the derivative as an operator on the tangent space of any manifold -- $f$ doesn't need to be a group homomorphism at all.
It preserves the Lie bracket. Take $f(xyx^{-1}y^{-1})=f(x)f(y)f(x)^{-1}f(y)^{-1}$ and differentiate it once with respect to $x$ in the $X$ direction at $x=1$, obtaining: $df(X-yXy^{-1})=df(X)-f(y)df(X)f(y)^{-1}$, simplify and differentiate it with respect to $y$ in the $Y$ direction at $y=1$ to get: $df([Y,X])=[df(Y),df(X)]$.

The adjoint map

The Lie Bracket $[Y,X]$ is not the derivative of the conjugation map $x\mapsto gxg^{-1}$, so you don't have to worry -- the Lie Bracket is not a Lie algebra homomorphism (it doesn't preserve Lie Brackets); the derivative of conjugation at the identity, $X\mapsto gXg^{-1}$, is. That's unfortunate -- our explanation of the Jacobi identity ("a derivation acts through the Lie Bracket as a derivation on the space of derivations where multiplication is given by the Lie Bracket") really indicated that it has something to do with it.

The Lie Bracket is the derivative of conjugation $xgx^{-1}$. OK, so?

Here's the idea: $\mathrm{Ad}(x)(y)=xyx^{-1}$ defines a homomorphism $\mathrm{Ad}:G\to\mathrm{Aut}(G)$. Its differential $\mathrm{ad}:dG\to d\mathrm{Aut}(G)$ can be confirmed to be the Lie Bracket $\mathrm{ad}(X)(Y)=[X,Y]$. So preservation of the Lie Bracket means:

$$\mathrm{ad}([X,Y])=[\mathrm{ad}(X),\mathrm{ad}(Y)]$$
This is precisely the Jacobi identity! So the Lie bracket is a Lie algebra homomorphism, from a Lie algebra to the Lie algebra of half-filled Lie brackets.
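As a quick sanity check, here is a minimal sketch that tests $\mathrm{ad}([X,Y])=[\mathrm{ad}(X),\mathrm{ad}(Y)]$ on one example, assuming a matrix Lie algebra where $[X,Y]=XY-YX$. The helper names (`matmul`, `sub`, `bracket`, `ad`) and the three integer matrices are illustrative choices, not anything from the text.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sub(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return sub(matmul(A, B), matmul(B, A))

def ad(X):
    """ad(X) as the 'half-filled bracket' Y -> [X, Y]."""
    return lambda Y: bracket(X, Y)

# Arbitrary integer matrices, so all arithmetic below is exact.
X = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
Y = [[0, 1, 1], [2, 0, -1], [1, 3, 0]]
Z = [[2, 0, 1], [1, 1, 0], [0, -2, 5]]

lhs = ad(bracket(X, Y))(Z)                    # ad([X,Y]) applied to Z
rhs = sub(ad(X)(ad(Y)(Z)), ad(Y)(ad(X)(Z)))   # [ad X, ad Y] applied to Z
print(lhs == rhs)
```

One example is of course not a proof, but writing out `lhs` and `rhs` by hand makes it clear that the equality is the Jacobi identity with the terms rearranged.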

There is indeed a relationship between this "homomorphism" understanding of the Jacobi identity and the "derivation" understanding. In general, given a curve $\phi:\mathbb{R}\to\mathrm{Aut}(G)$, differentiating $\phi(t)(gh)=\phi(t)(g)\phi(t)(h)$ at $t=0$ we see that its derivative $d\phi$ satisfies the product rule, i.e. is a derivation (in fact this is true even when $G$ is not a group -- often a Lie group arises this way, as the automorphism group of some object and these derivations then form its Lie algebra). This implies

$$d\mathrm{Aut}(G)\subseteq\mathrm{Der}(dG)$$
So $[X,\cdot]$ is a derivation, and the map from $X$ to $[X,\cdot]$ is a Lie algebra homomorphism $dG\to\mathrm{Der}(dG)$. This really does give us a much more general way to look at everything we talked about in the last article.

Wait -- shouldn't it be an equality? I thought all derivations were part of the Lie Algebra? Ah, but there the derivations on $M$ formed the Lie Algebra of $\mathrm{Aut}(M)$, i.e. $d\mathrm{Aut}(M)=\mathrm{Der}(M)$. So indeed $d\mathrm{Aut}(dG)=\mathrm{Der}(dG)$. This makes sense, indeed $\mathrm{Aut}(G)\subseteq \mathrm{Aut}(dG)$. It's interesting to think about when it is that the Lie algebra has "more" automorphisms than the Lie group.

One may wonder if all automorphisms of a group are a conjugation by something -- or, at the level of the Lie algebra, if all derivations are of the form $[X,\cdot]$ for some $X$. We will later see a special classification of Lie group for which this is true -- in general, the conjugation automorphisms are called the inner automorphisms of the group and are denoted as $\mathrm{Inn}(G)$. The invertible linear transformations $dG\to dG$ of a Lie algebra, meanwhile, are the invertible elements of $\mathrm{End}(dG)$, and it's easy to see that all of them are Lie algebra automorphisms iff the Lie algebra is Abelian.

Exercise: Show that the map $\mathrm{Ad}:G\to \mathrm{Aut}(G)$ is injective iff $G$ has a trivial center.

So if $G$ has trivial center and all its automorphisms are inner, it is isomorphic to $\mathrm{Aut}(G)$ and is called complete.

The determinant map

The determinant is a homomorphism $\det:GL_F(n)\to F^\times$ (the nonzero elements of $F$ under multiplication), and it restricts to a homomorphism on any matrix group. The first thing we'd like to do with this is find its differential $\det'$ (which will be an $F$-valued function on $M_F(n)$). By definition of the differential:

$$\det' A = \lim_{\varepsilon\to 0}\frac{\det (I+\varepsilon A)-1}{\varepsilon}$$
It's easy to prove by writing out the entries of the matrix as $\delta_{ij}+\lambda_{ij}\varepsilon$ and performing induction on the dimension of the matrix that this is equivalent to:

$$\det'A=\mathrm{tr} A$$
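A minimal numerical sketch of this result, assuming real $3\times3$ matrices: it estimates the limit above by a finite difference and compares it with the trace. The helper names `det3` and `det_derivative_at_identity`, the step size, and the sample matrix are all illustrative assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def det_derivative_at_identity(A, eps=1e-6):
    """Finite-difference estimate of (det(I + eps*A) - 1) / eps for a 3x3 matrix A."""
    shifted = [[(1.0 if r == c else 0.0) + eps * A[r][c] for c in range(3)]
               for r in range(3)]
    return (det3(shifted) - 1.0) / eps

# Arbitrary sample matrix.
A = [[2.0, -1.0, 0.5],
     [3.0, 0.25, 4.0],
     [-2.0, 1.0, -1.5]]

trace_A = sum(A[k][k] for k in range(3))
estimate = det_derivative_at_identity(A)
print(trace_A, estimate)   # the two numbers should agree up to O(eps)
```

The off-diagonal entries only enter $\det(I+\varepsilon A)$ at order $\varepsilon^2$ and higher, which is why they drop out of the derivative.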

Lie algebra homomorphisms in detail: ideals

Well, Lie algebra homomorphisms are a specific kind of vector space homomorphism, aren't they? It's not enough that they preserve the linear structure, they must preserve the Lie bracket too. Well, let's study them in more detail -- like a crash course through linear algebra, but with Lie algebra instead.

What does the kernel of a Lie algebra homomorphism $A$ look like? Well, because the homomorphism preserves linear combinations, the kernel must be a linear subspace -- similarly because the homomorphism preserves the Lie bracket, $A[v,w]=[Av,Aw]$, we must have that $Av=0\implies \forall w\in\mathfrak{g}, A[v,w]=0$, i.e. the kernel $\mathfrak{i}$ must be closed under derivations from $\mathfrak{g}$: $[\mathfrak{g},\mathfrak{i}]\subseteq\mathfrak{i}$. Such a subalgebra is called an ideal.

Exercise: Show that the Lie algebra of a normal subgroup is an ideal (careful -- it's not as obvious as you might think -- but still pretty obvious).

Derivations and the Jacobi Identity

Let's consider a new way to think of the Lie algebra of a group -- instead of just considering the tangent vector to be at the identity, we could smear it across the group to form a vector field, resolving questions of whether our tangent space "really needs to be" at the identity (the exponential map in matrix representation only exists in the traditional form if we're talking about tangent vectors at the identity, but we're free to write down the Lie algebra in this way).

But not every vector field is a valid element of the Lie algebra. We need the vector field to be "constant" across the manifold in some sense so that that constant vector it equals is the tangent-space-at-the-identity element it corresponds to. But what exactly do we mean by "constant" on a Lie Group?

In the case of the unit circle in the complex plane, we have an idea of what we want -- the vector field $T(M)$ is constant over the group if it is determined by the value at the identity as $T(M)=MT(1)$.

Is this preserved in the matrix representation of the group? Well, yes, because the correspondence between complex numbers and spiral matrices is a homomorphism. We can use this as a motivation to define the condition for a vector field to be an element of the Lie algebra on a matrix Lie group -- it needs to be a left-invariant vector field, i.e. we need the value of the vector field to be determined as $T(M)=MT(I)$.

Why left-invariant? Why not right-invariant? Why matrix multiplication at all? The choices made here are certainly arbitrary to some extent. When we study abstract Lie algebra, we'll just have "left-multiplication by $M$" being replaced by a group action and the usage of matrix multiplication is a choice of representation. In the context of abstract Lie algebra, the "left-multiplication by $M$" we're interested in is really the derivative of the map $M:G\to G$, $g\mapsto Mg$ (a smooth map of the group to itself, though not a group homomorphism), which is a linear map between the tangent spaces at $I$ and $M$. You can show that this map is represented by matrix left-multiplication given a matrix representation (i.e. letting the group be $GL(n,\mathbb{C})$).

Ok, why did we just do that? Why did we upgrade our tangent vectors to vector fields? If it wasn't obvious already, the noncommutativity of a Lie group is "the" feature of importance in a Lie group, at least in some neighbourhood of the identity (we will later find out exactly the kind of features that aren't determined by just the Lie bracket -- the important keywords here are connected and compact) -- if the Lie group is commutative, then the Lie algebra is just a vector space with no additional structure, and the Lie group is a "basically unique" choice.

In our discussions of noncommutativity in the last article, we repeatedly referred to flowing along a vector -- the nature of noncommutativity is inherently "dynamical" in this sense. So we need to talk about differentiating along the corresponding vector field to a tangent vector.

So let's upgrade our vector fields to derivative operators, or derivations $D$. These are operators on functions $f:G\to \mathbb{R}$ that tell you the derivative of $f$ in the direction of the vector field -- the left-invariant ones are a certain generalisation of the directional derivative operators.

Well, what exactly is a derivation? On Euclidean space, directional derivatives can be imagined as stuff of the form $f\mapsto\vec{v}\cdot\nabla f$ -- but this requires the concept of a dot product which is quite weird within the context of matrix groups. But if you try to work this out on the unit circle (do it!), you might get an idea: we can define a curve $\gamma:\mathbb{R}\to G$ passing through a point and consider:

$$f\mapsto(f\circ \gamma)'(t)$$
at the point, and you get precisely the directional derivative in the direction $\gamma'(t)$ (show that this is right in Euclidean space, and make sure you understand why it is right/makes sense -- it's the chain rule, and a certain analogy exists to projecting matrices onto subspaces in linear algebra). And if we just want tangent vectors at the identity, we can just consider the operation $f\mapsto(f\circ \gamma)'(0)$.

OK. Let's try to "abstract out" the properties of a derivation $D$, i.e. something that just allows us to define what a derivation is, abstractly, that is equivalent to being an operator of the above form.

What makes an operator a directional derivative? Certainly it must be a linear operator -- but not every linear operator is a directional derivative. The key idea behind a directional derivative is that $D(f(x))$ is determined in a specific way by $D(x)$, the rate at which $x$ changes in the specified direction.

How do we use this? Well, if you think about it a little bit, we can restrict $f$ to be analytic -- so we need:

$D(x)$ predicts $D(x^n)$ in the right way -- this is ensured by the product rule -- $D(fg)=f\ Dg + g\ Df$.
$D(x^n)$ for all $n$ predicts $D(a_0+a_1x+a_2x^2+\ldots)$ in the right way -- this is ensured by linearity.
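The two requirements above can be sketched in code, restricting further from analytic functions to polynomials (stored as coefficient lists `[a0, a1, ...]`). The function names and the particular choice of $D(x)$ are hypothetical; the point is only that the product rule plus linearity reproduce $D(f)=f'(x)\,D(x)$.

```python
from itertools import zip_longest

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def D_monomial(n, Dx):
    """D(x^n) from D(x), using only the product rule on x^n = x * x^(n-1)."""
    if n == 0:
        return [0]                       # D(1) = 0, since D(1*1) = 2*D(1)
    x = [0, 1]
    x_prev = [0] * (n - 1) + [1]         # coefficients of x^(n-1)
    return poly_add(poly_mul(x, D_monomial(n - 1, Dx)), poly_mul(x_prev, Dx))

def D(p, Dx):
    """Extend from monomials to polynomials by linearity."""
    total = [0]
    for n, a in enumerate(p):
        total = poly_add(total, [a * c for c in D_monomial(n, Dx)])
    return total

def derivative(p):
    """Ordinary derivative of a polynomial."""
    return [n * a for n, a in enumerate(p)][1:] or [0]

Dx = [1, 0, 2]          # hypothetical choice: D(x) = 1 + 2x^2
f = [5, -3, 0, 4, 1]    # f(x) = 5 - 3x + 4x^3 + x^4
print(trim(D(f, Dx)) == trim(poly_mul(Dx, derivative(f))))
```

Nothing in `D_monomial` or `D` knows about ordinary differentiation; the factor $n x^{n-1}$ emerges from applying the product rule $n$ times.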

If anyone can motivate the definition of a derivation without restricting to analytic functions, tell me.

An operator that satisfies these two properties is called a derivation -- one can prove additional properties from these axioms fairly easily, e.g. $D(c)=0$ for constant $c$, etc.

Let's think about why this whole construction above makes sense.

Let $G$ be the group of translations of $\mathbb{R}$ -- one can parameterise them by the translated distance as $\Delta(p)$ with composition given by $\Delta(p)\Delta(q)=\Delta(p+q)$. Well, this is isomorphic to the additive group on the reals, and in turn to the multiplicative positive real numbers. We can consider the group to be acting on real analytic functions by translations of the domain: $\Delta(p)f(x):=f(x+p)$. The Lie algebra is just spanned by the derivative of $\Delta(p)$ at the identity, that is:

$$\Delta '(0) = \lim\limits_{h \to 0} \frac{{\Delta (h) - 1}}{h} = \frac{d}{{dx}}$$
And our Lie algebra members are all real multiples of $d/dx$ -- these are precisely the directional derivatives on $\mathbb{R}$. Similar constructions can be made on $\mathbb{R}^n$, or a general automorphism group.

So we see that the derivations in this construction of the Lie algebra actually are the tangent vectors on the Lie group, identified as the automorphism group of some object. If you've ever done some differential geometry, this gives you the motivation for treating partial derivatives as basis vectors.

Our discussion of derivations so far works both for derivations (general vector fields on the manifold) and point-derivations (basically tangent vectors at a specific point). Under the first interpretation, though, we're not actually interested in all derivations, only the left-invariant ones. For example, in the example above, an operation of the form $p(x)\frac{d}{dx}$ is linear and satisfies the product rule:

$$p\frac{d(f\cdot g)}{dx}=g\cdot p\frac{df}{dx}+f\cdot p\frac{dg}{dx}$$
And why shouldn't it? It corresponds to a vector field all right -- $p(x)e_x$. But unless $p$ is constant, this is not a left-invariant vector field.

Exercise: Interpret the Taylor series as the exponential map from the Lie algebra to the Lie group! Make the "similar construction" in the multivariate case ($\mathbb{R}^n$) and interpret the multivariate Taylor series as an exponential map -- i.e. that $\Delta=\exp\nabla$.
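For the single-variable case, here is a small sketch of the statement $\Delta(p)=\exp\left(p\,\frac{d}{dx}\right)$, assuming we act on polynomials so that the exponential series terminates. The name `exp_of_derivative` and the sample polynomial are illustrative; exact rational arithmetic is used so the comparison is an equality rather than an approximation.

```python
from fractions import Fraction
from math import factorial

def derivative(p):
    """Derivative of a polynomial stored as a coefficient list [a0, a1, ...]."""
    return [n * a for n, a in enumerate(p)][1:] or [Fraction(0)]

def evaluate(p, x):
    """Value of the polynomial p at x."""
    return sum(a * x**n for n, a in enumerate(p))

def exp_of_derivative(p, shift):
    """Apply exp(shift * d/dx) = sum_k shift^k / k! * d^k/dx^k to the polynomial p.

    The sum stops after len(p) terms because higher derivatives of p vanish.
    """
    result = [Fraction(0)] * len(p)
    term = list(p)                       # holds the k-th derivative of p
    for k in range(len(p)):
        coeff = Fraction(shift) ** k / factorial(k)
        for n, a in enumerate(term):
            result[n] += coeff * a
        term = derivative(term)
    return result

f = [Fraction(c) for c in (2, -1, 0, 3)]   # f(x) = 2 - x + 3x^3
p = Fraction(3, 2)                         # translation distance
shifted = exp_of_derivative(f, p)          # coefficients of (exp(p d/dx) f)(x)

x0 = Fraction(5, 7)
print(evaluate(shifted, x0) == evaluate(f, x0 + p))
```

Reading the sum in `exp_of_derivative` term by term gives exactly the Taylor expansion of $f(x+p)$ about $x$.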

The first thing that we can do with our formalism of point-derivations is give another proof of closure under the Lie Bracket:

$$[D_1,D_2](fg)=f[D_1,D_2]g+g[D_1,D_2]f$$

I.e. that the Lie Bracket of two derivations is also a derivation. Check that the above is correct by expanding stuff out and using the product rule for $D_1$ and $D_2$.
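The expansion can also be checked mechanically. The sketch below assumes derivations of the form $p(x)\frac{d}{dx}$ acting on polynomials (coefficient lists), with hypothetical helper names; it confirms on one example that the bracket $D_1D_2-D_2D_1$ satisfies the product rule even though $D_1D_2$ alone does not.

```python
from itertools import zip_longest

def add(p, q):
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def sub(p, q):
    return [a - b for a, b in zip_longest(p, q, fillvalue=0)]

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def deriv(p):
    return [n * a for n, a in enumerate(p)][1:] or [0]

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def vector_field(p):
    """The derivation f -> p * f'."""
    return lambda f: mul(p, deriv(f))

def bracket(D1, D2):
    """The commutator [D1, D2] = D1 D2 - D2 D1 as an operator on polynomials."""
    return lambda f: sub(D1(D2(f)), D2(D1(f)))

D1 = vector_field([0, 1])       # x d/dx
D2 = vector_field([1, 0, 1])    # (1 + x^2) d/dx
B = bracket(D1, D2)

f = [1, 2, 0, -1]               # 1 + 2x - x^3
g = [3, 0, 4]                   # 3 + 4x^2

leibniz_for_bracket = trim(B(mul(f, g))) == trim(add(mul(f, B(g)), mul(g, B(f))))
second_order = lambda h: D1(D2(h))
leibniz_for_product = (trim(second_order(mul(f, g)))
                       == trim(add(mul(f, second_order(g)), mul(g, second_order(f)))))
print(leibniz_for_bracket, leibniz_for_product)
```

The cross terms $D_1f\,D_2g+D_1g\,D_2f$ that spoil the product rule for $D_1D_2$ are symmetric in $D_1$ and $D_2$, so they cancel in the commutator.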

There's another way that derivations can be used to show closure under the Lie Bracket, which shows more closely the connection to the product rule for the second derivative discussed in the previous article.

One might wonder if, like the directional derivative at the identity in the $c'(0)$ direction is given by $(f\circ c)'(0)$, the directional derivative at the identity in the $c''(0)$ direction may be given as $(f\circ c)''(0)$. Well, in general:

$$(f\circ c)''(t)=c''(t)\cdot\nabla f(t)+c'(t)\frac{d}{dt}\nabla f(t)$$
If $c'(0)=0$, then at $t=0$ this is simply equal to the first term, the directional derivative in the $c''(0)$ direction. So we just need to show that $f\mapsto (f\circ c)''(0)$ is a derivation. This follows from the Leibniz rule for the second derivative, and the fact that the first derivative of $c$ is zero.

OK, one more thing before we actually do something useful -- something we haven't done before in other ways.

This is an extended pitfall prevention, because I fell into this pit myself. When thinking about left-invariance of a vector field $D$, I formulated the idea in my head this way: the idea is that under $D$, we should get the same result if we differentiate (derivate?) $f$ at 0 or if we translate it forward by $x$ and derivate it at $x$. i.e. where $\phi^h$ represents the translation $f(x)\mapsto f(x-h)$, we want:

$$D=\phi^{h}D\phi^{-h}$$

(THIS IS WRONG! This is a pitfall prevention, not an actual result!) And I looked at some simple Abelian cases, like the additive real group and the circle group and thought this was clearly true.

But it's wrong. How do we know that? Well, let's consider the conjugate $\phi^{h}D\phi^{-h}$ -- certainly at $h=0$ it's just $D$, so let's differentiate it (against $h$) at 0. We get, where $d\phi_0$ is the derivative of $\phi$ at 0:

$$[d\phi_0, D]$$
Which isn't zero in general (it vanishes only when $D$ commutes with $d\phi_0$). So my argument must be wrong -- I must have assumed abelian-ness somehow.

Here's the problem: the final left-multiplication by $\phi^h$ is fine -- it just brings the derived function back to the origin, but "translating the function forward and then differentiating it" messes things up when the direction you're differentiating in doesn't commute with the direction of translation. Draw some pictures of curved surfaces to convince yourselves of this.

So left-multiplication determines a sort of "parallel transport" on the Lie Group, while right-multiplication is an "alternative" way to compare vectors in different tangent spaces, and its disagreement with left-multiplication determines the non-commutativity of the group. Well, this choice of left-multiplication vs right-multiplication is really a convention, arising from the choice of representation.

OK, the useful thing: Suppose we're interested in "nested Lie brackets" $[X,[Y,Z]]$. We're talking about conjugating $[Y,Z]$ as $\phi^p[Y,Z]\phi^{-p}$ where $d\phi_0=X$ so that to first-order in $p$:

$$\phi^p[Y,Z]\phi^{-p}=[Y,Z]+p[X,[Y,Z]]$$
Since conjugation is a homomorphism, we can also write:
$$\begin{align}
\phi^p[Y,Z]\phi^{-p} &= [\phi^pY\phi^{-p},\phi^pZ\phi^{-p}] \\
&= [Y+p[X,Y],Z+p[X,Z]] \\
&= [Y,Z] + p([Y,[X,Z]]+[[X,Y],Z])\\
\Rightarrow [X,[Y,Z]]&=[Y,[X,Z]]+[[X,Y],Z]
\end{align}$$
Now, couldn't we have just proven this by expanding everything out as commutators? Sure, but this provides more insight as to what's going on -- you might notice the resemblance to the product rule. Indeed, this identity -- the Jacobi identity -- is perhaps best stated as:

"A derivation $X$ acts through the Lie Bracket as a derivation on the space of derivations where "multiplication" is given by the Lie Bracket."

In this sense, it's actually quite expected -- it results from the fact that the Lie Bracket is a bilinear operator obtained from differentiating a group symmetry, conjugation -- this mandates that it is a derivation.
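To see the first-order expansion used in this argument concretely, here is a sketch assuming a matrix group, with $\phi^p=e^{pX}$ and $[A,B]=AB-BA$. The truncated-series `expm` helper and the sample matrices are illustrative assumptions (a library matrix exponential would do equally well).

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=25):
    """Matrix exponential via a truncated power series (adequate for small-norm inputs)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))   # term is now A^k / k!
        result = add(result, term)
    return result

def bracket(A, B):
    return add(matmul(A, B), scale(-1.0, matmul(B, A)))

def max_abs(A):
    return max(abs(a) for row in A for a in row)

X = [[0.0, 1.0], [-1.0, 0.0]]
Y = [[1.0, 2.0], [0.0, -1.0]]
Z = [[0.0, 0.0], [3.0, 1.0]]
p = 1e-4

YZ = bracket(Y, Z)
conjugated = matmul(matmul(expm(scale(p, X)), YZ), expm(scale(-p, X)))
first_order = add(YZ, scale(p, bracket(X, YZ)))

error = max_abs(add(conjugated, scale(-1.0, first_order)))
zeroth_order_error = max_abs(add(conjugated, scale(-1.0, YZ)))
print(error, zeroth_order_error)   # error should be O(p^2), the other O(p)
```

The gap between the two printed numbers is the point: dropping the $p[X,[Y,Z]]$ term costs an error of order $p$, while keeping it leaves only an error of order $p^2$.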

As it turns out, the Jacobi identity, along with the antisymmetry and the bilinearity, determines the Lie Algebra -- it is enough to "abstract out" the properties of a Lie Algebra. Why? This is something we will see over several articles, which will then allow us to motivate abstract Lie algebra.

Lie Bracket, closure under the Lie Bracket

(If you're just here for the easy way to see closure, skip ahead to Closure under the Lie Bracket)

In the previous article, I introduced Lie Groups and Lie Algebras by talking about Lie Algebras as a parameterisation for the Lie Group -- we said that the elements of the Lie Group could be written as exponentials of these parameters (not uniquely, sure, but they can be written in this way). Some things to note here:

What we've called "Lie Groups" refers only to connected Lie Groups, as motivation. In general, the theory of Lie groups considers any group that is also a manifold -- for instance, the non-zero real numbers are also a Lie Group (even though their Lie Algebra is identical to that of the positive real numbers -- can you see why?). We will henceforth use this more general definition.
It's not really true that any Lie group can be parameterised in this fashion by writing each element as an exponential of a Lie Algebra element -- even for connected groups. This shouldn't be surprising -- given a term of the form $\exp X$ and a term $\exp Y$, their product $\exp X\exp Y$ is in the group by closure, but it isn't necessarily equivalent to $\exp(X+Y)$ on a non-Abelian group (could it be the exponential of something else? We'll find out later).
A parameterisation of this form is not the same as a co-ordinate system.

The last point is what we will concentrate on in this article, because not being described fully by the Lie algebra is what makes things interesting, right?

What is a co-ordinate system on a manifold? Well, the key point is that any element of the manifold can be decomposed in terms of its components along the co-ordinates. On a Lie Group, this means that there should exist a "basis" for the Lie Group $\exp(X_1),\ldots\exp(X_n)$ corresponding to the basis $X_1,\ldots X_n$ for the Lie Algebra vector space such that every element of the Lie Group can be written as products of powers of these elements, and any rearrangement of the terms in the product should leave it invariant (i.e. the elements should commute with each other).

Note that it is possible to decompose elements of a connected Lie Group as a product of some exponentials, but this is different from there being specifically $n$ elements that one can write any Lie group element as products of.

But clearly, this can only be possible if the group is Abelian, commutative. This is a special case of the more general fact that only a holonomic basis gives rise to a co-ordinate system on a manifold. The idea is -- a closed loop should produce no overall group action. If you flow $\varepsilon$ in the $X$ direction, then flow $\varepsilon$ in the $Y$ direction, then flow $\varepsilon$ back in the $X$ direction and flow $\varepsilon$ back in the $Y$ direction, you should end up back where you started. If you don't, then the resulting difference is the infinitesimal "group commutator" of the Lie Group:

$$e^{\varepsilon X}e^{\varepsilon Y}e^{-\varepsilon X}e^{-\varepsilon Y}$$
One can check via a Taylor expansion that this is equal, to second order, to:

$$1+\varepsilon^2(XY-YX)$$
The first thing to note about this is that the $\varepsilon^1$ term is zero -- this may seem like a surprising coincidence, but perhaps it isn't that surprising (I mean, there's nothing else it could be, right? If the commutator was to first-order $1+\varepsilon z$, $\exp z$ would be equal to 1, and so it would give no characterisation at all of the amount of non-commutativity of the flows $X$ and $Y$) -- it's analogous to vector calculus, where the circulation of a vector field around a small loop of side $\varepsilon$ is proportional to $\varepsilon^2$ (i.e. a line integral along the curve is proportional to the area it encloses, so you divide it by this area in the definition of curl, etc.).
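The Taylor-expansion claim can be checked numerically. The following is a sketch assuming $2\times2$ real matrices, with a hand-rolled truncated-series `expm` helper and arbitrary sample matrices (all of these are illustrative choices): it compares the group commutator with $1+\varepsilon^2(XY-YX)$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=25):
    """Matrix exponential via a truncated power series (adequate for small-norm inputs)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))   # term is now A^k / k!
        result = add(result, term)
    return result

def max_abs(A):
    return max(abs(a) for row in A for a in row)

X = [[0.0, 1.0], [-1.0, 0.0]]
Y = [[1.0, 2.0], [0.0, -1.0]]
eps = 1e-3

# Flow along X, then Y, then back along X, then back along Y.
commutator = matmul(matmul(expm(scale(eps, X)), expm(scale(eps, Y))),
                    matmul(expm(scale(-eps, X)), expm(scale(-eps, Y))))
lie_bracket = add(matmul(X, Y), scale(-1.0, matmul(Y, X)))
second_order = add(identity(2), scale(eps**2, lie_bracket))

error = max_abs(add(commutator, scale(-1.0, second_order)))
distance_from_identity = max_abs(add(commutator, scale(-1.0, identity(2))))
print(error, distance_from_identity)   # roughly O(eps^3) versus O(eps^2)
```

The commutator differs from the identity only at order $\varepsilon^2$, and after subtracting $\varepsilon^2(XY-YX)$ what remains is of order $\varepsilon^3$.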

The second-order term, $XY-YX$, is more interesting. This may seem weird because so far, we've been considering the Lie algebra purely as a vector space, with addition and scalar multiplication being the only things going on. But clearly, this cannot be the entire picture, or a connected Lie group would be characterised entirely by the dimension of its Lie algebra. This operation -- the Lie Bracket or Lie Algebra commutator represented by $[X,Y]$ -- as we will see, gives some additional structure to the Lie Algebra, and in fact characterises it (we'll see what this means).

So far, we've obtained no motivation for why this operation $XY-YX$ is actually of any significance. Sure, it appeared in our second-order approximation for the group commutator, but is the group commutator we defined really so great? Surely there could be other ways one could measure the non-commutativity of a group. And the $\varepsilon^2$ business is weird. Things that arise proportional to $\varepsilon$ live in the tangent space, in the Lie Algebra. Where does $[X,Y]$ even live?

Two facts will convince us that the Lie Bracket is indeed the "right" measure of non-commutativity of a Lie Algebra:

The Lie Algebra is closed under the Lie Bracket -- we will see that in fact, $[X,Y]$ lives in the lie algebra, so it is in fact a binary operation on the Lie Algebra, and really does add structure to the Lie Algebra.
It characterises the entire Lie Algebra -- not only is it part of the structure of the Lie Algebra, it characterises the entire structure of the Lie Algebra. What this means is that defining the Lie Bracket on the vector space allows a full characterisation of the part of the group connected to the identity (the "connected part" of the group), so we can say that any Lie Algebras with the same dimension and Lie Bracket are isomorphic.

Closure under the Lie Bracket

If you're like me, you might've thought of several analogous situations to our $1+\varepsilon^2(XY-YX)$ expression -- e.g. in (complex) analysis, at a point where the derivative of a function is zero, the function is characterised by its second derivative (consult Needham's Complex Analysis, p. 205-207 for an explanation). Another example is -- if the first derivative of a function is zero, the second derivative satisfies the product rule (this is actually directly related, in a way we won't go into now).

Here's an idea you might think of: as we discussed earlier, the infinitesimal group commutator is $e^{\varepsilon X}e^{\varepsilon Y}e^{-\varepsilon X}e^{-\varepsilon Y}= 1+\varepsilon^2 (XY - YX) + O(\varepsilon^3)\in G$. But for a moment let $\varepsilon$ not be infinitesimal. So $\varepsilon (XY - YX) + O(\varepsilon^2)\in \mathfrak{g}$, the Lie Algebra corresponding to Lie Group $G$, so by scaling $XY-YX+O(\varepsilon)\in\mathfrak{g}$ and by closedness of the vector space $XY-YX\in\mathfrak{g}$.

But this argument is incorrect -- this becomes obvious if you try to formally write it down -- In general, $1+\varepsilon T\in G$ does not imply $T\in\mathfrak{g}$ for non-infinitesimal $\varepsilon$. It's close to an element in $\mathfrak{g}$ (for small $\varepsilon$), but how close? You might get the feeling that it is "sufficiently close", in that the limit $\varepsilon\to0$ of the sequence $\left(c_\varepsilon(X,Y)-1\right)/\varepsilon^2$ (where $c_\varepsilon(X,Y)$ is the group commutator) indeed ends up in the Lie Algebra.

To make this feeling formal, consider instead the curve parameterised differently as $\gamma(\varepsilon)=e^{\sqrt\varepsilon X}e^{\sqrt\varepsilon Y}e^{-\sqrt\varepsilon X}e^{-\sqrt\varepsilon Y}$. Then $\gamma'(0)=XY-YX$, and we're done.

Think about the Taylor expansion of this new curve for a while.
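One way to carry out that expansion, as a sketch: substitute $\sqrt\varepsilon$ for $\varepsilon$ in the expansion of the group commutator given above, $1+\varepsilon^2(XY-YX)+O(\varepsilon^3)$, so that for $\varepsilon\ge0$:

$$\gamma(\varepsilon)=1+\varepsilon\,(XY-YX)+O(\varepsilon^{3/2})$$

Hence $\gamma(0)=1$ and $(\gamma(\varepsilon)-1)/\varepsilon\to XY-YX$ as $\varepsilon\to0^+$. Note that the derivative here is one-sided, since $\sqrt\varepsilon$ is only defined for $\varepsilon\ge0$ in this parameterisation.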