The Winding Number: category theory

Incredible Duals of category theory

You might have noticed in the past two articles [e.g.] [e.g.] that the constructions we define so often come in pairs. And there is a very specific relationship that defines these pairs, too -- they arise from reversing the arrows in the construction.

I'm aware that this appears bizarre, but there's nothing else it could arise from, right? Duality shows up everywhere in mathematics, but the only "natural" notion of duality we get with arrows is to reverse them. So somehow, if category theory is to formalise all mathematical intuition in some way, then all the good dualities you see must somehow be nicely expressible in terms of this category theoretic duality.

To drive this point home, let's just list out a bunch of "good things appearing in twos" we've seen in mathematics (other than what we've already seen) and see how they can be expressed categorically.

Dual order (i.e. given a relation $\le$, the dual order is defined by $\ge$) -- this one is just trivial. Every poset is a preordered set and therefore can itself be seen as a category, so the dual order is the dual of this category.
Sup, Inf (i.e. the least of the upper bounds, the greatest of the lower bounds in a lattice) -- this follows directly from the above.
LCM, GCD -- special case of the above, as the integers can be organised into a lattice based on divisibility.
de Morgan duality (i.e. the duality induced by the complement/negation operation in Boolean algebra, e.g. between unions and intersections) -- in the case of logic, propositions can be ordered by implication; in the case of set theory, sets can be ordered by inclusion.
Addition and multiplication -- Follows from the duality of sums and products of objects by considering the category of finite sets.
A subspace and its orthogonal complement (more generally, a subobject and the quotient by it, i.e. $A$ and $C$ in an exact sequence $O\to A\to B\to C\to O$).
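The LCM/GCD item in the list above can be checked by brute force. Here is a minimal sketch in Python, assuming we restrict to the divisors of 60 so that everything is finite; the helper names `divides`, `meet` and `dual` are made up for this illustration. The point is that the *same* function computes the GCD in the divisibility order and the LCM in the dual order.

```python
from math import gcd

def divides(a, b):
    """Order relation on positive integers: a <= b means a divides b."""
    return b % a == 0

def meet(x, y, elements, leq):
    """Greatest lower bound of x and y with respect to leq, by brute force."""
    lower = [z for z in elements if leq(z, x) and leq(z, y)]
    # the greatest lower bound is the lower bound above every other lower bound
    return next(z for z in lower if all(leq(w, z) for w in lower))

def dual(leq):
    """Reverse every arrow of the order: a <=' b iff b <= a."""
    return lambda a, b: leq(b, a)

divisors_of_60 = [d for d in range(1, 61) if 60 % d == 0]

g = meet(12, 10, divisors_of_60, divides)         # meet in the divisibility order
l = meet(12, 10, divisors_of_60, dual(divides))   # meet in the dual order, i.e. the join
```

Nothing about `meet` knows which way the arrows point; passing `dual(divides)` is the whole "reversal".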

To make this more precise:

We define the opposite category as the category with all arrows reversed. Where an invertible functor between these categories maps a diagram to another, those diagrams are called dual notions.

Abstracting our abstractions: "limits of cones", universal properties

The last article, Abstracting some categorical definitions, saw the same kind of construction repeated over and over: given some diagram, we'd ask for an object with morphisms to or from that diagram (demanding that the diagram commute) -- such an object would be a "candidate" for our construction, and we'd then ask for the "maximum" or "minimum" among such constructions.

That this notion appears so frequently makes sense. It's really a generalisation of the notions of initial and final topologies, and comes from the notion that an object is defined by its morphisms to or from other objects, and that we're interested in constructions that are unique up to isomorphism.

So consider some diagram in $\mathcal{C}$. As we will later see, this is formally a "functor" (morphism between categories) from an indexing category $\mathcal{I}$ to $\mathcal{C}$ -- denote it as $X:\mathcal{I}\to\mathcal{C}$.

We define a cone to the diagram $X$ as an object $M$ of $\mathcal{C}$ together with morphisms $m_i:M\to X_i$ such that it commutes with the existing diagram (formally, such that for every morphism $f:i\to j$ in $\mathcal{I}$, we have $X(f)\circ m_i=m_j$).

Now this necessarily represents an object with "more information than each $X_i$" -- so we're interested in the "infimum" of these cones, the one with the least information, the one to which there exists a morphism from any other cone. The limsup of cones, if you will:

We define the limit $(L,\ell_i)$ of the diagram to be a cone such that for any cone $(M,m_i)$ to the diagram, $\exists!\ u:M\to L$ such that the diagram commutes, i.e. $m_i=\ell_i\circ u$ for all $i$.

The above diagram commutes, and the purple morphism is unique.

And the dual notion is also observed, a liminf:

We define a co-cone from the diagram $X$ as an object $\overline{M}$ of $\mathcal{C}$ together with morphisms $\overline{m}_i:X_i\to \overline{M}$ such that it commutes with the existing diagram (formally, such that for every morphism $f:i\to j$ in $\mathcal{I}$, we have $\overline{m}_j\circ X(f)=\overline{m}_i$).

We define the co-limit $(\overline{L},\overline{\ell}_i)$ of the diagram to be a co-cone such that for any co-cone $(\overline{M},\overline{m}_i)$ from the diagram, $\exists!\ u:\overline{L}\to \overline{M}$ such that the diagram commutes, i.e. $\overline{m}_i=u\circ\overline{\ell}_i$ for all $i$.

The below diagram commutes, and the purple morphism is unique.

Alternatively, the limit and colimit may be characterised as the final object in the category of cones and the initial object in the category of co-cones respectively (check that this makes sense).

Examples:
Diagrams captioned by their limits.

	(empty diagram) -- Limit: final object. Co-limit: initial object.
	(discrete diagram) -- Limit: product. Co-limit: co-product. This answers the difficult cases of the empty product (it's just the final object) and the power (use the constant functor).
	(parallel diagram) -- Limit: equaliser. Co-limit: co-equaliser.

Exercises: Do some examples to convince yourself of the following ideas:

Even if there are a bunch of morphisms in the diagram, the limit of the diagram talks fundamentally about the product of the "starting" objects of the diagram (think of: $X\rightarrow Y\leftarrow Z$, etc.).
If your original diagram has non-commuting features, the limit of the diagram talks about equalisers of these features (think of: parallel diagram, reverse-parallel diagram $\leftrightharpoons$, other cycles, a diagram with non-trivial automorphisms).
Adding commuting stuff doesn't change the limit (i.e. the limit of $X\to Y\to Z$ is the same if you add another morphism $X\to Z$).
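To make the first two exercises concrete, here is a small sketch in Python for the diagram $X\rightarrow Y\leftarrow Z$ in finite sets. It assumes functions are stored as dicts, and the names `cospan_limit` and `mediating_map` are hypothetical helpers, not standard library functions. The limit is carved out of the product $X\times Z$ of the "starting" objects by an equaliser-like condition; the leg to $Y$ is omitted because it is forced by commutativity.

```python
from itertools import product

def cospan_limit(X, Z, f, g):
    """Limit of X --f--> Y <--g-- Z in finite sets: the apex L and its projections."""
    L = [(x, z) for x, z in product(X, Z) if f[x] == g[z]]
    p_X = {pair: pair[0] for pair in L}
    p_Z = {pair: pair[1] for pair in L}
    return L, p_X, p_Z

def mediating_map(M, m_X, m_Z):
    """The only possible u: M -> L with p_X . u = m_X and p_Z . u = m_Z."""
    return {m: (m_X[m], m_Z[m]) for m in M}

X = [0, 1, 2, 3]
Z = ["a", "b", "c"]
f = {0: 0, 1: 1, 2: 0, 3: 1}     # X -> Y
g = {"a": 0, "b": 1, "c": 1}      # Z -> Y

L, p_X, p_Z = cospan_limit(X, Z, f, g)

# another cone over the same diagram: legs satisfy f . m_X = g . m_Z
M = ["s", "t"]
m_X = {"s": 2, "t": 3}
m_Z = {"s": "a", "t": "c"}
u = mediating_map(M, m_X, m_Z)
```

The map `u` is forced componentwise by the two projections, which is exactly the uniqueness clause in the definition of a limit.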

Universal objects and comma categories

You may have noticed that images and coimages cannot be written as limits and colimits (do you see why?). We made a fairly specific specialisation when defining limits/colimits that doesn't really have to do with our "limsup/liminf" intuition -- we insisted we had morphisms either from or to the diagram, whereas we could in general have a more complicated property.

In general, instead of dealing with the category of cones, we could deal with some other category (called the comma category) and discuss its initial and final objects instead.

The key insight regarding this generalisation is as follows: one can see the limit as a construction in the category $\mathcal{C}^{\mathcal{I}}$ of diagrams in $\mathcal{C}$ of a certain shape $\mathcal{I}$. The limit object (which is an object in $\mathcal{C}$) can be "upgraded" to that category as the constant diagram (an object of $\mathcal{C}^{\mathcal{I}}$ that maps every node in the diagram shape to the same object in $\mathcal{C}$), and this constant diagram comes with a morphism to the object of $\mathcal{C}^\mathcal{I}$ we're actually taking the limit of. This "upgrading" is called the diagonal functor $\Delta: \mathcal{C}\to\mathcal{C}^\mathcal{I}$, defined by $\Delta:=\lambda M.\ (\lambda i.\ M)$.

So more generally, we can consider some category other than $\mathcal{C}^\mathcal{I}$, and a more general functor than $\Delta$, in order to formalise a more general notion of being a limiting object. We make the following definition:

We define the final morphism from a functor $F:\mathcal{C}\to\mathcal{D}$ to an object $D\in\mathcal{D}$ as a morphism $\ell:F(L)\to D$ such that for any morphism $m:F(M)\to D$, there $\exists!\ u: M\to L$ such that the diagram commutes, i.e. $m=\ell\circ F(u)$.

You may observe that $F$ generalises $\Delta$, $\mathcal{D}$ generalises the "category of diagrams", and the final morphism generalises the limit (with $L$ being the limit "object" in $\mathcal{C}$). Analogously we define, generalising the colimit:

We define the initial morphism to a functor $F:\mathcal{C}\to\mathcal{D}$ from an object $D\in\mathcal{D}$ as a morphism $\overline{\ell}:D\to F(\overline{L})$ such that for any morphism $\overline{m}:D\to F(\overline{M})$, there $\exists!\ u: \overline{L}\to \overline{M}$ such that the diagram commutes, i.e. $\overline{m}=F(u)\circ\overline{\ell}$.

These terms "final morphism" and "initial morphism" are not to be confused with the morphisms to and from an initial object or a final object, that we defined previously. Typically, these terms are used in neither context -- one simply says "universal morphism" to/from $D$ from/to $F$; and in the previous context, one simply says morphisms to a final object/from an initial object.

In general, these morphisms are referred to as universal morphisms or universal objects.

(By the way: the term "universal property" is just used to refer to the property of being initial or terminal or whatever.)

This notion can easily be restated as follows: given an object $D\in\mathcal{D}$ and a functor $F:\mathcal{C}\to\mathcal{D}$, one can construct the following:

The comma category $[F\to D]$ is a category whose objects are the morphisms $m:F(M)\to D$, and whose morphisms from $m_1\to m_2$ are given by morphisms $u:M_1\to M_2$ such that the diagram commutes, i.e. such that $m_1=m_2 \circ F(u)$.

The cocomma category $[D\to F]$ is a category whose objects are the morphisms $m:D\to F(M)$, and whose morphisms from $m_1\to m_2$ are given by morphisms $u:M_1\to M_2$ such that the diagram commutes, i.e. such that $m_2=F(u)\circ m_1$.

Then a final morphism is the final object in the comma category, and an initial morphism is the initial object in the cocomma category. If $\mathcal{D}=\mathcal{C}^{\mathcal{I}}$ (i.e. is a diagram category) and $F$ is the diagonal functor, then the comma category is the category of cones, and the cocomma category is the category of cocones.

One might dislike the asymmetry between $F$ and $D$ and decide to go a step further, generalising $D$ to another functor. So given two functors $F:\mathcal{A}\to\mathcal{D}$ and $G:\mathcal{B}\to\mathcal{D}$, we can construct:

The comma category $[F\to G]$ is a category whose objects are the morphisms $m:F(M)\to G(N)$ and whose morphisms from $m_1\to m_2$ are given by pairs of morphisms $u:M_1\to M_2,\ v: N_1\to N_2$ such that the following diagram commutes (i.e. such that $G(v)\circ m_1=m_2\circ F(u)$):

The previous definitions of the comma and cocomma categories then arise when $\mathcal{B}$ and $\mathcal{A}$ respectively are replaced by a singleton (and $D$ is the only object in its image in $\mathcal{D}$).

Examples: free group, image
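The free-group example is a bit fiddly to code, so here is a sketch of its simpler cousin, the free monoid, in Python. The assumptions: words are tuples, the role of $F$ in the definition of an initial morphism is played by the forgetful functor from monoids to sets, and $D$ is a set $S$ of generators. The names `unit` and `extend` are made up for this illustration. The initial morphism is `unit`, and `extend` produces the unique $u$ promised by the universal property.

```python
from functools import reduce

def unit(s):
    """The initial morphism S -> U(Free(S)): a generator becomes a one-letter word."""
    return (s,)

def extend(m, op, identity):
    """Given a plain function m: S -> M into a monoid (M, op, identity),
    return the monoid homomorphism u: Free(S) -> M with u(unit(s)) == m(s)."""
    def u(word):
        return reduce(op, (m(s) for s in word), identity)
    return u

# Example monoid: the integers under addition; m assigns a weight to each letter.
weights = {"a": 1, "b": 10}
u = extend(weights.get, lambda x, y: x + y, 0)
```

Once the values on one-letter words are fixed, a homomorphism has no freedom left on longer words, which is why `u` is unique.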

Abstracting some categorical definitions

Before making any interesting definitions, we need to get something over with: the notions of injective and surjective homomorphisms do not really generalise very nicely to category theory -- the closest definitions are:

A morphism $f:X\to Y$ is a monomorphism if for distinct morphisms $g$ to $X$, $f\circ g$ are distinct.

A morphism $f:X\to Y$ is an epimorphism if for distinct morphisms $g$ from $Y$, $g\circ f$ are distinct.

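One may check that all injective homomorphisms are monomorphisms and that all surjective homomorphisms are epimorphisms -- but it's not too hard to see that the converses fail. Concrete counter-examples exist even in simple categories: the inclusion $\mathbb{Z}\to\mathbb{Q}$ is a non-surjective epimorphism of rings, and non-injective monomorphisms appear in the category of divisible abelian groups. (In the category of all abelian groups, monomorphisms are exactly the injective homomorphisms and epimorphisms exactly the surjective ones.)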
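In the category of finite sets nothing strange happens, and this can be checked by brute force. A rough sketch in Python, assuming functions are dicts and that cancellation is only tested against a couple of small probe domains (so `is_mono` is a finite approximation of the definition, and all names here are hypothetical):

```python
from itertools import product

def all_functions(A, B):
    """Every function A -> B between finite sets, as dicts."""
    return [dict(zip(A, values)) for values in product(B, repeat=len(A))]

def compose(f, g):
    """f . g, i.e. apply g first."""
    return {a: f[g[a]] for a in g}

def is_mono(f, X, test_objects):
    """Left-cancellable against all pairs g, h: W -> X for W in test_objects."""
    for W in test_objects:
        gs = all_functions(W, X)
        for g in gs:
            for h in gs:
                if g != h and compose(f, g) == compose(f, h):
                    return False
    return True

def is_injective(f):
    return len(set(f.values())) == len(f)

X, Y = [0, 1, 2], ["p", "q", "r"]
probes = [[0], [0, 1]]   # small test domains W; an assumption of this sketch
```

For sets, a one-point probe already detects any failure of injectivity, since two points with the same image give two distinct maps from the point that $f$ cannot tell apart.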

If anyone has any motivation or insightful explanation of monomorphisms and epimorphisms, let me know. For example, are the non-injective monomorphisms (e.g. the quotient map $\mathbb{Q}\to\mathbb{Q}/\mathbb{Z}$ in the category of divisible abelian groups) actually interesting or just something we need to get used to?

There are also related notions:

A morphism $f:X\to Y$ is a section if it has a left-inverse ("retraction"), i.e. a morphism $g:Y\to X$ such that $g\circ f=1_X$.

A morphism $f:X\to Y$ is a retraction if it has a right-inverse ("section"), i.e. a morphism $g:Y\to X$ such that $f\circ g=1_Y$.

It's clear that all sections are injective morphisms (and thus monomorphisms), and all retractions are surjective morphisms (and thus epimorphisms). Of course, these intermediaries are not category-theoretic.

One may define an isomorphism, denoted $f:X\leftrightarrow Y$, as a morphism that is both a section and a retraction. Here are some theorems about it:

An isomorphism has a two-sided inverse morphism.
Given $g_1\circ f=1_X$, $f\circ g_2=1_Y$ -- by considering $g_1\circ f\circ g_2$, we see that $g_1=g_2$.

A morphism that is both a monomorphism and a retraction is an isomorphism.
Since $f$ is a retraction, $f \circ g = {1_Y}$ -- so $f \circ g \circ f = f = f\circ 1_X$. Left-cancelling (since $f$ is a monomorphism), $g\circ f = 1_X$, thus $f$ is a section.

A morphism that is both a section and an epimorphism is an isomorphism.
Analogous to above.
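
For completeness, the analogous argument written out as a sketch (here $g$ denotes a left-inverse of $f$, which exists because $f$ is a section):

$$\begin{aligned} g\circ f &= 1_X \\ (f\circ g)\circ f &= f\circ(g\circ f) = f\circ 1_X = f = 1_Y\circ f \\ f\circ g &= 1_Y \quad\text{(right-cancelling } f\text{, since } f\text{ is an epimorphism)} \end{aligned}$$

So $g$ is also a right-inverse of $f$, i.e. $f$ is a retraction as well as a section.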

Subobjects

In our minds, a sub-object is a subset that also carries the structure of that category -- in other words, it's isomorphic to another object in that category.

It's natural to identify the subobjects of $X$ then with injections (or rather monomorphisms) into $X$ (not with the domains of the injections, as they may have multiple embeddings into $X$). But the identification is not one-to-one -- multiple injections may have the same image. In general, two monomorphisms $g_1:S_1\hookrightarrow X$, $g_2:S_2\hookrightarrow X$ having the same image is an equivalence relation, expressible as:

$$\exists i:S_1\leftrightarrow S_2,\ g_1=g_2\circ i$$

So we identify the equivalence classes of monomorphisms into an object with its subobjects.

An alternate motivation for our definition of a subobject comes from the subspace topology (and vice versa), which is defined in terms of continuous inclusion maps. Of course, in our definition here, we allow the subspace topology to be any topology that allows an injective continuous map to the space, but the standard definition in topology asks for the coarsest such topology (i.e. the "least continuous" such map). Later, we will study some refined definitions of a lot of ideas here that may apply better to specific categories.

Quotient objects

The natural way to think of quotient objects in category theory is in terms of the first isomorphism theorem, which states that the quotient objects are the images of surjections from the object -- kinda "dual" to how subobjects are the images of injections into the object (keep this notion of "duality" in mind).

You might be afraid that this kind of defeats the point, since we'd like to eventually prove the First Isomorphism Theorem in category theory. Well, we'll do so with some other more category-specific definition of quotients, etc. so the First Isomorphism Theorem would simply be a demonstration that these two definitions are equivalent.

But once again, just identifying the quotient objects with epimorphisms overcounts them -- a single quotient can map to multiple different isomorphic things. So, as before, we write down an equivalence relation between epimorphisms from $X$: two epimorphisms $g_1:X\twoheadrightarrow Q_1$, $g_2:X\twoheadrightarrow Q_2$ are equivalent if:
$$\exists i: Q_1\leftrightarrow Q_2,\ g_2=i\circ g_1$$
So we identify these equivalence classes of epimorphisms as the quotient objects of an object $X$.

Products

One can take inspiration from the product topology, and think of the product of some objects as the object with the "least information" that still allows morphisms to each object. So we define:

Given a collection $X_i$ of objects, we define their product $\prod X_i$ as an object $X$ together with a collection of morphisms $\pi_i:X\to X_i$ such that:

For any other collection of morphisms $\pi'_i:X'\to X_i$, $\exists! u: X' \to X$ such that $\pi'_i=\pi_i\circ u$.
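
In finite sets the Cartesian product satisfies this definition, and the uniqueness of $u$ can be confirmed by exhaustive search. A minimal sketch in Python, assuming two factors and functions stored as dicts (the names `all_functions` and `mediating_maps` are made up for this illustration):

```python
from itertools import product

def all_functions(A, B):
    """Every function A -> B between finite sets, as dicts."""
    return [dict(zip(A, values)) for values in product(B, repeat=len(A))]

X1, X2 = [0, 1], ["a", "b", "c"]
X = list(product(X1, X2))            # candidate product object
pi1 = {p: p[0] for p in X}
pi2 = {p: p[1] for p in X}

def mediating_maps(Xp, pi1p, pi2p):
    """All u: X' -> X with pi1 . u == pi1' and pi2 . u == pi2'."""
    return [u for u in all_functions(Xp, X)
            if all(pi1[u[x]] == pi1p[x] and pi2[u[x]] == pi2p[x] for x in Xp)]

# a competing collection of morphisms out of X' = {s, t}
Xp = ["s", "t"]
pi1p = {"s": 0, "t": 1}
pi2p = {"s": "c", "t": "c"}
us = mediating_maps(Xp, pi1p, pi2p)
```

Out of the 36 functions $X'\to X$, exactly one is compatible with both projections: the one that pairs up the two given maps.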

Question: what does the empty product look like? This definition seems a bit bad for this purpose. We'll develop some more general machinery in the next article or so.

Sums (aka "coproducts")

Shockingly enough, the "opposite" or "dual" of the above. Direct sums of vector spaces can be seen as the "smallest possible" (i.e. embedding into most possible things, i.e. having most possible information) vector space permitting morphisms from each vector space. Another example would be the disjoint union of sets. Perhaps this is even clearer with the free product of groups, where the free product is the object with the "most information possible" arising from your groups.

Given a collection $X_i$ of objects, we define their sum $\coprod X_i$ as an object $X$ together with a collection of morphisms $\varpi_i:X_i\to X$ such that:

For any other collection of morphisms $\varpi'_i:X_i\to X'$, $\exists! u: X \to X'$ such that $\varpi'_i=u\circ \varpi_i$.
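
For the disjoint union of sets mentioned above, the map $u$ is "define it separately on each piece". A small Python sketch, assuming the disjoint union is modelled by tagged pairs; `coproduct` and `copair` are hypothetical helper names:

```python
def coproduct(X1, X2):
    """Disjoint union as tagged pairs, together with its two injections."""
    X = [(0, x) for x in X1] + [(1, x) for x in X2]
    i1 = {x: (0, x) for x in X1}
    i2 = {x: (1, x) for x in X2}
    return X, i1, i2

def copair(j1, j2):
    """The only possible u: X -> X' with u . i1 == j1 and u . i2 == j2."""
    def u(tagged):
        tag, x = tagged
        return j1[x] if tag == 0 else j2[x]
    return u

X1, X2 = [1, 2], [2, 3]          # overlapping on purpose: the tags keep the copies apart
X, i1, i2 = coproduct(X1, X2)
j1 = {1: "odd", 2: "even"}       # X1 -> X'
j2 = {2: "two", 3: "three"}      # X2 -> X'
u = copair(j1, j2)
```

Note that the element 2 appears twice in the sum, once per tag, and `u` is free to send the two copies to different places. A plain union would not allow that.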

Because of its "dual" appearance to the product (which we will soon see described more generally), the sum is often known as the "coproduct".

We'll now start to list subobjects and quotient objects related to a morphism. Here's a convenient cheat-sheet: the following diagram is commutative. (dotted lines are zero morphisms, which we will define shortly)

Images

Next, let's think about the image of a morphism $f$. Once again, we can identify an image with a monomorphism, so we're really looking for a subobject consisting of monomorphisms $g$ with the same image as $f$. So we want an $e:I\to Y$ such that there exists a morphism $f_I: X\to I$ with $f=e\circ f_I$ -- but this is not enough, $I$ may be "too big" (it may contain elements that map to elements of $Y$ not in the image), so we define:

A monomorphism $e:I\hookrightarrow Y$ is the image of a morphism $f:X\to Y$ if:

$\exists f_I:X\to I, f=e\circ f_I$

For any monomorphism $e':I'\hookrightarrow Y$ and $f_{I'}:X\to I'$ such that $f=e'\circ f_{I'}$, there $\exists! u:I\to I', e=e'\circ u$.

This relies on the following key lemma of course: the images of $f$ form a subobject. This follows straightforwardly from the second condition.
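
In finite sets this is the familiar factorisation of a function through its set of values. A sketch in Python, assuming functions are dicts and taking the codomain to be `["a", "b", "c"]` for the example (`image_factorisation` is a made-up helper name):

```python
def image_factorisation(f):
    """Split f: X -> Y (a dict) as f = e . f_I with e injective."""
    I = sorted(set(f.values()))
    f_I = dict(f)                 # X -> I: same values, smaller codomain
    e = {y: y for y in I}         # the inclusion I -> Y
    return I, f_I, e

f = {0: "a", 1: "b", 2: "a", 3: "b"}     # codomain Y = ["a", "b", "c"]
I, f_I, e = image_factorisation(f)

# a competing factorisation of f through the bigger subobject I' = Y
e_prime = {"a": "a", "b": "b", "c": "c"}
u = {i: next(j for j in e_prime if e_prime[j] == e[i]) for i in I}   # I -> I'
```

The map `u` goes from the image into the competitor, which is the sense in which the image is the smallest subobject that $f$ factors through.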

Zero objects

Many categories have the notion of a zero element in an object -- groups have identities, vector spaces have zero vectors, and let's not talk about rings and fields. And some, like topological spaces, don't.

But we can't talk about elements of an object in category theory. But perhaps the examples above give you an idea -- we have seen trivial groups and trivial vector spaces. Objects comprising only of the zero element -- let's call them zero objects. So maybe we can talk about the images of morphisms from these trivial objects to other objects, and they would represent zero elements.

The idea behind a zero element is that it is "privileged" in some sense -- a morphism must preserve it. Furthermore, every object contains a zero element, so there must always be a morphism from the zero object to any object. These criteria fix a unique morphism from the zero object to any given object. In fact, this idea is captured by the following definition.

A universal object or initial object $I$ is an object such that for any object $X$ in the category, there is a unique morphism $I\to X$.

But there is also another idea behind the notion of a zero object, that it carries the "minimum information" compatible with the category's structure. Recall our interpretation of a morphism as something that either retains or discards information -- mapping an object to a zero object means discarding "as much information as possible".

A final object $F$ is an object such that for any object $X$ in the category, there is a unique morphism $X\to F$.

Finally, a zero object is defined as an object that is both initial and final.

Perhaps a bit surprisingly at first (but trivially easily), we can in fact see that the initial and final objects are respectively unique up to (unique) isomorphism -- just consider the unique morphisms between the two objects. The reason it's kinda incredible that we can do this in full generality is that e.g. we can immediately see that the trivial ring is not an initial object (because the ring of integers is): it can't be mapped to any non-trivial ring, something that seems like a technicality at first glance arising from the fact that 1 must be preserved by ring homomorphisms.

But if you think about it, the category theoretic argument also comes from the same fact -- it comes from the fact that you can't map the ring of integers to its own zero element, because that doesn't preserve 1. But if you didn't mandate that ring homomorphisms preserve the multiplicative identity, then the ring of integers would no longer be an initial object.
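
The uniqueness argument for initial objects mentioned above, spelled out as a sketch: let $I$ and $I'$ both be initial, with $u:I\to I'$ and $v:I'\to I$ the unique morphisms between them. Then

$$\begin{aligned} v\circ u &: I\to I, \qquad 1_I : I\to I \quad\Rightarrow\quad v\circ u = 1_I \\ u\circ v &: I'\to I', \qquad 1_{I'} : I'\to I' \quad\Rightarrow\quad u\circ v = 1_{I'} \end{aligned}$$

since an initial object has exactly one morphism to any object, including itself. So $u$ is an isomorphism, and it is the only one because it is the only morphism $I\to I'$ at all. Reversing the arrows gives the same statement for final objects.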

Zero morphisms, kernels and equalizers

When defining a kernel of $f:X\to Y$, we're looking for a subobject of $X$ that maps to the zero element in $Y$. Since in category theory, a subobject is an injection (from an object we'd typically view as isomorphic to the subobject), we're looking for a monomorphism (subobject) $k$ that composes with $f$ to give us a morphism that maps everything to the zero element, whatever that means.

OK -- so what does it mean? What's a morphism that maps everything to the zero element? Recall that we're thinking of the "zero element" as the image of a morphism from the zero object (which really means we're identifying it with a subobject, i.e. an equivalence class of monomorphisms). So we can associate with our desired morphism $o:X\to Y$ the morphism from $X$ to the zero object, which then embeds into $Y$ -- so we define the zero morphism as their composition:
$$o:=X\to O\to Y$$
Where $O$ is the zero object.
$$\dots$$
There is an alternate, more general definition of a zero morphism for categories without a zero object that nonetheless captures the notion of a zero element.

Here's an alternate way to think about morphisms from and to a zero object. An initial morphism is essentially the "least surjective homomorphism possible". A final morphism is essentially the "least injective homomorphism possible". These are in line with our understanding of the zero object as the "smallest possible" object, or the one that contains the least information.

In line with the way we've thought of injectivity and surjectivity when defining monomorphisms and epimorphisms, we make the following definitions.

A morphism $f:X\to Y$ is an initial morphism (or right-zero morphism, or coconstant morphism) if for any $g,h: Y\to V$, $g\circ f = h \circ f$.

A morphism $f:X\to Y$ is a final morphism (or left-zero morphism, or constant morphism) if for any $g,h: W\to X$, $f\circ g = f\circ h$.

(Exercise: Check that the morphism from an initial object and the morphism to a final object satisfy this property.)

Further, one may observe that for $l:X\to Y$ final and $r:Y\to Z$ initial, $r\circ l$ is both an initial and a final morphism. So the right general notion of a "morphism that maps everything to the zero element" is "a morphism that is both initial and final", or a zero morphism.
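
The observation about $r\circ l$ is just associativity. As a sketch, for any $g,h:W\to X$ and any $g',h':Z\to V$:

$$\begin{aligned} (r\circ l)\circ g &= r\circ(l\circ g) = r\circ(l\circ h) = (r\circ l)\circ h &&\text{(since } l \text{ is final)} \\ g'\circ(r\circ l) &= (g'\circ r)\circ l = (h'\circ r)\circ l = h'\circ(r\circ l) &&\text{(since } r \text{ is initial)} \end{aligned}$$

The first line says $r\circ l$ is a final morphism and the second says it is an initial morphism.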

$$\dots$$
Anyway, we're obviously interested in monomorphisms $k$ to $X$ such that $f\circ k=o$. But these don't all represent the kernel -- they could represent subobjects smaller than the kernel. So we define the kernel as follows:

A monomorphism $k:K\hookrightarrow X$ is the kernel of $f:X\to Y$ if

$f\circ k=o_{KY}$

For any $k':K'\hookrightarrow X$ such that $f\circ k'=o_{K'Y}$, $\exists! u:K'\to K, k\circ u = k'$.

More generally:

Given morphisms $f_i:X\to Y$, their equaliser is a monomorphism $k:K\hookrightarrow X$ such that:

$f_i\circ k$ are equal for all $i$.

For any $k':K'\hookrightarrow X$ such that $f_i\circ k'$ are equal for all $i$, $\exists! u:K'\to K, k\circ u = k'$.

The kernel can then be understood as the equaliser of a morphism with the zero morphism.

Of course, these definitions rely on a key lemma: the kernel/equaliser is a subobject -- i.e. that for two kernels $k_1:K_1\to X$ and $k_2:K_2\to X$, there $\exists i:K_1\leftrightarrow K_2, k_1=k_2\circ i$ -- this follows straightforwardly from the second condition in the definition.
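
Here is what "the kernel is the equaliser with the zero morphism" looks like in a toy case. A sketch in Python, assuming we work with underlying sets only and take $f:\mathbb{Z}/12\to\mathbb{Z}/4$ to be reduction mod 4; the helper name `equaliser` is made up, and the code does not check that the result is a subgroup (here it happens to be one).

```python
def equaliser(X, f, g):
    """Equaliser of f, g: X -> Y in finite sets: the subset where they agree,
    together with its inclusion k: K -> X."""
    K = [x for x in X if f[x] == g[x]]
    k = {x: x for x in K}
    return K, k

# f: Z/12 -> Z/4 is reduction mod 4; o sends everything to the zero element
Z12 = list(range(12))
f = {x: x % 4 for x in Z12}
o = {x: 0 for x in Z12}

K, k = equaliser(Z12, f, o)      # the kernel of f, as the equaliser of f and o

# a smaller candidate k': K' -> X that also composes with f to zero
k_prime = {"p": 0, "q": 8}
u = {x: k_prime[x] for x in k_prime}   # K' -> K, forced because k is an inclusion
```

The smaller candidate `k_prime` factors through `k`, but not the other way round, which is the sense in which the kernel is the largest subobject killed by $f$.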

Cokernels, coimages and quotienting by a subobject

I once said that one of the points of learning linear algebra was as an introduction to ideas that appear repeatedly throughout algebra. Here's where we really see this in action.

Recall the notion of a left null space $\mathrm{ker}(f^T)$ from linear algebra -- it sort-of represented the "space of constraints" on the image of a morphism, in that it was the orthogonal complement of the image $\mathrm{im}(f)$ in the co-domain. It represents the stuff that the image can't fall into -- or tying back into our understanding of morphisms as things that "cannot create information that wasn't already present in the domain", it represents the information the morphism hasn't created (whether or not it could have), a measure of non-surjectivity.

Well, "orthogonal complement of the image" is not something restricted to vector spaces -- we can understand it more generally, as a quotient. Interestingly, it would then no longer be a subobject, which I suppose is reasonable -- we don't really have a notion right now of what a transpose morphism is in a general category.

In general, though, the quotient we're looking for is not $Y/\mathrm{im}(f)$ -- clearly, that wouldn't exist in a lot of important categories, e.g. groups. That's only true in categories that seem "abelian" in some sense. Perhaps this is unsurprising, because we're thinking of the cokernel as "the information that the morphism has not created", and information is more than just elements of the set.

So the appropriate way to generalise "a quotient by the image" is to look at a quotient object of the codomain $Y$ (which, recall, is an epimorphism from $Y$) that composes with $f$ to produce the zero morphism. But dually to the case of the kernel, we must make sure that our quotient is the full, "most universal" one:

Given a morphism $f:X\to Y$, we define its cokernel as an epimorphism $\bar{k}:Y\twoheadrightarrow\bar{K}$ such that:

$\bar{k}\circ f = o_{X\bar{K}}$

For any other $k':Y\to\bar{K}'$ such that $k'\circ f=o_{X\bar{K}'}$, there $\exists! u:\bar{K}\to\bar{K}'$ such that $k'=u\circ \bar{k}$.

Once again, the latter property shows that the equivalence class of these epimorphisms is indeed a quotient object.

In fact, this "composition forms the zero morphism" notion is the generalisation of the "quotient by an object" notion, or of the so-called exact sequence below. Generally, given a subobject $S$ given by some monomorphisms $s:S\to X$, the epimorphisms $q:X\to Q$ that always compose with these monomorphisms to form the zero morphism $q\circ s = o_{SQ}$ define the quotient $X/S$, if they are a quotient object.

$$O \to \ker f \to X \overset f \longrightarrow Y \to \operatorname{coker} f \to O$$
Oh, and an exact sequence is exactly the idea behind quotienting by an object.

$$\dots$$
Now recall the notion of a row space $\mathrm{im}(f^T)$. Once again, the row space is best interpreted as a quotient object of $X$ (in the case of linear algebra, the quotient by $\mathrm{ker}(f)$). But constructing it in terms of the kernel would clearly be quite complicated -- and perhaps not generally so useful. A better, more "general" approach is to think of an element of the row space as an equivalence class of elements that are mapped to the same element. So we define:

Given a morphism $f:X\to Y$, we define its coimage as an epimorphism $\bar{e}:X\twoheadrightarrow \bar{I}$ such that:

$\exists f_{\bar{I}}:\bar{I}\to Y, f = f_{\bar{I}}\circ \bar{e}$

For any other epimorphism $\bar{e}':X\twoheadrightarrow \bar{I}'$ and $f_{\bar I'}:\bar{I}'\to Y$ such that $f = f_{\bar I'}\circ \bar{e}'$, there $\exists! u: \bar{I}'\to\bar{I}, \bar{e}=u\circ \bar{e}'$

Which is again clearly a quotient object.
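
In finite sets, the "equivalence class of elements that are mapped to the same element" description can be coded directly. A sketch in Python, assuming functions are dicts and equivalence classes are represented as frozensets (`coimage` is a hypothetical helper name):

```python
def coimage(f):
    """Quotient of X by the relation f(x) == f(x'), as frozensets of elements.

    Returns the quotient map e_bar: X ->> I_bar and the induced map f_bar: I_bar -> Y.
    """
    classes = {}
    for x, y in f.items():
        classes.setdefault(y, set()).add(x)
    e_bar = {x: frozenset(classes[y]) for x, y in f.items()}
    f_bar = {frozenset(c): y for y, c in classes.items()}
    return e_bar, f_bar

f = {0: "a", 1: "b", 2: "a", 3: "b"}
e_bar, f_bar = coimage(f)
```

For sets the coimage has as many elements as the image, which is a toy version of the First Isomorphism Theorem mentioned earlier.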

Introduction to category theory: a second-abstraction

When we first started talking about abstraction, we did so by observing the analogies between mathematical objects, such as integers and polynomials, the unit circle and remainders, etc. We spent the rest of the Abstract Algebra I series figuring out why these seemingly unrelated objects had similar behaviour -- what the fundamental properties were that resulted in this behaviour, and making these properties the "axioms" of various abstract algebraic structures.

But you may have later observed that even these various algebraic structures have analogies. For starters, every algebraic structure has the notion of homomorphisms -- things that commute with "structure". Then you have analogous object constructions, like trivial objects, product objects and quotient objects. And then you have the really neat stuff -- stuff like "normal subgroups are the kernels of group homomorphisms" vs. "ideals are the kernels of ring homomorphisms", etc. And perhaps most usefully of all, we've seen that some analogies themselves, such as between features of Lie groups and features of Lie algebras, might have a simpler and more abstract basis than the specific constructions of the theory.

You would be justified to believe that these analogies spring from some shared principles -- and you would be justified to believe that these shared principles ought to be abstracted.

Much like how mathematical objects were found to have similar properties, leading us to categorise them as groups and rings and vector spaces and whatever, these categories of objects too could have similar properties.

So this will be our approach: without stating the axioms beforehand and only general notions of what a category is and what a homomorphism (or "morphism" in category theory) is, we will try to prove theorems we know about specific categories like groups, for general categories -- and see what axioms we'll need.

(This, by the way, is called reverse mathematics. We've done this often here whenever dealing with something we must be rigorous about, e.g. in Topology.)

And the real idea we should have at the back of our heads is that we should stop thinking of groups, etc. as "sets with additional structure". They're really generalisations of sets, and homomorphisms are generalisations of functions. (I won't go here into exactly what I mean, but a good article to get your head wrapped around this is Sigma fields are Venn diagrams, for an illustration of how measurable functions, the morphisms in the measurable spaces category, are a "generalisation of functions".) So we won't try to force our objects to be sets and give them elements, or force our morphisms to be functions -- they will just be dots and arrows satisfying some axioms. This will require a bit of thinking, e.g. defining kernels without talking about identity elements.

Let's start.