Abstracting some categorical definitions

Before making any interesting definitions, we need to get something over with: the notions of injective and surjective homomorphisms do not really generalise very nicely to category theory -- the closest definitions are:
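A morphism $f:X\to Y$ is a monomorphism if it is left-cancellable: for any two morphisms $g_1,g_2:W\to X$, $f\circ g_1=f\circ g_2$ implies $g_1=g_2$ (i.e. distinct morphisms to $X$ stay distinct after composing with $f$).
A morphism $f:X\to Y$ is an epimorphism if it is right-cancellable: for any two morphisms $g_1,g_2:Y\to Z$, $g_1\circ f=g_2\circ f$ implies $g_1=g_2$ (i.e. distinct morphisms from $Y$ stay distinct after composing with $f$).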
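One may check that all injective homomorphisms are monomorphisms and that all surjective homomorphisms are epimorphisms -- but the converses fail in general. Concrete counter-examples exist even in simple categories: the inclusion $\mathbb{Z}\to\mathbb{Q}$ is a non-surjective epimorphism in the category of rings, and the quotient map $\mathbb{Q}\to\mathbb{Q}/\mathbb{Z}$ is a non-injective monomorphism in the category of divisible abelian groups. (In the category of all abelian groups, the monomorphisms are exactly the injective homomorphisms and the epimorphisms are exactly the surjective ones.)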
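To make the cancellation properties concrete, here's a minimal brute-force sketch in the category of finite sets, where monomorphisms are exactly the injections and epimorphisms are exactly the surjections. Representing a function as a Python dict, and the helper names `all_functions`, `compose`, `is_mono` and `is_epi`, are my own choices for illustration; the check only runs over the few test objects supplied, so it's a sanity check rather than a proof.

```python
from itertools import product

def all_functions(dom, cod):
    """Every function dom -> cod, each represented as a dict."""
    dom, cod = list(dom), list(cod)
    return [dict(zip(dom, values)) for values in product(cod, repeat=len(dom))]

def compose(f, g):
    """f . g: apply g first, then f."""
    return {x: f[g[x]] for x in g}

def is_mono(f, dom, test_objects):
    """Left-cancellation, checked against all pairs g1, g2 : W -> dom
    for the supplied test objects W (a finite check, not a proof)."""
    for W in test_objects:
        maps = all_functions(W, dom)
        if any(g1 != g2 and compose(f, g1) == compose(f, g2)
               for g1 in maps for g2 in maps):
            return False
    return True

def is_epi(f, cod, test_objects):
    """Right-cancellation, checked against all pairs g1, g2 : cod -> V."""
    for V in test_objects:
        maps = all_functions(cod, V)
        if any(g1 != g2 and compose(g1, f) == compose(g2, f)
               for g1 in maps for g2 in maps):
            return False
    return True

X = [0, 1, 2]
Y = ["a", "b", "c"]
tests = [[], ["*"], ["p", "q"]]
injective = {0: "a", 1: "b", 2: "c"}
collapsing = {0: "a", 1: "a", 2: "c"}   # neither injective nor onto Y

print(is_mono(injective, X, tests))    # True
print(is_mono(collapsing, X, tests))   # False
print(is_epi(injective, Y, tests))     # True
print(is_epi(collapsing, Y, tests))    # False
```

The two-element test object is what catches the failures: a pair of maps into (or out of) it can differ exactly where `collapsing` loses information.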

If anyone has any motivation or insightful explanation of monomorphisms and epimorphisms, let me know. For example, are the non-injective monomorphisms (e.g. the quotient map $\mathbb{Q}\to\mathbb{Q}/\mathbb{Z}$ in the category of divisible abelian groups) actually interesting or just something we need to get used to?

There are also related notions:
A morphism $f:X\to Y$ is a section if it has a left-inverse ("retraction"), i.e. a morphism $g:Y\to X$ such that $g\circ f=1_X$.
A morphism $f:X\to Y$ is a retraction if it has a right-inverse ("section"), i.e. a morphism $g:Y\to X$ such that $f\circ g=1_Y$.
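It's clear that, in categories whose morphisms are structure-preserving functions, all sections are injective (and thus monomorphisms), and all retractions are surjective (and thus epimorphisms). Of course, these intermediaries are not category-theoretic, but the direct argument is just as short: if $g\circ f=1_X$ and $f\circ a=f\circ b$, then $a=g\circ f\circ a=g\circ f\circ b=b$, so a section is a monomorphism, and dually a retraction is an epimorphism.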

One may define an isomorphism, denoted $f:X\leftrightarrow Y$, as a morphism that is both a section and a retraction. Here are some theorems about it:
An isomorphism has a two-sided inverse morphism.
Given $g_1\circ f=1_X$ and $f\circ g_2=1_Y$, consider $g_1\circ f\circ g_2$: it equals both $(g_1\circ f)\circ g_2=g_2$ and $g_1\circ (f\circ g_2)=g_1$, so $g_1=g_2$.
A morphism that is both a monomorphism and a retraction is an isomorphism.
Since $f$ is a retraction, $f \circ g = {1_Y}$ for some $g:Y\to X$ -- so $f \circ g \circ f = f = f\circ 1_X$. Left-cancelling (since $f$ is a monomorphism), $g\circ f = 1_X$, thus $f$ is a section.
A morphism that is both a section and an epimorphism is an isomorphism.
Analogous to the above, right-cancelling $f$ instead.


Subobjects

In our minds, a subobject is a subset that also carries the structure of that category -- in other words, it's isomorphic to another object in that category.

It's natural to identify the subobjects of $X$ then with injections (or rather monomorphisms) into $X$ (not with the domains of the injections, as they may have multiple embeddings into $X$). But the identification is not one-to-one -- multiple injections may have the same image. In general, two monomorphisms $g_1:S_1\hookrightarrow X$, $g_2:S_2\hookrightarrow X$ having the same image is an equivalence relation, expressible as:
$$\exists i:S_1\leftrightarrow S_2,\ g_1=g_2\circ i$$
So we identify the equivalence classes of monomorphisms into an object with its subobjects.
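As a small sanity check of this equivalence relation, here's a sketch in finite sets: given two injections with the same image, it builds the isomorphism $i$ with $g_1=g_2\circ i$. The function name `same_subobject` and the dict representation of functions are hypothetical choices of mine, not anything standard.

```python
def same_subobject(g1, g2):
    """Return the isomorphism i with g1 = g2 . i if the injections g1, g2
    have the same image, and None otherwise."""
    if set(g1.values()) != set(g2.values()):
        return None
    g2_inverse = {y: s for s, y in g2.items()}   # well-defined: g2 is injective
    return {s: g2_inverse[y] for s, y in g1.items()}

g1 = {"a": 1, "b": 3}    # S1 = {a, b} -> X = {1, 2, 3, 4}
g2 = {"u": 3, "v": 1}    # S2 = {u, v} -> X, same image {1, 3}
g3 = {"u": 2, "v": 1}    # S2 -> X again, but with image {1, 2}

i = same_subobject(g1, g2)
print(i)                                    # {'a': 'v', 'b': 'u'}
print(all(g2[i[s]] == g1[s] for s in g1))   # True
print(same_subobject(g1, g3))               # None
```

Note that `g2` and `g3` share a domain but represent different subobjects, which is why we identify subobjects with (classes of) monomorphisms rather than with their domains.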

An alternate motivation for our definition of a subobject comes from the subspace topology (and vice versa), which is defined in terms of continuous inclusion maps. Of course, in our definition here, we allow the subspace topology to be any topology that allows an injective continuous map to the space, but the standard definition in topology asks for the coarsest such topology (i.e. the "least continuous" such map). Later, we will study some refined definitions of a lot of ideas here that may apply better to specific categories.



Quotient objects

The natural way to think of quotient objects in category theory is in terms of the first isomorphism theorem, which states that the quotient objects are the images of surjections from the object -- kinda "dual" to how subobjects are the images of injections into the object (keep this notion of "duality" in mind).

You might be afraid that this kind of defeats the point, since we'd like to eventually prove the First Isomorphism Theorem in category theory. Well, we'll do so with some other more category-specific definition of quotients, etc. so the First Isomorphism Theorem would simply be a demonstration that these two definitions are equivalent.

But once again, just identifying the quotient objects with epimorphisms overcounts them -- a single quotient can map to multiple different isomorphic things. So, as before, we write down an equivalence relation between epimorphisms from $X$: two epimorphisms $g_1:X\twoheadrightarrow Q_1$, $g_2:X\twoheadrightarrow Q_2$ are equivalent if:
$$\exists i: Q_1\leftrightarrow Q_2,\ g_2=i\circ g_1$$
So we identify these equivalence classes of epimorphisms as the quotient objects of an object $X$.



Products

One can take inspiration from the product topology, and think of the product of some objects as the object with the "least information" that still allows morphisms to each object. So we define:
Given a collection $X_i$ of objects, we define their product $\prod X_i$ as an object $X$ together with a collection of morphisms $\pi_i:X\to X_i$ such that:
  1. For any other object $X'$ with a collection of morphisms $\pi'_i:X'\to X_i$, $\exists! u: X' \to X$ such that $\pi'_i=\pi_i\circ u$ for all $i$.
Question: what does the empty product look like? This definition seems a bit bad for this purpose. We'll develop some more general machinery in the next article or so.
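To see the universal property in action, here's a sketch for a product of two finite sets, where the product is the usual Cartesian product. It builds the mediating map $u$ and then checks by brute force that it is the only map commuting with the projections. The names `binary_product` and `mediating_map` are my own, and this covers just one example rather than the general statement.

```python
from itertools import product

def binary_product(A, B):
    """Cartesian product of two finite sets, with its two projections."""
    P = list(product(A, B))
    pi1 = {p: p[0] for p in P}
    pi2 = {p: p[1] for p in P}
    return P, pi1, pi2

def mediating_map(f1, f2):
    """For f1 : X' -> A and f2 : X' -> B, the map u : X' -> A x B."""
    return {x: (f1[x], f2[x]) for x in f1}

A, B = [0, 1], ["x", "y", "z"]
P, pi1, pi2 = binary_product(A, B)

Xp = ["r", "s", "t"]                       # the competing object X'
f1 = {"r": 1, "s": 0, "t": 1}              # pi'_1 : X' -> A
f2 = {"r": "x", "s": "z", "t": "x"}        # pi'_2 : X' -> B
u = mediating_map(f1, f2)

# Brute force over all 6^3 functions X' -> A x B: which commute with both projections?
candidates = [dict(zip(Xp, values)) for values in product(P, repeat=len(Xp))]
commuting = [c for c in candidates
             if all(pi1[c[x]] == f1[x] and pi2[c[x]] == f2[x] for x in Xp)]

print(u)                  # {'r': (1, 'x'), 's': (0, 'z'), 't': (1, 'x')}
print(commuting == [u])   # True
```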



Sums (aka "coproducts")

Shockingly enough, the "opposite" or "dual" of the above. Direct sums of vector spaces can be seen as the "smallest possible" (i.e. embedding into most possible things, i.e. having most possible information) vector space permitting morphisms from each vector space. Another example would be the disjoint union of sets. Perhaps this is even clearer with the free product of groups, where the free product is the object with the "most information possible" arising from your groups.
Given a collection $X_i$ of objects, we define their sum $\coprod X_i$ as an object $X$ together with a collection of morphisms $\varpi_i:X_i\to X$ such that:
  1. For any other object $X'$ with a collection of morphisms $\varpi'_i:X_i\to X'$, $\exists! u: X \to X'$ such that $\varpi'_i=u\circ \varpi_i$ for all $i$.
Because of its "dual" appearance to the product (which we will soon see described more generally), the sum is often known as the "coproduct".
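For the disjoint union of sets mentioned above, the dual construction can be sketched as follows. The tagging of elements with 0 or 1 and the names `binary_sum` and `copair` are assumptions of mine for illustration; any way of keeping the two copies apart would do.

```python
def binary_sum(A, B):
    """Tagged disjoint union of two finite sets, with its two inclusions."""
    S = [(0, a) for a in A] + [(1, b) for b in B]
    in1 = {a: (0, a) for a in A}
    in2 = {b: (1, b) for b in B}
    return S, in1, in2

def copair(f1, f2):
    """For f1 : A -> X' and f2 : B -> X', the map u : A + B -> X'."""
    u = {(0, a): y for a, y in f1.items()}
    u.update({(1, b): y for b, y in f2.items()})
    return u

A, B = [1, 2], [2, 3]      # overlapping on purpose: the tags keep the copies apart
S, in1, in2 = binary_sum(A, B)
f1 = {1: "p", 2: "q"}      # varpi'_1 : A -> X'
f2 = {2: "r", 3: "p"}      # varpi'_2 : B -> X'
u = copair(f1, f2)

print(len(S))                                 # 4
print(all(u[in1[a]] == f1[a] for a in A))     # True
print(all(u[in2[b]] == f2[b] for b in B))     # True
```

The shared element 2 is sent to different places by `f1` and `f2`, which an ordinary union could not accommodate; that's the sense in which the sum has to be a disjoint union.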



We'll now start to list subobjects and quotient objects related to a morphism. Here's a convenient cheat-sheet: the following diagram is commutative. (dotted lines are zero morphisms, which we will define shortly)

[Diagram: the subobjects and quotient objects associated with a morphism; dotted arrows are zero morphisms.]



Images

Next, let's think about the image of a morphism $f:X\to Y$. Once again, we can identify an image with a monomorphism, so we're really looking for a subobject of $Y$, i.e. a class of monomorphisms with the same image as $f$. So we want an $e:I\hookrightarrow Y$ such that there exists a morphism $f_I: X\to I$ with $f=e\circ f_I$ -- but this is not enough, since $I$ may be "too big" (it may contain elements that map to elements of $Y$ outside the image of $f$), so we define:
A monomorphism $e:I\hookrightarrow Y$ is the image of a morphism $f:X\to Y$ if:
  1. $\exists f_I:X\to I, f=e\circ f_I$
  2. For any monomorphism $e':I'\hookrightarrow Y$ and $f_{I'}:X\to I'$ such that $f=e'\circ f_{I'}$, $\exists! u:I\to I'$ such that $e=e'\circ u$.
This relies on the following key lemma of course: the images of $f$ form a subobject. This follows straightforwardly from the second condition: given two images $e_1:I_1\hookrightarrow Y$ and $e_2:I_2\hookrightarrow Y$, it yields $u:I_1\to I_2$ and $v:I_2\to I_1$ with $e_1=e_2\circ u$ and $e_2=e_1\circ v$, so $e_1\circ v\circ u=e_1$ and $e_2\circ u\circ v=e_2$; left-cancelling the monomorphisms shows that $u$ and $v$ are mutually inverse.
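In finite sets the image is just the set of values, and the two conditions can be checked by hand. The sketch below factors a function through its image and then compares against a "too big" competitor $I'$; the names `image_factorisation`, `e2` and `f_I2` are hypothetical, chosen only for this example.

```python
def image_factorisation(f):
    """Factor f : X -> Y as e . f_I, with e the inclusion of the set of values."""
    I = sorted(set(f.values()))
    e = {y: y for y in I}     # monomorphism I -> Y
    f_I = dict(f)             # same values as f, but regarded as a map X -> I
    return I, e, f_I

f = {"a": 1, "b": 1, "c": 3}             # f : X -> Y, with Y = {1, 2, 3, 4}
I, e, f_I = image_factorisation(f)

# A competing factorisation through a "too big" subobject I' = {p, q, r}.
e2 = {"p": 1, "q": 3, "r": 4}            # injective map I' -> Y
f_I2 = {"a": "p", "b": "p", "c": "q"}    # X -> I' with f = e2 . f_I2

e2_inverse = {y: s for s, y in e2.items()}
u = {y: e2_inverse[e[y]] for y in I}     # the map I -> I' with e = e2 . u

print(I)   # [1, 3]
print(u)   # {1: 'p', 3: 'q'}
```

The extra element `r` of $I'$ is never hit by `u`, which is the precise sense in which $I'$ is too big.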



Zero objects

Many categories have the notion of a zero element in an object -- groups have identities, vector spaces have zero vectors, and let's not talk about rings and fields. And some, like topological spaces, don't.

But we can't talk about elements of an object in category theory. Perhaps the examples above give you an idea, though -- we have seen trivial groups and trivial vector spaces: objects consisting only of the zero element -- let's call them zero objects. So maybe we can talk about the images of morphisms from these trivial objects to other objects, and they would represent zero elements.

The idea behind a zero element is that it is "privileged" in some sense -- a morphism must preserve it. Furthermore, every object contains a zero element, so there must always be a morphism from the zero object to any object. These criteria fix a unique morphism from the zero object to any given object. In fact, this idea is captured by the following definition.
A universal object or initial object $I$ is an object such that for any object $X$ in the category, there is a unique morphism $I\to X$.
But there is also another idea behind the notion of a zero object, that it carries the "minimum information" compatible with the category's structure. Recall our interpretation of a morphism as something that either retains or discards information -- mapping an object to a zero object means discarding "as much information as possible".
A final object $F$ is an object such that for any object $X$ in the category, there is a unique morphism $X\to F$.
Finally, a zero object is defined as an object that is both initial and final.
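Here's a quick counting sketch of these definitions in finite sets, which (like topological spaces) have both an initial and a final object but no zero object. The helper `count_functions` and the short list of test objects are assumptions for illustration; the check only ranges over the objects listed.

```python
from itertools import product

def count_functions(dom, cod):
    """Number of functions dom -> cod, by explicit enumeration."""
    return sum(1 for _ in product(cod, repeat=len(dom)))

objects = [[], ["*"], [0, 1], [0, 1, 2]]

# Initial: exactly one morphism to every object. Final: exactly one from every object.
initial = [I for I in objects if all(count_functions(I, X) == 1 for X in objects)]
final = [F for F in objects if all(count_functions(X, F) == 1 for X in objects)]

print(initial)   # [[]]
print(final)     # [['*']]
```

The empty set is initial and the one-element set is final, and since there is no function from a non-empty set to the empty set, no object manages to be both.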

Perhaps a bit surprising at first (but trivially easy to show), the initial and final objects are each unique up to (unique) isomorphism -- just consider the unique morphisms between two initial (or two final) objects: their composites must be identities, since the identity is the only morphism from an initial object to itself (or from a final object to itself). The reason it's kinda incredible that we can do this in full generality is that e.g. we can immediately see that the trivial ring is not an initial object (because the ring of integers is, and the two are not isomorphic). Seen directly, the trivial ring fails to be initial because it can't be mapped to any non-trivial ring, something that seems like a technicality at first glance, arising from the fact that 1 must be preserved by ring homomorphisms.

But if you think about it, the category theoretic argument also comes from the same fact -- it comes from the fact that you can't map the ring of integers to its own zero element, because that doesn't preserve 1. But if you didn't mandate that ring homomorphisms preserve the multiplicative identity, then the ring of integers would no longer be an initial object.



Zero morphisms, kernels and equalizers

When defining a kernel of $f:X\to Y$, we're looking for a subobject of $X$ that maps to the zero element in $Y$. Since in category theory, a subobject is an injection (from an object we'd typically view as isomorphic to the subobject), we're looking for a monomorphism (subobject) $k$ that composes with $f$ to give us a morphism that maps everything to the zero element, whatever that means.

OK -- so what does it mean? What's a morphism that maps everything to the zero element? Recall that we're thinking of the "zero element" as the image of a morphism from the zero object (which really means we're identifying it with a subobject, i.e. an equivalence class of monomorphisms). So we can associate with our desired morphism $o:X\to Y$ the morphism from $X$ to the zero object, which then embeds into $Y$ -- so we define the zero morphism as their composition:
$$o:=X\to O\to Y$$
Where $O$ is the zero object.
$$\dots$$
There is an alternate, more general definition of a zero morphism for categories without a zero object that nonetheless captures the notion of a zero element.

Here's an alternate way to think about morphisms from and to a zero object. An initial morphism is essentially the "least surjective homomorphism possible". A final morphism is essentially the "least injective homomorphism possible". These are in line with our understanding of the zero object as the "smallest possible" object, or the one that contains the least information.

In line with the way we've thought of injectivity and surjectivity when defining monomorphisms and epimorphisms, we make the following definitions.
A morphism $f:X\to Y$ is an initial morphism (or right-zero morphism, or coconstant morphism) if for any $g,h: Y\to V$, $g\circ f = h \circ f$.
A morphism $f:X\to Y$ is a final morphism (or left-zero morphism, or constant morphism) if for any $g,h: W\to X$, $f\circ g = f\circ h$.
(Exercise: Check that the morphism from an initial object and the morphism to a final object satisfy this property.)

Further, one may observe that for $l:X\to Y$ final and $r:Y\to Z$ initial, $r\circ l$ is both an initial and a final morphism: $r\circ l\circ g=r\circ l\circ h$ because $l$ is final, and $g\circ r\circ l=h\circ r\circ l$ because $r$ is initial. So the right general notion of a "morphism that maps everything to the zero element" is "a morphism that is both initial and final", or a zero morphism.

$$\dots$$
Anyway, we're obviously interested in monomorphisms $k$ to $X$ such that $f\circ k=o$. But these don't all represent the kernel -- they could represent subobjects smaller than the kernel. So we define the kernel as follows:
A monomorphism $k:K\hookrightarrow X$ is the kernel of $f:X\to Y$ if
  1. $f\circ k=o_{KY}$
  2. For any $k':K'\hookrightarrow X$ such that $f\circ k'=o_{K'Y}$, $\exists! u:K'\to K, k\circ u = k'$.
More generally:
Given morphisms $f_i:X\to Y$, their equaliser is a monomorphism $k:K\hookrightarrow X$ such that:
  1. $f_i\circ k$ are equal for all $i$.
  2. For any $k':K'\hookrightarrow X$ such that $f_i\circ k'$ are equal for all $i$, $\exists! u:K'\to K, k\circ u = k'$.
The kernel can then be understood as the equaliser of a morphism with the zero morphism.
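As a concrete instance, here's a sketch computing the kernel of "multiply by 2" on $\mathbb{Z}/6$ as the equaliser of that map with the zero morphism. It works on underlying sets only (the equaliser of group homomorphisms is computed the same way, with the subset inheriting the group structure); the function name `equaliser` and the competitor `k2` are my own for this example.

```python
def equaliser(fs, X):
    """Elements of X on which all the maps in fs agree, with the inclusion into X."""
    K = [x for x in X if len({f[x] for f in fs}) <= 1]
    k = {x: x for x in K}
    return K, k

n = 6
X = list(range(n))
f = {x: (2 * x) % n for x in X}    # the homomorphism "multiply by 2" on Z/6
o = {x: 0 for x in X}              # the zero morphism Z/6 -> Z/6

K, k = equaliser([f, o], X)

# Any other k' with f . k' = o . k' factors through k.
k2 = {"s": 3, "t": 3}              # k' : K' -> X, landing inside the kernel
u = {s: k2[s] for s in k2}         # k is an inclusion, so u has the same values as k'

print(K)                                    # [0, 3]
print(all(k[u[s]] == k2[s] for s in k2))    # True
```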

Of course, these definitions rely on a key lemma: the kernel/equaliser is a subobject -- i.e. that for two kernels $k_1:K_1\to X$ and $k_2:K_2\to X$, $\exists i:K_1\leftrightarrow K_2, k_1=k_2\circ i$ -- this follows straightforwardly from the second condition in the definition.



Cokernels, coimages and quotienting by a subobject

I once said that one of the points of learning linear algebra was as an introduction to ideas that appear repeatedly throughout algebra. Here's where we really see this in action.

Recall the notion of a left null space $\mathrm{ker}(f^T)$ from linear algebra -- it sort-of represented the "space of constraints" on the image of a morphism, in that it was the orthogonal complement of the image $\mathrm{im}(f)$ in the co-domain. It represents the stuff that the image can't fall into -- or tying back into our understanding of morphisms as things that "cannot create information that wasn't already present in the domain", it represents the information the morphism hasn't created (whether or not it could have), a measure of non-surjectivity.

Well, "orthogonal complement of the image" is not something restricted to vector spaces -- we can understand it more generally, as a quotient. Interestingly, it would then no longer be a subobject, which I suppose is reasonable -- we don't really have a notion right now of what a transpose morphism is in a general category.

In general, though, the quotient we're looking for is not $Y/\mathrm{im}(f)$ -- clearly, that wouldn't exist in a lot of important categories, e.g. groups. That's only true in categories that seem "abelian" in some sense. Perhaps this is unsurprising, because we're thinking of the cokernel as "the information that the morphism has not created", and information is more than just elements of the set.

So the appropriate way to generalise "a quotient by the image" is to look at a quotient object of the codomain $Y$ (which, recall, is an epimorphism from $Y$) that composes with $f$ to produce the zero morphism. But dually to the case of the kernel, we must make sure that our quotient is the full, "most universal" one:
Given a morphism $f:X\to Y$, we define its cokernel as an epimorphism $\bar{k}:Y\twoheadrightarrow\bar{K}$ such that:
  1. $\bar{k}\circ f = o_{X\bar{K}}$
  2. For any other $k':Y\to\bar{K}'$ such that $k'\circ f = o_{X\bar{K}'}$, $\exists! u:\bar{K}\to\bar{K}'$ such that $k'=u\circ \bar{k}$.
Once again, the latter property shows that the equivalence class of these epimorphisms is indeed a quotient object.
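In abelian groups, where the cokernel really is $Y/\mathrm{im}(f)$, the definition can be checked on a small example. The sketch below again uses "multiply by 2" on $\mathbb{Z}/6$, represents cosets as frozensets, and factors a second homomorphism that kills the image through the cokernel. The names `coset`, `k_bar` and `k2` are hypothetical, and only this one example is verified.

```python
n = 6
Y = list(range(n))
f = {x: (2 * x) % n for x in range(n)}    # "multiply by 2" on Z/6
image = set(f.values())                   # {0, 2, 4}

def coset(y):
    """The coset y + im(f), as a frozenset so it can serve as a dict key."""
    return frozenset((y + b) % n for b in image)

k_bar = {y: coset(y) for y in Y}          # the quotient map Y -> Y/im(f)
zero = coset(0)

# Another homomorphism killing the image: "multiply by 3" on Z/6.
k2 = {y: (3 * y) % n for y in Y}
u = {k_bar[y]: k2[y] for y in Y}          # candidate map Y/im(f) -> Z/6

print(len(set(k_bar.values())))                 # 2
print(all(k_bar[f[x]] == zero for x in f))      # True
print(all(k2[f[x]] == 0 for x in f))            # True
print(all(u[k_bar[y]] == k2[y] for y in Y))     # True
```

The last check is what confirms that `u` is well defined: if `k2` took two different values on one coset, the dict would keep only one of them and the comparison would fail.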

In fact, this "composition forms the zero morphism" notion is the generalisation of the "quotient by an object" notion, or of the so-called exact sequence below. Generally, given a subobject $S$ given by some monomorphisms $s:S\to X$, the universal epimorphism $q:X\to Q$ that composes with these monomorphisms to form the zero morphism $q\circ s = o_{SQ}$ (i.e. the cokernel of $s$) defines the quotient $X/S$, if it exists.

$$O \to \ker f \to X \overset f \longrightarrow Y \to \operatorname{coker} f \to O$$
Oh, and an exact sequence is exactly the idea behind quotienting by an object.

$$\dots$$
Now recall the notion of a row space $\mathrm{im}(f^T)$. Once again, the row space is best interpreted as a quotient object of $X$ (in the case of linear algebra, the quotient by $\mathrm{ker}(f)$). But constructing it in terms of the kernel would clearly be quite complicated -- and perhaps not generally so useful. A better, more "general" approach is to think of an element of the row space as an equivalence class of elements that are mapped to the same element. So we define:
Given a morphism $f:X\to Y$, we define its coimage as an epimorphism $\bar{e}:X\twoheadrightarrow \bar{I}$ such that:
  1. $\exists f_{\bar{I}}:\bar{I}\to Y, f = f_{\bar{I}}\circ \bar{e}$
  2. For any other epimorphism $\bar{e}':X\twoheadrightarrow \bar{I}'$ and $f_{\bar I'}:\bar{I}'\to Y$ such that $f = f_{\bar I'}\circ \bar{e}'$, there $\exists! u: \bar{I}'\to\bar{I}, \bar{e}=u\circ \bar{e}'$
Which is again clearly a quotient object.
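The "equivalence classes of elements mapped to the same element" description can be sketched directly in finite sets: the coimage is the set of fibres of $f$. The function name `coimage` and the representation of classes as frozensets are my own choices for illustration.

```python
def coimage(f):
    """Quotient of the domain of f by 'has the same value under f'.
    Returns e_bar : X -> classes and f_bar : classes -> Y with f = f_bar . e_bar."""
    fibres = {}
    for x, y in f.items():
        fibres.setdefault(y, set()).add(x)
    classes = {y: frozenset(xs) for y, xs in fibres.items()}
    e_bar = {x: classes[y] for x, y in f.items()}
    f_bar = {cls: y for y, cls in classes.items()}
    return e_bar, f_bar

f = {"a": 1, "b": 1, "c": 3}             # f : X -> Y, with Y = {1, 2, 3, 4}
e_bar, f_bar = coimage(f)

print(len(set(e_bar.values())))                   # 2
print(e_bar["a"] == e_bar["b"])                   # True
print(all(f_bar[e_bar[x]] == f[x] for x in f))    # True
```

In sets the induced map `f_bar` is injective with the same values as $f$, so the coimage and the image end up isomorphic here, much as the row space and column space have the same dimension in linear algebra.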
