Intuition, analogies and abstraction

$$-1=\sqrt{-1}\sqrt{-1}=\sqrt{(-1)(-1)}=\sqrt{1}=1$$
I bet you've seen the fake "proof" above that minus one and one are equal. And the standard explanation as to why it's wrong is that the statement $\sqrt{ab}=\sqrt{a}\sqrt{b}$ only applies when $\sqrt{a}$ and $\sqrt{b}$ are real, or something like that (maybe only one of them needs to be real -- something like that -- who cares?).

But if you're like me, that isn't a very satisfactory explanation. Why does the identity not hold for complex numbers? For that matter, why does it hold for real numbers? Well, that is a good question, and one way of answering it would be to try and prove the identity for real numbers, and see what properties of the real numbers (or of the real square root, in particular) you use. And if this article were being filed under "MAR1104: Introduction to formal mathematics", that's how I might explain things -- but that doesn't give us too much insight -- not about square roots and complex numbers, anyway.

Let's think about what $\sqrt{ab}=\sqrt{a}\sqrt{b}$ means.

What does the square root of a real number mean, anyway? It's some property related to multiplying a real number by itself. What does multiplication mean? What does a real number mean? The picture I have in my head of the real numbers is of a line. But what exactly is this line? -- the real numbers are just a set. Why did you put them on this line in this specific way? In doing so, you gave the real numbers a structure, a specific type of structure called an "order", defined by the operation $<$.

But there are other ways to think about/structure the real numbers. One way is to think of real numbers as (one-dimensional) scalings. You can scale things like mass and volume, and represent each scaling by a real number. Scaling a mass by 2 is equivalent to multiplication by 2. So this gives the real numbers a multiplicative structure, defined by the operation $\times$ (or whatever notation -- or lack thereof -- you prefer). And the "real line" then just represents the image of "1" under all scalings.

So the way to think about square roots is to think of numbers as linear transformations called scalings, and to think about the scaling that, when done twice, gives you the number you're taking the square root of. So what's $\sqrt{-1}$? What's $-1$? Multiplicatively, $-1$ is a reflection. What's its square root? Try to think of a (linear!) transformation that when done twice gives you a reflection. It can't be done in one dimension, but in two dimensions a quarter-turn rotation does the job. And can you think of another such transformation? Can you prove these are the only two? Are you sure -- what about if you add a dimension?
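If you'd like to check a candidate answer by machine, here is a minimal sketch. It assumes we represent linear transformations of the plane as 2x2 matrices acting on column vectors; the names `matmul`, `QUARTER_TURN`, `QUARTER_TURN_BACK` and `MINUS_ONE` are mine, purely for illustration.

```python
def matmul(p, q):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Quarter turn anticlockwise: sends (1, 0) to (0, 1) and (0, 1) to (-1, 0).
QUARTER_TURN = ((0, -1), (1, 0))
# Quarter turn clockwise: the same rotation in the other direction.
QUARTER_TURN_BACK = ((0, 1), (-1, 0))
# Multiplication by -1 on the plane: every point goes to its opposite.
MINUS_ONE = ((-1, 0), (0, -1))

# Doing either quarter turn twice gives multiplication by -1.
print(matmul(QUARTER_TURN, QUARTER_TURN) == MINUS_ONE)
print(matmul(QUARTER_TURN_BACK, QUARTER_TURN_BACK) == MINUS_ONE)
```

This only confirms that the two quarter turns work; it says nothing about whether they are the only ones.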

So the natural way to think about square roots of numbers that may or may not be complex is with so-called "Argand diagrams" on the complex plane -- the image of "1" under multiplication by all complex numbers.

[Interactive Desmos graph: $a$, $b$, $ab$ and their square roots on the unit circle]

To simplify things, consider only unit complex numbers (this is okay, because every complex number can be written as the product of a unit complex number and a non-negative real number). Multiplying by the product of complex numbers $a$ and $b$ involves rotating by $a$, then rotating by $b$. The square roots of $a$ and $b$ go half as far around the circle as $a$ and $b$ do, and the square root of $ab$ goes half as far around the circle as $ab$.

So it seems like the identity should hold, doesn't it? $\sqrt{ab}$ goes half as much as $a$ and $b$ put together -- this seems to be exactly what $\sqrt{a}\sqrt{b}$ does -- go around half as much as $a$, then half as much as $b$. Isn't $\frac{\theta+\phi}2=\frac{\theta}2+\frac{\phi}2$?

The problem is that $\sqrt{ab}$ doesn't really go $\frac{\theta+\phi}2$ around the circle if $\theta+\phi$ reaches $2\pi$ or more (measuring arguments in $[0,2\pi)$). You can see this in the Desmos diagram above -- $ab$ has gone a full circle, and its square root is defined to halve the argument of $ab$, but that argument isn't given by $\arg (ab)=\arg (a) + \arg (b)$, rather:

$$\arg (ab) \equiv \arg (a) + \arg (b) \pmod{2\pi}$$
But halving is not an operation that the $\bmod$ equivalence relation respects -- not in general, anyway. It is not true that

$$\arg (ab)/2 \equiv (\arg (a) + \arg (b))/2 \pmod{2\pi}$$
Instead:

$$\arg (ab)/2 \equiv (\arg (a) + \arg (b))/2 \pmod{\pi}$$
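Here is a quick numerical sketch of that sign ambiguity, using Python's standard `cmath` module. One caveat: `cmath.sqrt` takes the principal branch, with arguments measured in $(-\pi,\pi]$ rather than $[0,2\pi)$, so in this sketch the wrap-around happens once the arguments add up to more than $\pi$. The helpers `unit` and `compare` are hypothetical names of mine.

```python
import cmath
import math

def unit(angle):
    """Unit complex number at the given angle (in radians)."""
    return cmath.exp(1j * angle)

def compare(theta, phi):
    """Return (sqrt(ab), sqrt(a) * sqrt(b)) for unit a, b at angles theta, phi."""
    a, b = unit(theta), unit(phi)
    return cmath.sqrt(a * b), cmath.sqrt(a) * cmath.sqrt(b)

# Arguments add up to pi/2: no wrap-around, so the identity holds.
small = compare(math.pi / 6, math.pi / 3)
# Arguments add up to 3*pi/2: the argument of ab wraps, and the two
# sides differ by half a turn, i.e. by a factor of -1.
large = compare(3 * math.pi / 4, 3 * math.pi / 4)

print(cmath.isclose(small[0], small[1]))
print(cmath.isclose(large[0], -large[1]))

# The fake "proof" itself: the two sides really are different numbers.
print(cmath.sqrt(-1) * cmath.sqrt(-1), cmath.sqrt((-1) * (-1)))
```

The two sides always agree up to a factor of $\pm1$, which is exactly what a congruence mod $\pi$ on the arguments says.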
Let's recall from basic number theory the general result about multiplying congruences on the integers. If $a\equiv b\pmod{m}$, then $na\equiv nb \pmod{nm}$, certainly, and also $na\equiv nb \pmod{m}$ if $n$ is an integer*. But $1/2$ isn't an integer, which is why only the former result is relevant.

This is also why $(ab)^2=a^2b^2$ does hold for complex numbers.

*when $n$ isn't an integer, we need $na$, $nb$ to be integers for the statement to even be well-defined in standard number theory, and then you have a result for division on mods involving $\gcd(d,m)$, etc. This isn't a concern for us here because we're dealing with divisibility over the reals -- if you want to be formal, a real number is divisible by another real number if the former can be written as an integer multiple of the latter.
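To see the two rules side by side, here is a small sketch with exact fractions. It uses the "integer multiple" notion of divisibility from the footnote; the function `congruent` and the sample values 7, 1 and 6 are my own choices for illustration.

```python
from fractions import Fraction

def congruent(x, y, modulus):
    """True if x - y is an integer multiple of modulus (fractions allowed)."""
    return ((Fraction(x) - Fraction(y)) / Fraction(modulus)).denominator == 1

a, b, m = 7, 1, 6
half = Fraction(1, 2)

print(congruent(a, b, m))                       # 7 and 1 differ by 6
print(congruent(half * a, half * b, half * m))  # halves differ by 3, modulus halved to 3
print(congruent(half * a, half * b, m))         # 3 is not a multiple of 6
print(congruent(2 * a, 2 * b, m))               # integer multiplier: modulus 6 survives
```

Halving both sides keeps the congruence only if the modulus is halved too, which is the $2\pi$ versus $\pi$ story from before.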

So there you have it -- I just demonstrated a very fundamental analogy between two seemingly unrelated ideas, complex numbers and modular arithmetic: square roots of complex numbers don't multiply naturally, because mod doesn't respect division. It's almost as if somehow, magically, exactly the same kind of math was used to derive results, to prove things, about these unrelated objects.

As if they're just two instances of the same thing.

I wonder what that thing could be.



Let's talk about something completely unrelated (no, genuinely -- completely unrelated -- I won't tell you this is an instance of the "same thing" too). Let's talk about logical operators, specifically: do $\forall$ and $\exists$ commute? I.e. is $\forall t, \exists s, P(s,t)$ equivalent to $\exists s, \forall t, P(s,t)$?

You just need to read the statements aloud to realise they don't. To use a classical example, "all men have wives" and "there is a woman who is the wife of all men" are two very different statements (okay, in this case both statements are false, so they're equivalent in that sense, but you get my point).

But let's think more deeply about why they don't commute. What do $\forall t, \exists s, P(s,t)$ and $\exists s, \forall t, P(s,t)$ mean, anyway? $\forall$ and $\exists$ are just infinite $\land$ and $\lor$ statements, i.e. $\forall t$ is just an $\land$ statement ranging over all possible values that $t$ can take and $\exists s$ is just an $\lor$ statement ranging over all possible values $s$ can take.
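On a finite domain this is quite literal: Python's built-in `all` and `any` are just repeated "and" and "or". Here is a minimal sketch, where the domain $\{0,1,2\}$ and the predicate "$s$ equals $t$" are my own choices for illustration.

```python
def forall_exists(P, ss, ts):
    """For every t there is some s with P(s, t): an 'and' of 'or's."""
    return all(any(P(s, t) for s in ss) for t in ts)

def exists_forall(P, ss, ts):
    """There is some s with P(s, t) for every t: an 'or' of 'and's."""
    return any(all(P(s, t) for t in ts) for s in ss)

def is_equal(s, t):
    return s == t

domain = range(3)

print(forall_exists(is_equal, domain, domain))  # every t is equal to some s
print(exists_forall(is_equal, domain, domain))  # no single s equals every t
```

The two functions differ only in the order of the nesting, and on this example they already disagree.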

So $\forall t, \exists s, P_{st}$ just means (letting $s$ and $t$ be natural numbers for simplicity, but they don't have to):

$$({P_{11}} \lor {P_{21}} \lor ...) \land ({P_{12}} \lor {P_{22}} \lor ...) \land ...$$
And $\exists s, \forall t, P(s,t)$ means:

$$({P_{11}} \land {P_{12}} \land ...) \lor ({P_{21}} \land {P_{22}} \land ...) \lor ...$$
This is a bit complicated, so let's instead look at the simpler case where you have only 2 by 2 statements -- i.e. just construct the analogy between $\forall,\exists$ and actual $\land,\lor$ statements.

So the question is if:

$$({P_{11}} \lor {P_{21}}) \land ({P_{12}} \lor {P_{22}}) \Leftrightarrow ({P_{11}} \land {P_{12}}) \lor ({P_{21}} \land {P_{22}})$$
This is interesting. Maybe you see where this is going. Let me just do a notation change -- I'll use "$\times$" for $\land$, "$+$" for $\lor$, "$=$" for $\Leftrightarrow$, and some new letters for the propositions. Under this new notation, where $\times$ is invisible as always, we're asking if:

$$(a + b)(c + d) = ac + bd$$

Aha! This is Freshman's dream, isn't it? And we know it's not true -- it's a dream, after all, don't be delusional -- and we know why it's not true too.

But wait -- we aren't talking about elementary algebra here. I just gave you some silly notation and made it look like Freshman's dream. But here's the thing: the proof (or algebraic proof -- a counter-example is also a proof, but that isn't so interesting... not here, anyway) that these propositions aren't equivalent is exactly the same as in algebra. We expand out the brackets (because we know that $\land$ distributes over $\lor$ -- we also know that $\lor$ distributes over $\land$, incidentally, something that is not true in standard algebra) and point out that there are extra terms, and point out that these extra terms change the value of the expression (they aren't zero).
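For the sceptical, here is a brute-force sketch over all sixteen truth assignments. It checks that expanding the brackets is legitimate, and then lists the assignments where the "dream" fails. The function names `lhs`, `rhs` and `expanded` are just labels I've made up.

```python
from itertools import product

def lhs(a, b, c, d):
    """(a + b)(c + d), reading + as 'or' and juxtaposition as 'and'."""
    return (a or b) and (c or d)

def rhs(a, b, c, d):
    """ac + bd."""
    return (a and c) or (b and d)

def expanded(a, b, c, d):
    """ac + ad + bc + bd: the brackets multiplied out."""
    return (a and c) or (a and d) or (b and c) or (b and d)

assignments = list(product([False, True], repeat=4))

# Expanding the brackets never changes the truth value.
expansion_ok = all(lhs(*v) == expanded(*v) for v in assignments)
# The extra terms ad and bc matter exactly on these assignments (a, b, c, d).
counterexamples = [v for v in assignments if lhs(*v) != rhs(*v)]

print(expansion_ok)
print(counterexamples)
```

The failures are exactly the cases where one of the cross terms $ad$ or $bc$ is true while $ac$ and $bd$ are both false.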

So there's some kind of relationship between the boolean algebra and an elementary algebra. A lot of proofs that can be done in one of these algebras can be written almost identically in the other. Not all these proofs, mind you -- then the algebras would just be isomorphic to each other -- but some of them can. Maybe a lot of important ones can.

An abstraction that produces such proofs simultaneously for both elementary algebra and boolean algebra may be more complicated than you think -- there's no real sense in which a term is "always zero" in boolean algebra. Take, for instance, distributivity of $\lor$ over $\land$ -- $a+bc=(a+b)(a+c)$. This is not true in elementary algebra, because the extra term $ab+ac$ is not always equal to zero (the $a^2$ in place of $a$ is not really the issue, because $a^2=a$ for $a\in\{0,1\}$ -- but $a(b+c)=0$ is not true for all $a,b,c\in\{0,1\}$). In boolean algebra the extra term isn't zero either; it's just that it leaves the value of the existing terms unchanged in this specific instance, since $a+ab+ac=a$.
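A small sketch comparing the two readings of $a+bc=(a+b)(a+c)$ on inputs from $\{0,1\}$: once with $+$ as "or" and juxtaposition as "and", once as ordinary integer arithmetic. The variable names are mine.

```python
from itertools import product

bits = [0, 1]

# Boolean reading: + is "or", juxtaposition is "and".
boolean_holds = all(
    (a or (b and c)) == ((a or b) and (a or c))
    for a, b, c in product(bits, repeat=3)
)

# Ordinary integer arithmetic on the same inputs.
integer_failures = [
    (a, b, c)
    for a, b, c in product(bits, repeat=3)
    if a + b * c != (a + b) * (a + c)
]

print(boolean_holds)
print(integer_failures)
```

The integer version fails exactly when $a(b+c)\ne0$, that is, when $a=1$ and at least one of $b$, $c$ is 1.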



I've just illustrated two examples here -- the first one is a type of group, by the way, but you've probably seen dozens of other such "connections between different areas of mathematics" yourself. I've made these sorts of analogies fundamental to a lot of the articles I've written here (I think). You might've just thought of them as interesting insights, but in reality, abstract mathematics/abstract algebra -- or really just mathematics in general -- is all about these analogies.

In a sense, mathematics is largely about abstraction. I mean, that's not what mathematics fundamentally is -- fundamentally, math is just logic -- but it's how mathematics largely functions. Whenever one talks of axioms, you could think of them as fundamental defining ideas of mathematical objects, and you can also think of them as "interfaces" between mathematics and reality (see my introduction to linear transformations). There are a massive number of different physical phenomena that we can study, and rather than prove everything from scratch for each one of them, it is much better -- and more insightful in terms of understanding the connections between things -- to show that they satisfy a certain set of axioms that apply to a whole range of things, and then deduce that all the logical consequences of these axioms -- all theorems -- are satisfied by the objects.

If we can do that with physical phenomena, we can surely do it with mathematical phenomena too -- instead of proving something from scratch for every new mathematical object, we prove that it is a group, or a ring, or a field, or a module, or an algebra, or a topology, or a geometry of some sort, by verifying it matches the axioms -- and then use all the abstract knowledge we have about these things and deduce they must necessarily apply to our new object, because they are logical consequences of our axioms.

Abstract mathematics is, in this sense, all about generalising things by finding the "smallest set of axioms" the thing requires.

(Well, not really -- the most general statement is "true", and everything else is just a logical deduction from this statement. So in that sense mathematics is all about finding special cases. But in order to know what to take a special case of, and what special case that "what" is of "true", you need to generalise.)

List some weird analogies you've seen before in math. Something about divisibility sound familiar?
