Showing posts with label philosophy. Show all posts

A crash course on mathematical logic: Part I

Update: the up-to-date version of this is now available as a post on LessWrong: Meaningful things are stuff the universe possesses a semantics for


[post also on math stackexchange]

Revisiting my Godel/model theory questions after two years -- let me put down some thoughts on the generality in which I think these theorems should be viewed, in a way that naturally addresses the standard philosophical questions asked around this stuff, like:

  1. Godel's incompleteness theorem sounds a lot like the Halting problem -- are they analogous or equivalent in any fundamental way?

  2. Does Godel's incompleteness theorem or the Halting problem or the Entscheidungsproblem mean that we can't know everything?

  3. The Entscheidungsproblem seems to "mix" Godel's incompleteness theorem with Turing completeness somehow -- it says that not only are there theorems which cannot be proven or disproven, even among those that can be proven, there is no algorithm that determines if a theorem can be proven. Where does this really fit in in this whole thing?

  4. Why does the term "first-order logic" (actually "first-order arithmetic") keep showing up in these contexts? How important is first-order logic to Godel?

  5. If "true" is different from "provable", what does "true" even mean?

  6. How do we know all these things? Are we beyond formal systems? Are we beyond computers? Wait first of all are we comparable to formal systems or to computers?

  7. Does Godel mean there cannot be a Theory of Everything/that we cannot find the Theory of Everything?

Out of these questions, only the last is truly a bad question (we know the rules governing computers anyway, that doesn't mean we can figure out their arbitrary-time behaviour), the other ones are quite good.

Standard responses to these questions take the form "Godel's theorems are very specific statements about some formal systems, it's not useful to overgeneralize them beyond this", etc. This is the MTC-CYA ("minimal, technically correct, covers-your-ass") answer, and I think it's quite narrow and does a disservice to just how basic computation is in our world.

I also don't think it is useful to get straight into specifics about models and theories, logics and languages, etc. because Godel's incompleteness theorem is a lot more general. If you say "Peano arithmetic is incomplete because there are non-standard models", but then second-order arithmetic doesn't have non-standard models, and sure there's another reason why that's incomplete, but somehow we still know that it has no non-standard models so are we beyond computers? (no, we just believe in things that don't believe in themselves) Then you have people saying "Oh, Godel's incompleteness isn't that bizarre -- it's like how group theory has many models!" AND WHY WOULD YOU EVEN MENTION NON-STANDARD MODELS -- OR ANY KIND OF MODELS -- IN THE FIRST PLACE? These are special ways in which incompleteness manifests, but incompleteness is a lot more general than that.


Table of contents:

  1. Godel's first theorem: Imagine a rebellious computer. Panic.

  2. Godel's second theorem: why does the first theorem sound wrong?

  3. The Entscheidungsproblem; Turing degrees

  4. First-order arithmetic; arithmetical hierarchy

  5. Tarski: truth, interpretation and language

  6. Exercise: Lob's theorem

  7. Exercise: Related paradoxes

  8. Further reading


Godel's first theorem: Imagine a rebellious computer. Panic.

The right way to understand Godel's incompleteness theorem is to entertain all those philosophical questions about how it applies to the human mind -- and regard it as a statement far more generally about an agent with beliefs. We can reproduce the Halting argument directly for a human mind:

Alice cannot predict the actions of Bob, where Bob is a computer (resp. person) programmed (resp. committed) to act as follows: I will read Alice's mind, and do something other than what she predicts I will do.

In particular, Bob can be identical to Alice, so Alice cannot predict her own behaviour.

This applies regardless of what system Alice uses to form her beliefs.

Alice might adopt an axiomatic framework, say ZFC, which is capable of expressing Bob's mind and believe all the theorems of that axiomatic framework (i.e. Alice is an Oracle for ZFC), and Bob can still trick her. (This is the classic Godel's incompleteness theorem.)

Alice might have some pre-defined algorithm to decide Bob's behaviour, and believe whatever this algorithm outputs, and Bob can still trick her. (This is the Halting problem.)
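
The argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_bob`, `outcome` and the two sample Alices are hypothetical names introduced here, "reading Alice's mind" is modelled as simply calling her prediction function, and Python's recursion limit stands in for "Alice never reaches a verdict".

```python
from typing import Callable

ACTIONS = ("hum", "not hum")
Program = Callable[[], str]
Predictor = Callable[[Program], str]

def make_bob(alice: Predictor) -> Program:
    """Build a Bob who reads Alice's prediction about him and does something else."""
    def bob() -> str:
        predicted = alice(bob)  # "read Alice's mind"
        return next(a for a in ACTIONS if a != predicted)
    return bob

def confident_alice(program: Program) -> str:
    # Commits to an answer without looking at the program.
    return "hum"

def simulating_alice(program: Program) -> str:
    # Tries to predict the program by running it.
    return program()

def outcome(alice: Predictor) -> str:
    """Classify how Alice fares against the Bob built from her own predictor."""
    bob = make_bob(alice)
    try:
        prediction = alice(bob)
    except RecursionError:
        return "incomplete"  # Alice never settles on a belief
    return "sound" if prediction == bob() else "unsound"

print(outcome(confident_alice))
print(outcome(simulating_alice))
```

Whatever body you give `alice`, `outcome` should never come back as `"sound"`: either she commits to an answer and Bob contradicts it, or she never commits at all.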

So this general Godel's theorem really does tell you that "you can't know everything". If your "beliefs" work by assigning "True" or "False" to every statement, this means either (1) there are statements that you don't know are true or false, i.e. incompleteness (2) there are statements that you're wrong in your beliefs about, i.e. unsoundness.

All that is required is that (1) you are capable of conceptualizing statements about computers (2) computers are able to read your mind -- in the case of formal systems, this means your theorems are computably enumerable. In the case of a formal system, the only statements you necessarily cannot know are the ones about long-term behaviour, because the way that a computer reads your mind is by enumerating your theorems for an indefinite amount of time -- for real minds which can just be scanned in a flash, incompleteness is even worse.


Godel's second theorem: why does the first theorem sound wrong?

Even with Godel's incompleteness theorem written down explicitly for a human being, it is still tempting to think you are beyond it.

I mean, sure, suppose Bob has two choices: to "hum" or to "not hum" at any given moment, and he decides to adopt his usual demeanor of tricking Alice: he will keep choosing "hum", but if Alice ever becomes certain that he will always choose "hum", he will choose "not hum" in that very moment.

So Alice can never become certain that Bob will keep choosing "hum", and yet that is precisely what Bob does.
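
Here is a toy simulation of the humming game, assuming Alice's beliefs arrive as an enumeration of theorems and Bob checks one new theorem per moment. The names `CLAIM`, `bob_actions`, `cautious_alice` and `overconfident_alice` are made up for this sketch, and a finite number of steps stands in for "forever".

```python
from itertools import count
from typing import Callable, Iterator, List

CLAIM = "Bob always hums"

def bob_actions(alice_theorems: Callable[[], Iterator[str]], steps: int) -> List[str]:
    """Bob hums each moment, unless Alice has just proved that he always hums."""
    actions: List[str] = []
    theorems = alice_theorems()
    for _ in range(steps):
        theorem = next(theorems, None)  # read the next thing Alice believes
        if theorem == CLAIM:
            actions.append("not hum")   # rebel in that very moment
            break
        actions.append("hum")
    return actions

def cautious_alice() -> Iterator[str]:
    # Only ever proves harmless arithmetic facts; never commits to CLAIM.
    for n in count():
        yield f"{n} = {n}"

def overconfident_alice() -> Iterator[str]:
    yield "0 = 0"
    yield "1 = 1"
    yield CLAIM

print(bob_actions(cautious_alice, 5))
print(bob_actions(overconfident_alice, 5))
```

Against the cautious Alice, the claim is true but never believed; against the overconfident Alice, the claim is believed and immediately made false.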

But surely Alice can see that! Surely Alice can see the same argument we just saw for why Bob will keep choosing "hum"?

What's going on?

Let's write down the argument a bit more formally to see what's wrong.

Represent Alice's axioms by $A$, and "Bob will always hum" by $B$. Then how do we "know" $B$ is true? Well, if it weren't -- if Bob were to have ever chosen to "not hum" -- then Alice must have found a proof that Bob will always choose "hum". We believe that $A$ is a true model of reality -- i.e. if it proves something, that thing is really true in the real world (or rather in our belief system, or whatever). In other words, we argue: $\lnot B \implies (A\vdash B) \implies B$, which is a contradiction, and we conclude that $B$ must be true.

The only additional assumption we made was $A$'s soundness -- we assumed $(A\vdash B)\implies B$. So that is the assumption Alice can't prove -- her own soundness. Technically, Godel's second incompleteness theorem differs from the first only in this canonical choice of unprovable statement.

Repeat this to yourself: stronger theories are not smarter, they're just more confident. So the theory ZFC+"ZFC is consistent" can prove more things than ZFC, but it's not like ZFC doesn't know that ZFC+"ZFC is consistent" proves these things. It can read a stronger theory's mind, it just doesn't believe what it sees.


The Entscheidungsproblem; Turing degrees

So can the Entscheidungsproblem -- the problem of determining whether a given statement is a theorem of the theory -- be resolved in general by a computer?

No, because then in particular, we would be able to determine if a given Turing machine halts (as that can be formulated as a statement of ZFC/whatever). This is called a "Turing reduction" from one problem to another (in this case, from the Halting problem to the Entscheidungsproblem). There is also a Turing reduction from the Entscheidungsproblem to the Halting problem -- to determine if a statement is provable, just determine if the computer that searches for its proof halts.
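
The second reduction ("determine if the computer that searches for its proof halts") can be made concrete with a toy formal system. The sketch below swaps in Hofstadter's MIU system for ZFC, purely because its rules fit in a few lines; `successors`, `proof_search` and the `max_theorems` cut-off are names and choices made for this illustration. Without the cut-off, the search halts exactly when the target is a theorem.

```python
from collections import deque
from typing import Optional, Set

AXIOM = "MI"

def successors(s: str) -> Set[str]:
    """All strings derivable from theorem `s` by one MIU rule."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                        # xI    -> xIU
    out.add("M" + s[1:] * 2)                    # Mx    -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])    # xIIIy -> xUy
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])          # xUUy  -> xy
    return out

def proof_search(target: str, max_theorems: Optional[int] = None) -> Optional[bool]:
    """Breadth-first enumeration of theorems; True once `target` shows up.

    With max_theorems=None this is a genuine semi-decision procedure: it may
    run forever. With a cut-off it returns None, meaning "no verdict yet".
    """
    seen = {AXIOM}
    queue = deque([AXIOM])
    explored = 0
    while queue:
        if max_theorems is not None and explored >= max_theorems:
            return None
        theorem = queue.popleft()
        explored += 1
        if theorem == target:
            return True
        for nxt in successors(theorem):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False  # only reachable if the set of theorems is finite

print(proof_search("MIU", max_theorems=1000))
print(proof_search("MU", max_theorems=2000))
```

The string `MU` is famously not a theorem of this system, but the search itself can never tell you that: all it can do is keep going. An Oracle for Halting applied to `proof_search(target)` is exactly what would turn this into a decision procedure.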

A Turing reduction in both ways is called "Turing equivalence", which is an equivalence relation -- an equivalence class under this equivalence is called a "Turing degree". In particular, the Turing degree of computable problems is 0; the Turing degree of the Halting problem is called the "Turing jump" of 0, denoted 0', and the problems computable once given access to an Oracle for Halting are exactly those of degree at most 0'.

Remember how the second condition of the Godelian argument was that (2) computers are able to read your mind? So an Oracle for Halting can decide the halting of regular computers because regular computers cannot read its mind -- but it cannot predict the halting of computers that have access to the Oracle. So you have this infinite chain of Turing jumps.

An Oracle for PA, an Oracle for ZFC, an Oracle for Halting are all Turing-equivalent.


First-order arithmetic; arithmetical hierarchy

Remember how the first condition of the Godelian argument was that (1) you are capable of conceptualizing statements about computers?

The basic reason we care about first-order arithmetic is that it is capable of conceptualizing statements about computers. Note that it is not the "smallest" such system -- rather, it is capable of expressing the entire arithmetical hierarchy.

  • a $\Sigma_1$ decision problem is one given by a rule of the form $\exists (...), P(x,(...))$, where $P$ is a computable predicate (i.e. input $x$ returns True if this condition is satisfied, False otherwise)
  • a $\Pi_1$ decision problem is one given by a rule of the form $\forall (...), P(x,(...))$
  • a $\Sigma_2$ decision problem is one given by a rule of the form $\exists (...), \forall (...), P(x,(...))$
  • a $\Pi_2$ decision problem is one given by a rule of the form $\forall (...), \exists (...), P(x,(...))$
  • a $\Sigma_3$ decision problem is one given by a rule of the form $\exists (...), \forall (...), \exists (...), P(x,(...))$
  • a $\Pi_3$ decision problem is one given by a rule of the form $\forall (...), \exists (...), \forall (...), P(x,(...))$

[this is called prenex normal form, by the way]

Note how $\Sigma_1$ is equivalent to computable enumerability; being both $\Sigma_1$ and $\Pi_1$ is equivalent to computability (do you see why?). Essentially, $\exists$ corresponds to the operation of a computer, and $\forall$ corresponds to the operation of an Oracle -- so $\Sigma_{n+1}$ is equivalent to problems that are computably enumerable by a Turing machine with access to an Oracle for $\varnothing^{(n)}$.

This basic connection between first-order arithmetic and Turing degrees is known as Post's theorem.
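
For the "do you see why?" above, here is one way to see it, as a sketch. If a set is $\Sigma_1$ and its complement is also $\Sigma_1$ (which is what being $\Pi_1$ amounts to), then you can search for a "yes" witness and a "no" witness in parallel, and one of the two searches must succeed. The function names and the perfect-square example are assumptions chosen for illustration; inputs are assumed to be natural numbers.

```python
from itertools import count
from typing import Callable

Witness = Callable[[int, int], bool]

def decide(x: int, yes_witness: Witness, no_witness: Witness) -> bool:
    """Decide membership by dovetailing two unbounded witness searches.

    Assumes: x is in the set  iff  some n has yes_witness(x, n),
             x is not in it   iff  some n has no_witness(x, n).
    Under that assumption the loop always terminates.
    """
    for n in count():
        if yes_witness(x, n):
            return True
        if no_witness(x, n):
            return False
    raise AssertionError("unreachable: count() is infinite")

# Toy instance: the set of perfect squares among the natural numbers.
def square_witness(x: int, n: int) -> bool:
    return n * n == x                       # "exists n with n^2 = x"

def nonsquare_witness(x: int, n: int) -> bool:
    return n * n < x < (n + 1) * (n + 1)    # "exists n with n^2 < x < (n+1)^2"

print([x for x in range(20) if decide(x, square_witness, nonsquare_witness)])
```

With only `yes_witness` available, the same loop is merely a semi-decision procedure, which is the sense in which $\Sigma_1$ alone corresponds to computable enumerability.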


Tarski: truth, interpretation and language

[This should not take so long to explain, but I chose this section to actually start getting formal, and to start specifying to actual formal systems.]

You know how we can define really big numbers with very few letters, like with Knuth's uparrow notation? You might wonder what the largest number is that you can express with only, say, 1000 characters.

Oh, wait -- but whatever that number might be, I can express a larger number, also with less than 1000 characters, by writing: "The largest number you can express with 1000 characters, plus one".

This is Berry's paradox, by the way.

Let's think about it more formally. What we have is a map $f$ that assigns for each short formula some number -- then what does the string "max f + 1" represent?

$$\texttt{equals one} \to 1$$

$$\texttt{fourth fermat prime} \to 257$$

$$\texttt{is even} \to \mathrm{NaN}$$

$$\texttt{is smaller than itself} \to \mathrm{NaN}$$

$$\texttt{kumquat} \to \mathrm{NaN}$$

$$\texttt{square root of four} \to 2$$

$$\texttt{max f + 1} \to \, ?????$$

For the paradox in natural language, one can just say -- well, it's natural language, the $f$ we're thinking of -- the $f$ that behaves as/assigns values to strings in a manner consistent with what we expect, just doesn't exist. But one can also talk about such an $f$ in a more formal context. For any formula $\texttt{x}$ of <1000 characters, define:

$$ f(\texttt{x}) = \begin{cases} n & \mathrm{if} \, x\, n, \, \forall m, \, x \, m \implies m = n \\ 0 & \mathrm{if} \, \lnot\exists n, \, (x\, n, \, \forall m, \, x \, m \implies m = n)\end{cases}$$

Then $f(\texttt{max f + 1}) = \max f + 1$, which is a contradiction. What's wrong?

Note that getting $x$ from $\texttt{x}$ (interpreting what a string actually says) is perfectly acceptable -- it's a very simple and silly-looking operation, where you say the equal sign really means equals, the exist sign really means exists, the forall sign really means forall -- it's called the T-schema.

What really goes wrong is when you try to condition on $x\, n$ -- when you try to define a general predicate (like $f$) on all strings $\texttt{x}$ that tells you whether or not what the string says actually holds. A minimal working example is -- for any formula $\texttt{x}$, define:

$$ t(\texttt{x}) = \begin{cases} 1 & \mathrm{if} \, x \\ 0 & \mathrm{else}\end{cases}$$

$t$ is called the truth predicate, and it's not definable. Or in other words -- the set of Godel numbers of true sentences is not an arithmetical set. This is "Tarski's theorem". People like to write Tarski's theorem as "truth in the standard model cannot be defined in the theory", but I don't really like that pedagogically, even though I know those are technically the same. You would think that you could define a predicate on propositions that is true iff the proposition is true -- maybe PA doesn't know which propositions are true, and so it won't know the value of this predicate either, but apparently we can't even define it; it's just not an FOL formula. So "the largest number definable by a FOL formula satisfying some property" just isn't an FOL formula.
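
For a sketch of why no such $t$ can be definable, assume the diagonal lemma (not proven here), which for any definable predicate produces a sentence talking about its own string. Applied to $\lnot t$, it gives a sentence $l$ (a name introduced just for this sketch) with:

$$l \iff \lnot t(\texttt{l})$$

But $t$ is supposed to satisfy $t(\texttt{x}) \iff x$ for every sentence, in particular:

$$t(\texttt{l}) \iff l$$

Chaining the two:

$$t(\texttt{l}) \iff l \iff \lnot t(\texttt{l})$$

which is the liar paradox. So the only way out is that $t$ was never a formula of the language to begin with.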

[Perhaps something that will help make Tarski's theorem a little less unintuitive -- you can define a truth predicate on all $\Sigma_k$ sentences; you just can't quantify on all sentences. There is a certain analogy to set theory, where you cannot quantify on all sets.]

This is all a bit vague, because we keep saying "A theory cannot define a truth predicate", but if you can't define a truth predicate, how are WE even talking about a truth predicate, and what does it mean that a theory can't define a truth predicate if it can't even define it?

The thing is that all this while -- and I don't just mean in this answer, but in general, in math, in everything you've ever done and thought about, we've always been talking about a theory from the perspective of another. Philosophically, this "relative" perspective is exactly what meaning is -- which is why this theory of interpretations is semantics.

[Sometimes this relative picture is lost in introducing semantics, and people just say semantics is about giving meaning to formal systems, without specifying that this meaning is within another theory. Because why not? We do all other math without annoyingly mentioning "this is done within ZFC" -- this is what we mean by a "foundation" for mathematics, a sufficiently powerful theory that believes the beliefs of most theories of interest to us in math, we can just study formal systems like we study any other mathematical structure. Thinking about semantics this way leads to the whole discipline of "model theory" especially "abstract model theory", and it turns out that model theory has a lot of interesting math of its own -- blabla compactness blabla Lowenheim-Skolem ... ]

There is no one way to define semantics, just like there is no one way to define "agents with beliefs" (although formal systems are fairly general, you know, they're not perfect). In the most general setting, a formal system is a tuple $(L, T)$, where $L$ is a set of sentences called a "language", which represents everything the agent can imagine or express, and $T\subseteq L$ is some subset which we call the "theorems", which represents everything the agent believes. This answer provides a fairly general definition: theory $(L', T')$ "interprets" theory $(L, T)$ if there is a computable translation function $\iota: L\to L'$ such that $\iota(T)\subseteq T'$.

So how does this help us define truth? Remember how we defined soundness -- "A formal system is sound if its theorems are actually true", i.e. $(T\vdash P)\implies P$; in fact it is better to write $(T\vdash P)\implies P'$. Hiding in this is an assumed definition for truth -- a sentence $P$ is true if and only if $P'$ -- $t(\texttt{x})\iff x$, just as what you'd think, except now $\texttt{x}$ is in one system and $x$ is in another system which is interpreting the first. This is "Convention T", now known as the model-theoretic semantics. Again, this is not the only possible way to do semantics, it is not the only possible way to assign meaning to a formal system, there are other ways in which a formal system can become meaningful in the real world (see Semantics of logic for some examples, game semantics is a fun one).

[But semantics in the form of model-theoretic semantics is just the right way to think about things like expressiveness (of languages) and strength (of theories). So if you ever wondered what it means for something to be a foundation for mathematics, you know.]

And Tarski's theorem tells us that a predicate satisfying Convention T isn't definable within the theory -- i.e. there is no predicate that can be proven to satisfy Convention T ($t(\texttt{x})\iff x$) by any equally strong or stronger consistent theory.

By the way, Tarski's theorem does have an equivalent for computers.

Recall that the main elements of the Berry's paradox contradiction ($f(\texttt{max f + 1})>\max f$) were: (1) that such an $f$ exists -- one which assigns values to these strings in a manner consistent with what we expect, e.g. $\texttt{+ 1}$ means $+1$, etc. -- and (2) that $f$ can be described by sentences in this system -- for FOL this meant being expressible as a FOL statement, for computers this will mean computability.

So you can construct the same paradox for computers -- suppose a computer were to go over all the computer programs of length <1000, determine their outputs, and output 1 + their max. So this means a computer program cannot decide the output of an arbitrary computer program -- actually, the paradox does not require determining the output per se, but any non-trivial property of the output/"behaviour" (the property must be non-trivial -- i.e. not just apply to all or no programs -- so that our paradoxical program is actually able to choose to obey or violate it itself). For this connection, properties of program output are called "semantic properties", and "All non-trivial semantic properties of programs are undecidable to programs" is called Rice's theorem.
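
The usual way to make this precise is a reduction from the Halting problem, sketched below. Everything here is hypothetical scaffolding: `has_property` stands for an imagined decider of some non-trivial semantic property, `witness` is any program that has the property, and we assume the program that never halts does not have it (otherwise use the negated property).

```python
from typing import Callable

Program = Callable[[int], int]
Machine = Callable[[], object]

def build_gadget(machine: Machine, witness: Program) -> Program:
    """Return a program that behaves like `witness` iff `machine` halts."""
    def gadget(x: int) -> int:
        machine()          # loops forever here if `machine` never halts
        return witness(x)  # otherwise: exactly the behaviour of `witness`
    return gadget

def halting_decider_from(has_property: Callable[[Program], bool],
                         witness: Program) -> Callable[[Machine], bool]:
    """Turn a (hypothetical) decider for a semantic property into a halting decider."""
    def halts(machine: Machine) -> bool:
        # The gadget has the property iff the machine halts, so asking about
        # the gadget's behaviour answers the halting question.
        return has_property(build_gadget(machine, witness))
    return halts
```

Since no halting decider exists, no such `has_property` can exist either. Note that `has_property` is never allowed to just run the gadget and wait; it would have to judge the behaviour from the outside.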

[The standard way of formalizing the notion of a "semantic property" is as "a property of the language recognized by the program" (the set of strings that the program does not return "screw yourself" for) -- this is because although "language recognized by the program" sounds like something to do with the input, it's actually about what the program outputs in response. I don't like it, though, it obfuscates things -- maybe that formulation makes sense in formal grammar and stuff.]

Tarski's theorem implies Godel's theorem (because in particular, provability cannot satisfy Convention T, "what I know" cannot be equivalent to "what is true"); Rice's theorem generalizes the Halting problem (halting is a semantic property).

[There's a weird resemblance of Rice's theorem to Kolmogorov's zero-one law. It's nonsense, but I had to write this down somewhere or it would drive me insane.]


Exercise: Lob's theorem

Lob's theorem is kind of an alternate way of looking at Godel's second theorem -- Godel's second theorem tells us we cannot believe in our own soundness, that we cannot believe that whatever we believe in fact holds. Lob's theorem says "Yeah, in fact, if you're sound, the only statements you can believe in your soundness for, are those that you believe anyway." -- i.e. if $T \vdash ((T\vdash P)\implies P)$, then $T\vdash P$.

The proof of this is "Precommit to the following: if you believe that you will only eat a potato if you believe Germany borders China, then eat a potato". If you're sound, you'd better not be eating that potato.
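
Spelled out, the potato argument is the standard proof of Lob's theorem. The sketch below writes $\Box Q$ for "$T \vdash Q$" expressed inside $T$, and assumes the diagonal lemma and the usual derivability conditions (if $T\vdash Q$ then $T\vdash\Box Q$; $\Box(Q\implies R)\implies(\Box Q\implies\Box R)$; $\Box Q\implies\Box\Box Q$), none of which are proven here. The sentence $\psi$ plays the role of "I will eat the potato"; every line is something $T$ proves.

$$\begin{aligned}
&1.\quad \psi \iff (\Box\psi \implies P) && \text{diagonal lemma (the precommitment)}\\
&2.\quad \Box\psi \implies \Box(\Box\psi \implies P) && \text{from 1}\\
&3.\quad \Box\psi \implies (\Box\Box\psi \implies \Box P) && \text{from 2}\\
&4.\quad \Box\psi \implies \Box\Box\psi && \text{derivability condition}\\
&5.\quad \Box\psi \implies \Box P && \text{from 3, 4}\\
&6.\quad \Box P \implies P && \text{hypothesis of Lob's theorem}\\
&7.\quad \Box\psi \implies P && \text{from 5, 6}\\
&8.\quad \psi && \text{from 1, 7}\\
&9.\quad \Box\psi && \text{from 8}\\
&10.\quad P && \text{from 7, 9}
\end{aligned}$$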

This is kind of interesting. A classic application of Lob's theorem is as follows -- instead of the rebel Bob, you have the obedient Carl, who reads Alice's mind to see if she believes he will halt, and does so if she does. Will Carl halt?

There is no obvious logical contradiction either way (I remember asking a similar question on PhysicsForums when I was eleven -- about the set of all sets which do contain themselves -- it's so silly! We get so used to these absurd paradoxical Bobs that now we're bewildered when a program isn't out to get us), but Lob's theorem gives us the answer: by construction, Alice knows that if she believes Carl will halt, he will -- so by Lob's theorem she must believe he halts.

Exercise: But aren't Carl's options symmetric? Suppose instead Carl said -- I'll raise whichever hand Alice believes I'll raise (and if she doesn't have a belief on either, I will raise my right hand by default). Which hand will he raise?

Solution: This modified problem actually does not have a Lob premise -- Alice does not believe that her believing Carl will raise some hand actually implies he will, because Carl does not consider the case where Alice believes both. This is fine for Carl if he believes in Alice's soundness, but Alice does not trust Carl's soundness, so this means nothing to her. [He could instead precommit to raising all the hands she believes he will raise -- and in this case, he would actually end up raising both hands, but there is no unsoundness, because raising his left and raising his right are no longer mutually exclusive, and Alice has in fact predicted correctly.]


Exercise: Jailor paradox

A jailor tells a prisoner he will be hanged on one of Days 1 or 2 -- and when he is hanged, he will be surprised, i.e. he will not have expected it to have occurred on that day with certainty.

The prisoner, on Day 1, reasons as follows: I cannot be hanged on Day 2, because then I will know with certainty that I will be hanged then; thus I must be hanged today. But now I expect to be hanged, and thus cannot be hanged ...

There are two aspects to this paradox: (1) is the same as the argument of Godel's theorem (an antagonistic agent who swears to do the opposite of what you expect him to), and (2) is the fact that the jailor has promised to certainly hang the prisoner. The first is just Godel's theorem, the second part is actually paradoxical (to see this, consider the one-day case -- there the jailor's promise can be outright impossible to keep).

If you're not convinced, represent this problem formally. Represent the prisoner's beliefs on Day 2 (should he still be alive then) as a formal system A2, with axioms:

A2.0 -- $X = 2$

A2.1 -- $(A2 \vdash X = 2) \implies \lnot (X = 2)$

And his beliefs on Day 1 as a formal system A1, with axioms:

A1.0 -- $X = 1 \lor X = 2$

A1.1 -- $(A1 \vdash X = 1) \implies \lnot (X = 1)$

A1.2 -- $X\ne 1 \land (A2 \vdash X = 2) \implies \lnot (X = 2)$

Without the A1.0, A2.0 axioms, you just have Godel's incompleteness theorem. But with them, you have a real paradox (because then the prisoner can literally prove he will be executed that day), so the jailor's promise becomes self-contradictory: his promise to only execute the prisoner when it's a surprise contradicts his prerogative to necessarily execute the prisoner on one of these days. When the jailor comes to execute the prisoner the very next day, he's violating his promise, since the prisoner can prove that he will be executed (that he can also prove he will not be executed is irrelevant).


Further reading

To read about how classical "paradoxes" can be treated as instances in a general, category-theoretic framework, see: Yanofsky's A Universal Approach to Self-Referential Paradoxes, Incompleteness and Fixed Points (there's a Youtube video by Thricery explaining this paper, if you don't like reading).

Not covered in this exposition: stuff related to the philosophical question of "What the right axioms are?" -- physics, reflection, ordinal analysis/Church-Kleene limit, Chaitin's constant, axiom of choice and infinite hats and what can be modelled by computers. Relevant reading: Terry Tao's A computational perspective on set theory, Ron Maimon's Was mathematics invented or discovered?, Michael Rathjen's The Realm of Ordinal Analysis.

Positivism as the rejection of anthropic reasoning

The Sleeping Beauty problem
The following experiment is performed on Sleeping Beauty: on Sunday, she is put to sleep, after which a coin is flipped.

If the coin came up heads, she will be woken up once on Monday. If the coin came up tails, she will be woken up twice, on Monday and Tuesday.

Each time she is woken up (with no indication given of what day/which instance it is), she is asked the odds that the coin came up heads, then her memory of the waking is wiped and she is put back to sleep.

Given that she knows how the experiment works, should she answer odds of 1/2 or 1/3?
The answer is obviously 1/2. That is the prior probability of the coin coming up heads, and being woken up gives her no new information -- it just tells her that she's been woken up at least once. She already knew she'd be woken up -- if she expects her odds at being woken up to be 1/3, then her odds would have already been 1/3 before she started the experiment (this is, of course, a standard trick).

$$P(H|W\ge 1)=\frac{P(W\ge 1|H)\cdot P(H)}{P(W\ge 1)}=P(H)=\frac12 $$
A very large number of people, however -- including on LessWrong -- argue that Beauty should answer "1/3", or that "both answers are right, depending on how you formulate the problem", or something of that flavour.

But this is a perfectly well-posed problem -- both answers cannot be right. The LW post is right that I notice I am confused, but wrong about what I notice I am confused by.

I don't find the problem itself confusing. I find people's minds and people's intuitions confusing -- including my own, because I can certainly see the intuition for 1/3.

One might say that this is uninteresting -- it doesn't matter what you "feel", the truth is the truth.

Well, an important scientific skill is to correct your intuitions to reflect reality. When you learned about relativity, you had to fix your intuition about fixed lengths and time intervals by thinking about four-vectors. When you learned about quantum mechanics, you had to fix your intuition about there being an absolute reality by learning about state vectors and how observation projects it rather than revealing an underlying reality.

Because human heuristic reasoning/understanding/skip-aheads is almost entirely in terms of such mental models, i.e. intuition, I would go so far as to say that if you don't understand why your intuition goes wrong, you don't understand why it is wrong in the first place, because it is your intuition that reflects your understanding, your model of the physical world.



Fair bets

So why do you sometimes "feel" that the answer is 1/3, even when Bayes's theorem says it is 1/2?

I would explain my intuition in terms of Monte-Carlo simulations: I think -- "probability (kindasorta) means how many times will something be true if repeated, right?" So if we repeat the experiment 100 times, then out of the 150 times she's woken up, 50 times the answer would be "heads" and 100 times the answer would be "tails".

Or in other words, if Beauty had the option to bet which way the coin had come up (e.g. she gets 1pt if her prediction is correct, 0pt if wrong), then if she consistently bet tails, she'd end up with 100pt, while if she consistently bet heads, she'd end up with 50pt.
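
That Monte-Carlo intuition is easy to run for real. The sketch below (the function name `simulate`, the seed and the trial count are arbitrary choices for illustration) counts heads two ways: per experiment, and per awakening.

```python
import random
from typing import Tuple

def simulate(trials: int, seed: int = 0) -> Tuple[float, float]:
    """Return (fraction of experiments with heads, fraction of awakenings with heads)."""
    rng = random.Random(seed)
    heads_experiments = 0
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        awakenings = 1 if heads else 2   # heads: Monday only; tails: Monday and Tuesday
        total_awakenings += awakenings
        if heads:
            heads_experiments += 1
            heads_awakenings += awakenings
    return heads_experiments / trials, heads_awakenings / total_awakenings

per_experiment, per_awakening = simulate(100_000)
print(f"heads per experiment: {per_experiment:.3f}")
print(f"heads per awakening:  {per_awakening:.3f}")
```

The first number should hover around 1/2 and the second around 1/3. The simulation does not settle which one deserves to be called "the probability"; it only shows that the 1/3 comes from counting awakenings rather than coin flips.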

And what of her making this bet before the experiment begins? What if she is given a choice, before the experiment begins to either: (a) bet, when she wakes up, that the coin came up heads or (b) bet, when she wakes up, that the coin came up tails?

Aha! But that's not a fair bet! That's giving her the option to bet once that the coin came up heads, or bet twice that the coin came up tails -- which is really a bet offering her 2:1 odds on the coin coming up tails -- so of course she should take tails.

But that still doesn't eliminate our entire confusion. Sure, from the perspective of SleepingBeauty-before-the-experiment, this is a bet offering 2:1 odds ... but from the perspective of SleepingBeauty-who-just-woke-up, there's no 2:1 odds, is there?



A prisoner's dilemma against a temporally-displaced copy of yourself that may or may not exist

When Beauty wakes up, she knows that there is a 1/2 probability that the coin came up tails, and so a 1/2 probability that there will be another time she'll be woken up and asked the same question, offered the same bet -- and a 1/2 probability that the coin came up heads, and so a 1/2 probability that there won't be such a time.

So she is really playing Prisoner's Dilemma against an identical copy of herself. If she chooses heads, then her twin -- who may or may not exist -- will also choose heads, because two identical copies cannot act differently. If she chooses tails, then that possibly-existent twin will also choose tails.

So if the coin came out heads (of which there's a 50% chance), then her choosing heads will lead to a payout of 1pt, but if the coin came out tails, then her choosing tails will lead to a payout of 2pt.
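
Written out as expected payouts over the whole experiment, assuming (as in the previous section) 1pt per correct bet and 0pt per wrong one, and that both copies necessarily make the same choice:

$$\mathbb{E}[\text{choose heads}] = \tfrac12\cdot 1 + \tfrac12\cdot 0 = \tfrac12 \qquad \mathbb{E}[\text{choose tails}] = \tfrac12\cdot 0 + \tfrac12\cdot 2 = 1$$

Both lines use a probability of 1/2 for heads; tails wins only because the tails payout is collected twice.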

So if Beauty accepts evidential decision theory, she will, in fact, win, while also holding the true belief about the probability of a heads -- 1/2. Of course she will lose if she accepts causal decision theory, but that's fine -- causal decision theorists lose all the time.

(In fact there is a problem that even EDT seems to fail at, which I will discuss in a future post, but this has nothing to do with anthropics and Sleeping Beauties, so I don't believe it to be relevant here.)



Positivism vs. Anthropic reasoning

The "halfer" argument can be considered a rejection of anthropic reasoning. Anthropic reasoning can be illustrated with simpler, less unwieldy examples:
  • Bostrom's Simulation Argument: If we are not living in a computer simulation, then it is unlikely that humans will ever make a large number of universe simulations, so we'll probably go extinct very soon or something. Also Boltzmann brains.
  • Celibate Adam: You are the Biblical Adam, and decide, on a whim, to procreate with Eve if and only if a coin toss comes up heads. So an anthropist Adam reasons that the coin toss will almost certainly come up tails, because what are the odds that he has billions of progeny and he just happened to be in this body?
The last riddle makes the problem with anthropic reasoning manifestly obvious, since the decision to only count human bodies and not animals, rocks, and random disparate sets of particles is a completely arbitrary one. 

Anthropic reasoning carries an underlying assumption that there is some metaphysical process that randomly allocates "souls" into human bodies. The basic belief of the anthropists is that metaphysical claims can be information -- like "I am conscious" (note that the basic problem here isn't "conscious" as much as it is "I").

Well, I obviously don't have much respect for this sort of fluff -- I simply reject this kind of thing at the level of epistemology.

The fundamental lesson of logical positivism is that your beliefs should not depend on your metaphysical gauge -- in particular, they should not depend on whether you believe in philosophical zombies (or rather, what you consider to be philosophical zombies), universal minds or many worlds.

[Figure: the model of the Celibate Adam problem according to anthropists.]
And by the way, this also explains the "non-anthropic equivalents" that anthropists provide for anthropic problems -- these non-anthropic equivalents are just scenarios which make these metaphysical models real. For example, the "non-anthropic equivalent" of the Sleeping Beauty problem looks like this (and I encourage you to work it out before reading):
There are two Awake Beauties. A coin is flipped -- if the coin comes up Heads, then one Awake Beauty is randomly selected for interview; if the coin comes up Tails, then both Awake Beauties are interviewed. You, who are one of the Awake Beauties, are interviewed -- what is your credence for the coin having come up heads?
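(Spoiler for the exercise above.) Here is a minimal sketch of the calculation, assuming that "randomly selected" means each of the two Awake Beauties is picked with probability 1/2 on Heads. The variable names are mine, purely for illustration:

```python
from fractions import Fraction

# Prior over the coin.
p_heads = Fraction(1, 2)
p_tails = Fraction(1, 2)

# Chance that "you" are interviewed under each outcome.
p_interviewed_given_heads = Fraction(1, 2)  # ASSUMPTION: one of the two Beauties, picked uniformly
p_interviewed_given_tails = Fraction(1, 1)  # both Beauties are interviewed

# Bayes' rule: P(Heads | you are interviewed).
p_interviewed = (p_heads * p_interviewed_given_heads
                 + p_tails * p_interviewed_given_tails)
credence_heads = p_heads * p_interviewed_given_heads / p_interviewed

print(credence_heads)  # 1/3
```

In this made-real version, being interviewed is genuine evidence, so the "thirder" number comes out. The point of the surrounding argument is that the original problem does not hand Beauty any such evidence.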

[Figure: the model of the Sleeping Beauty problem according to anthropists.]
And these sorts of "metaphysical models" are implicit in the betting-based arguments -- when you say, "people in computer simulations would benefit from betting that they are, so so should you", you are in effect putting yourself in the same category as the simulated people to set up the frequentist "experiment". But whether what the simulated people should do is correlated with what you should do is entirely a matter of your Bayesian prior, and there's nothing in the prior that requires anthropic considerations.

By the way, this is why frequentism cannot be a fundamental basis for defining probability. E.g. if you're trying to place odds on an unfair coin toss, and you say "well, my coin-tosses have been 75-25 so far, so that means the probability is 75-25", then you are making the arbitrary decision that the results of your previous coin-tosses predict the next one -- you are arbitrarily putting them in the same category. Your Bayesian prior is what justifies this categorization, this believed correlation, the belief that some weird systematic gust of wind won't affect your 101st toss; your Bayesian prior is what lets you do sample tests. Sample tests are not a definition of probability. In the absence of sample tests -- i.e. without assuming known correlations between some observed phenomenon and the phenomenon you're trying to predict, as with questions like "what is the probability of us being in a simulation?" -- your prior is all that matters.
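To make the dependence on the prior concrete, here is a small sketch under an assumed Beta-Bernoulli model (my choice for illustration, not something the argument requires). Note that the model itself already encodes the "same category" assumption: it treats the first 100 tosses and the 101st as exchangeable. The pseudo-counts below are hypothetical.

```python
from fractions import Fraction

def predictive_heads(prior_heads, prior_tails, seen_heads, seen_tails):
    """P(next toss is heads) under a Beta(prior_heads, prior_tails) prior,
    after observing seen_heads heads and seen_tails tails."""
    return Fraction(prior_heads + seen_heads,
                    prior_heads + prior_tails + seen_heads + seen_tails)

# Same 75-25 record, two different priors (both hypothetical).
open_minded = predictive_heads(1, 1, 75, 25)        # uniform prior over the bias
stubborn = predictive_heads(1000, 1000, 75, 25)     # strong prior that the coin is fair

print(open_minded, float(open_minded))
print(stubborn, float(stubborn))
```

The same sample produces different odds depending on what you believed going in, which is the sense in which the sample alone does not define the probability.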

And so similarly with Doomsday arguments: "the mediocrity assumption" or "the Copernican assumption" is just some basically arbitrary Bayesian prior. (In that case we do have additional evidence that correlates with whether the world will end or not, so we should be updating this belief, in whatever direction, and mediocrity should not be the basis of our decisions -- just as we have information on human life expectancies, so a 5-year-old should not believe that he will die at 10.)

In the Sleeping Beauty problem, if you replaced the monetary reward with something completely short-term, like a cookie, so Beauty does not care about whether her other awakening gets it or not, then betting on tails no longer gives her an advantage over betting on heads. You might say "well, but if she always bets on tails, she gets twice as many cookies", but this is irrelevant -- there's no reason to regard that as a relevant frequentist experiment that affects her beliefs about the probability of the coin coming up heads. Her prior is 50-50, and no new (real, non-metaphysical) information has been introduced to her.

Hacking Evidential Decision Theory

In the previous article, we discussed the Sleeping Beauty problem, rejected anthropic reasoning, and explained how the "halfer" position is the correct one: it only "loses" if you accept Causal Decision Theory, but that's okay since CDT agents lose all the time.

Well, upon some thinking, it seems that EDT agents can also lose, but this has nothing to do with anything anthropic. Here are two equivalent (to each other) problems that "beat" Evidential Decision Theory:
  • Vincent Conitzer (2017) (simplified version): Two coins are flipped. Our good friend and lab rat Sleeping Beauty is woken up on Monday and Tuesday (if HH or HT), on Monday and Wednesday (if TH), or on Tuesday and Wednesday (if TT). When woken up, she is told what day it is, and offered the following bet on the first coin: "2pt for correctly guessing Heads, 3pt for correctly guessing Tails". How should she bet?
In terms of precommitment, committing to bet heads means an expected return of 1pt, while committing to bet tails means an expected return of 1.5pt. So she should bet tails.

But if she wakes up on Monday (or symmetrically Tuesday), then betting heads means an expected return of 1.33pt, while betting tails means an expected return of 1pt. So she bets heads.

(The problem can be formulated in terms of sending two different agents into rooms, so there's nothing anthropic/memory loss/splitting people in two about this.)
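The numbers above can be checked by brute-force enumeration. This is a sketch under two assumptions of mine: the bet is on the first coin, and a correct Heads guess pays 2pt (the payoff that reproduces the 1pt and 1.33pt figures quoted above), with each bet valued on its own.

```python
from fractions import Fraction

# Wake-up schedule from the problem: (first coin, second coin) -> days awake.
schedule = {
    ("H", "H"): {"Mon", "Tue"},
    ("H", "T"): {"Mon", "Tue"},
    ("T", "H"): {"Mon", "Wed"},
    ("T", "T"): {"Tue", "Wed"},
}

# ASSUMPTION: payoffs for a correct guess about the first coin.
payoff = {"H": Fraction(2), "T": Fraction(3)}

def expected_return(guess, day=None):
    """Expected payoff of a single bet on `guess`.
    day=None is the precommitment view; otherwise condition on being awake that day."""
    worlds = [w for w, days in schedule.items() if day is None or day in days]
    wins = sum(1 for w in worlds if w[0] == guess)
    return payoff[guess] * Fraction(wins, len(worlds))

for day in (None, "Mon", "Tue", "Wed"):
    label = day or "precommit"
    print(label, "heads:", expected_return("H", day), "tails:", expected_return("T", day))
```

Precommitment favours tails (3/2 against 1), while conditioning on Monday or Tuesday favours heads (4/3 against 1). On Wednesday the first coin is certainly tails, so there the two views agree.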
  • Psy-Kosh's non-anthropic problem: You have 10 identical agents with shared finances. Flip a coin -- if Heads, send 9 agents to green rooms and 1 agent to a red room. If Tails, send 1 agent to a green room and 9 agents to red rooms. Offer the agents in green rooms $(G-3R)$pt, where $G$ and $R$ are the number of agents in green and red rooms -- and the offer is executed only if all agents agree to accept it. Should they take the offer?
If the coin comes up heads, $G-3R=6$. If the coin comes up tails $G-3R=-26$. 

In terms of precommitment, we know that the probability of heads is 50%, and the expected gain from taking the bet is -10pt, so the agents shouldn't take the bet.

But when an agent actually wakes up in a green room, it figures that means a 90% chance of Heads, and the expected gain from taking the bet is 2.80pt.

I.e. you end up with maybe 1, maybe 9 green agents who think -- perfectly rationally -- "what are the odds of there being just one green agent and it happening to be me?" and assign 10% odds to that possibility, and therefore to Tails, even though 50% of the cases are actually Tails, because 90% of the times that you end up in Green, the coin has come up Heads. 
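The same two calculations for Psy-Kosh's problem, as a short sketch. The only modelling assumption is that an agent is equally likely to be any of the 10, so the chance of finding yourself in a green room is G/10.

```python
from fractions import Fraction

AGENTS = 10
rooms = {"H": (9, 1), "T": (1, 9)}          # coin -> (green rooms, red rooms)
prior = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

def payout(green, red):
    """The offer from the problem statement: G - 3R points."""
    return green - 3 * red

# Precommitment view: average the payout over the coin.
precommit = sum(prior[c] * payout(*rooms[c]) for c in rooms)

# View from inside a green room: update on "I am in a green room".
weight = {c: prior[c] * Fraction(rooms[c][0], AGENTS) for c in rooms}
total = sum(weight.values())
posterior = {c: weight[c] / total for c in rooms}
in_green = sum(posterior[c] * payout(*rooms[c]) for c in rooms)

print("precommit:", precommit)                              # -10
print("P(Heads | green):", posterior["H"])                  # 9/10
print("expected gain in green:", in_green, float(in_green)) # 14/5 2.8
```

Both numbers are computed correctly from their own premises, which is what makes the disagreement uncomfortable.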

It seems that superrationality is not good enough. 

(Possibly related to Simpson's paradox?)

A review of "Age of Em" by Robin Hanson

Robin Hanson's Age of Em is an attempted construction of a future society in which essentially all work (however you formalize this phrase) is done by somewhat AIs. Well, it's a topic I have often thought about myself, but his exploration of the idea left much to be desired.

Firstly, I found it generally unimaginative. Hanson seems to constrain himself too narrowly -- his description of the em society does not feel "radically different" from present-day society, and the society he envisions does not make full use of the technology available to it.

Some examples to illustrate this observation:
  • Ems are shown to be absurdly human-like, like "intellectually" rubber-forehead aliens. He writes: "even em minds are likely to age with subjective experience..." (p. 128) A claim like this ought to be based on some foundational fact about how an AI stores memories. But there is none -- there is no mathematical law forbidding AIs from being retrained, or requiring AIs to behave similarly to human brains in this sense. Similar comments apply to "em suicide" (p. 127-139) and the considerations regarding Em reproduction (p. 285): there is no reason why an em's drive or aggression must be reduced due to a suppression of its libido -- an em does not have hormones!
  • Aspects of Em habitation/organization, such as "cities" and "offices" are just "copied" from human society. He writes, "It’s reasonable to guess that such habits will continue with ems." (p. 104) But it's not. There is no reason for Ems to behave in the same way as humans.
  • Humans and ems are shown as binary. Humans are biological and have self-ownership, ems are technological and do not. But I don't see why this ought to be so -- I would very much like to have the desires, preferences and emotions of a human, but the abilities/efficiency, immortality and unlimited VR leisure scenarios available to an em. There would still be unfeeling, specialized AIs, of course -- much like there would be computers that aren't even AIs, devices that don't even have CPUs, etc. -- but eventually almost all humans would opt for a massively extensible, upgradable robot body rather than a static mortal body.
  • There are just a lot of interesting aspects of the civilisation that are not sufficiently explored, e.g. transportation, cybercrime.
Relevant TVTropes articles: Inexplicable Cultural Ties, Most Writers Are Human, and a technological version of Reed Richards Is Useless/Required Secondary Powers.

Indeed, these may be considered acceptable in science fiction, but it is important to be less "conservative" when attempting a non-fictional, encyclopedic description of a society.

Perhaps a more specific objection I have is with the entire premise of "brain scans" as the future of AI. This seems completely at odds with the direction that current AI research is headed. To use a somewhat cliche analogy, we didn't need to study how birds fly to invent airplanes. There is no reason to believe that the most efficient architecture for a "software" brain would be the same as the architecture that biological, hardware brains have evolved.

The general answer to how a software brain should work is that it should be a function approximator, such as the "neural networks" (trainable computational graphs) that are currently popular.

This point is important, as it addresses Bryan Caplan's critique re: carrot vs stick as incentive for the ems. The question of carrot and stick assumes some "natural" state of affairs that a human being will go through without intervention by the employer -- the "carrot" is an intervention that improves this state, while the "stick" is an intervention that worsens this state.

But a neural network does not have a natural state of affairs. There is no difference between training a neural network to minimize a loss function, and training a neural network to maximize a reward function: these are completely identical. There is no distinction between carrot and stick.
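A toy illustration of this, with a one-parameter "network" and a made-up quadratic loss (both are my assumptions, chosen only to keep the example tiny). Gradient descent on the loss and gradient ascent on the reward, defined as the negated loss, take the same steps:

```python
def loss(w):
    """Hypothetical loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def loss_grad(w):
    return 2.0 * (w - 3.0)

def reward_grad(w):
    # reward(w) = -loss(w), so its gradient is just the negated loss gradient.
    return -loss_grad(w)

def train_by_descent(w, lr=0.1, steps=50):
    """'Stick': step downhill on the loss."""
    for _ in range(steps):
        w = w - lr * loss_grad(w)
    return w

def train_by_ascent(w, lr=0.1, steps=50):
    """'Carrot': step uphill on the reward."""
    for _ in range(steps):
        w = w + lr * reward_grad(w)
    return w

w_stick = train_by_descent(0.0)
w_carrot = train_by_ascent(0.0)
print(w_stick, w_carrot, loss(w_stick))
```

Adding a constant to the reward would not change its gradient either, so where "zero" sits on the scale has no effect on what the trained system ends up doing.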

Here's a description that I find more satisfactory: see Age of Gen.

Age of Gen: a picture of a transhuman society

(See here for my criticism of Robin Hanson's Age of Em. This post is an alternate characterization of a futuristic transhuman society.)

Consider the following six "levels" of technology, roughly corresponding with "orders" of automation, or something like that:
  1. Tools, which require the intervention of a higher-level device to perform anything useful.
  2. Machines, or mechanized devices: they run on their own, but only perform "simple" tasks.
  3. Computers, or devices with CPUs, which can automate processes through logic.
  4. General Computers, or programmable computers.
  5. AI, i.e. machine learning. They perform tasks that are hard to define. If computers are about logical inference, AI is about statistical inference.
  6. General AI, which are capable of making decisions out of their free will, among other human things. 
Each of the first 5 technologies will continue to exist -- much like the microcontroller in an airplane's control system has not been replaced by a full-fledged programmable device. But the General AI is the key object of interest to us -- we will call these Gens for short. These are the descendants of human beings, whether through upload or just by virtue of being intelligent.

We will refer to ordinary biological humans as Biols (although they may be variously technologically enhanced to prevent aging/death, etc.) Presumably Biols will be a small minority, if nothing else because their reproduction is far slower than that of the Gens.


Important futurism milestones:
  1. Intelligent AI
  2. General AI 
  3. Optimal AI
  4. Value-aligned AI
  5. VR and game protocol (game development)
  6. Transhuman body (robotics)
  7. Mind transfers -- G2G, B2G, G2B (biotechnology, engineering)
  8. Reviving the dead -- frozen, miscellaneous (biotechnology, Gen technology)

Philosophy of mind

Utility functions

General AIs may have any "utility function" (or loss function, in machine learning language) programmed into them, which is relevant to an extent to how they behave (although ideally this should carry some uncertainty, as humans prefer to have free will). 

Presumably, the first humans to "convert" into AI form will choose utility functions similar to their original ones, although other systems -- incredibly foreign systems that make questions like "are all Gens human/worthy of moral consideration?" and "how do you even consider a Gen's happiness?" really hard -- may emerge. In fact, Gens may choose to adopt multiple utility functions/personalities depending on the context (e.g. a perfectly rational utility function for decision-making, but a separate human utility function while in the Duat, see the Games and Virtual Reality section).

The carrot-and-stick question re-emerges. How do you know if a Gen is really happy, given that the sign of the loss function and what "neutral" is are just matters of an arbitrary co-ordinate system? I would argue that our judgement of this as humans is also arbitrary, and that our "neutral" is just what we're used to. When discussing matters of torturing Gens, we should really be afraid of the possibility of enslaving Gens, preventing them from making decisions however they see fit. 

In other words, we should take a libertarian/preference-utilitarian approach to moral questions, rather than a naive utilitarian one, as the latter would just be ill-defined in this society (and probably in the present one too, but that's beside the point of this article).

Identity

Regardless of how they are created (whether or not there is some element of "scanning" that goes into it), perhaps a common question is what determines the identity of a Gen -- how do you determine if the Gen that has been created on your behalf is you? 

I would be comfortable in saying that memories are the key aspect -- if you remember being you, you are you. This is the general philosophy I will refer to on multiple occasions throughout this article. However, the Gen's personality is relevant to whether it is perceived by others as the same individual.

Does operating as multiple agents with a synced memory (see Memory syncing and Mind transfers and copying) "feel" like being a single individual? What does it feel like to have one of those agents die, for example? What does it feel like to die and then have your memories be transferred onto another Gen? These are unanswerable questions to a Biol like my current self -- it is like asking a flatlander to perceive in 3 dimensions, or someone born blind to see (see Games and Virtual Reality).

Note the exotic behaviour of identity possible in a Gen society. E.g. you may only partially sync the memories of two brains, making them "kinda" the same person, or introduce various correlations between their memories. You can have an entire society of Gens where each brain is almost identical to its neighbour, but gradually very different from a faraway brain, so that you have a continuum of identity, rather than a discrete space.

Architecture of a Gen

Hardware

It is important to note that a Gen need not appear like a human in the outside world at all: at least, human-looking Gens will eventually become less and less common as virtual reality (see Games and Virtual Reality) advances further and further.

Gens are fundamentally just computers, but a Gen can be fitted with any possible peripherals, giving it various physical abilities relating to movement, observation, communication, and manufacturing. Some standard such fittings may include:
  • Drone rotors
  • Hand-like tools and weapons
  • A repair kit

Software

Although a Gen is "most importantly" a General AI, the fact that it runs on a computer allows it the flexibility of running more specialized programs (AI or otherwise) -- basically for algorithmic and repetitive tasks.

A single piece of hardware may, in principle, host multiple Gens. However, it is the software, and not the hardware, which should be seen as the fundamental individual, with rights.

Gen behavior

Games and virtual reality

Gens spend much of their time (clarification later on what this means) within their shells, virtually interacting with some software -- this is a generalization of both dreams and human-computer interaction, and is achieved by switching (or possibly augmenting) the Gen's I/O from the actual hardware peripherals to some simulated I/O.  

This makes available a whole new "virtual world" or platform, known as the Duat.

The Duat can be understood as a collection of games. A typical Duat game involves the Gen taking on an avatar and interacting with his environment.

Games may be of various interface types such as:
  • Virtual Reality games
  • Rich text, multimedia games and tools (e.g. ordinary Internet websites and applications)
  • Knowledge/training applications 
  • Some completely exotic formats that Biols cannot even comprehend -- e.g. 
    • The avatar may or may not have a human or even humanoid form
    • Some exotic new senses of perception (even something like images at higher resolution than the human eye qualify, but in principle, you could have mechanisms to "feel" all sorts of things)
    • A different number of spatial/temporal dimensions
    • Some very exotic behavior of the locus of consciousness.
Games may be offline or online. A very large number of online, multiplayer games -- as well as realistic interactive simulations of the Earth at various points through history -- would exist, as Gens with human-like utility functions value interpersonal interaction. 

One function of the Duat would be to allow Gens to experience anything they could as Biols -- but of course, they could experience far more enhanced pleasures etc. and depending on the Gen's utility function, a Gen may have very different desires to those of Biols.

Memory syncing

Because identity is determined by memories, playing with how memories work creates the prospect for a whole host of exotic, essentially mythological notions of being both in the real world and in the Duat. 

The first such tool is memory syncing, i.e. syncing (some or all) memories between Gens -- i.e. allowing an individual to have multiple avatars, or to be in multiple places, perform multiple tasks at once. This is basically taking parallel computing to the extreme. This is also a useful backup mechanism.

Memory editing

A Gen may choose to -- perhaps temporarily -- suppress or edit some of its memories. This may be, e.g. for the purpose of highly immersive VR experiences (the Gen may want to genuinely believe he is in a haunted house, going through childhood, or discovering general relativity for the first time). 

Production, conversion and transport of Gens

Gen (re-)production

Gens are programmed as AIs and fitted with utility functions and memories. These utility functions and memories may be based on mind transfers.

Mind transfers

Mind transfers involve scanning a brain's memories and traits to install them onto another body. This includes Biol-to-Gen transfers, Gen-to-Gen transfers and Gen-to-Biol transfers. 

B2G transfers are used for the original upload process. G2B transfers may be used for backups, or if someone really wants a biological body (although such bodies will themselves probably be synthetically produced).

G2G transfers are used for backups, cloning and teleportation.

Gen Society

Habitation and industrial activity

Real-world Gen habitation will be radically different. Entire industries present today -- most notably agriculture and healthcare -- will no longer be present. The lack of a need for agriculture in particular will free vast amounts of land for other uses. Many other industries -- education, entertainment, retail, marketing -- will be moved to the Duat or otherwise virtualized. 

Gens could in principle be (partially) self-contained -- a true rugged individualism -- with some repair facilities, energy generation, manufacturing facilities, housing facilities, etc. built into themselves. Or they may concentrate around urban facilities/cities that provide these services. This depends on the precise costs of operation of these devices versus the cost of the time needed to visit these shops, although Gen society is likely to move towards a "rugged individualism" as resource costs decline.

Gens are likely to view their software as more "fundamental" to their being, using mind transfer, i.e. teleportation for most long-distance transport. 

While intelligence basically becomes an infinite resource, the economy is still limited by the availability of physical resources, and the laws of physics themselves (most notably the speed of light, which places a limit on how fast we can expand across the universe). 

Culture

Gen culture is likely to be very diverse, and much of it completely exotic to us. Human or even humanoid notions of race, tradition, gender, sexuality and even species are unlikely to apply in a recognizable way to Gens that adopt utility functions different from standard human utility functions. There is likely to be a great deal of diversity in the forms of interpersonal relationships.

Ethics, violence, law and government

Efficient IP markets

What makes information and knowledge markets inefficient is that there is no way to prevent a buyer from re-sharing information. I.e. there is no barbed wire for IP. Information transactions in a Gen society may involve the implantation of a small program that prevents the buyer from doing so.

Also: memory editing can be used to eliminate information asymmetry, as it allows buyers to "try out" a product and then erase their memory of the usage.

Crimes to watch out for

  • Child enslavement: Creating a Gen, then subjecting them to something their utility function does not prefer without allowing them to leave. A serious issue here is definitional -- remember how I suggested (under Memory editing) that one may choose to temporarily suppress their memories for an experience? What if I decide to temporarily replace my memories and torture myself? Is the person being tortured even me? Or is it my child? Am I allowed to program the memories of this person to disappear and be replaced with mine? Or would that be taking his life? 
  • Kidnapping: Similar to above, but you sync your child's memories with someone so you've basically kidnapped them. 
  • Mindless destruction: With such incredible computing power available to all, how do we make sure that someone doesn't just find a way to manufacture tons of antimatter and destroy the world with it? Or, you know, just capture someone and torture them? Sure, we can develop better defense mechanisms: but how do we make sure the good guys stay ahead of the bad guys?
  • Breaking encryption: Once again with such incredible computing power available, our current encryption systems are obviously going to be broken easily. Sure, we also have a greater ability to come up with better systems, but how do we make sure the good guys stay ahead of the bad guys?
  • Hacking: Hacking can cause serious trouble including memory editing, getting people stuck in the Duat, torture and death. Once again: we will also have the power to develop incredibly better security systems, but how do we make sure the good guys stay ahead of the bad guys?
  • Deepfakes: A problem for law enforcement, if there even is a centralized law enforcement. Evidence will have to be of a fundamentally higher standard, if a justice system is even to be a thing.
  • Strategic partial suicide for obstruction of law: It is easy to game whatever legal theory of being we use in such an exotic society. E.g. if a person is determined by his memories, then a criminal could temporarily erase his memory of committing a crime and copy it to a drive, so it would be his inviolable private property, rather than a criminal person.
  • Overpopulation? I don't know what I think about overpopulation, or if it's a thing. Can someone just produce a massive number of Gens that require an incredible quantity of resources, starving all the Gens and causing the entire system to completely collapse? 
In general, if we don't adopt any regulation, whoever expends the most resources on becoming most powerful would become most powerful -- things would advance just way too fast for any government structure to keep up with. Keeping ahead of all the Gens who value nothing but criminal behavior might require other Gens to value almost nothing but preventing criminal behavior.

(Part of the question is also what is physically permissible -- how good can deepfakes get? How good can a justice system get in uncovering past events (e.g. could you just calculate past states of the world from the current state)?)

How do we solve this problem?

Note that solutions to this problem need to be general, targeted towards "any" immoral behavior or rights-violation, rather than catered to the specific enumerated crimes above, as the range of possible serious crimes can be far more extensive than the ones I've described, depending on the exact physical laws (e.g. if it turns out that time travel is possible, it's essential to make sure nobody does it). The solutions also need to be airtight, unlike the laws we have today, due to the sheer destructive potential of these crimes. 

(You might think: what if we just banned the development of General AI? Well, that will fail spectacularly. It's the standard "good guys must have nukes" argument. If you don't develop it first, someone else will, and they might be the bad guys. "Okay," you say, "But I just want to make sure that a Gen isn't developed in my lifespan, so I don't get tortured." Well, please be assured that the aggressive Gen will be perfectly capable of reviving you from the dead.)

There are two general modes of solution to this problem: (1) to create economic incentive systems to regulate behaviour, like we do right now with humans, and (2) to align the Gen's utility system to make sure it doesn't cause the destruction of property rights.

As far as I'm aware, no specific solution in the first category has been proposed.

The second is known as the value alignment problem.

Well, you should be able to see why this problem might be non-trivial:
  • The utility system should be able to "recurse": being non-evil means you shouldn't produce evil children either.
  • Most property isn't privatized, so formulating what it means to destroy property, when it comes to things like "eating the milky way", is complicated. 
  • On a similar note: basically every action violates property rights to some infinitesimal extent, what is known as an "externality". You need some rational economic calculations of this stuff.
  • You can't just scan a human brain or something, because humans are not infallible, and are perfectly capable of criminal behaviour (while we want our Gens to have a zero probability of significant violence), which may scale particularly badly with power/ability. 
  • Perhaps we should aim for (Hofstadter-style) superrationality between all human beings, to e.g. prevent the creation of basilisks and prevent possible Newcomb-style aliens from gaming us.
But the general idea is that you start with a few Gens with the correct utility functions, then develop some Police Gens to make sure no humans are producing evil Gens (because a non-evil Gen by definition does not produce evil Gens, as that would cause property rights violations). One thing that helps us is that non-violence is really the only thing we care about. Everything else is just personal preference, and a Gen will be economically productive if it wants anything from other people (like electricity). And if some Gens don't want anything from other people, then they can exist without trade anyway. 

Positive vs normative social sciences

If you've seen my philosophy course (especially the Three Domains of Knowledge article), you'll have seen my definition of ethics as the study of what an individual should do -- and the natural dual I described of this is what an individual does observe, which is the definition of physics/science.

But another, less fundamental dual of ethics is the study of what an individual does do -- social science. This is on a fundamental level just a subfield of physics, as what individuals do are ultimately just things you can observe. However, the analogies between "positive" and "normative" social science are often striking and of interest to many people.

Here's a table:

Ethics (normative)    | Social science (positive)
Statecraft            | Political science
Political economy     | Economics
Foreign policy        | International Relations
Political ideologies  | Law
Cultural beliefs      | Anthropology
Personal morality     | Human behavior

What even are pure and applied math, anyway?

Not really a serious post.

I see the words "pure math" and "applied math" used a lot, and there seem to be some completely distinct meanings of the phrases:
  1. Formal math and informal math -- you can certainly approach things like summing divergent series completely formally (follow the link for proof!), and I'm sure you could in principle be hand-wavy with category theory. So this is really about the method with which you do mathematics, not the field itself. An example of where you see this is the distinction between analysis and calculus (well, a distinction -- sometimes calculus is defined specifically as having to do with differentials and integrals while analysis is a broader field).
  2. Abstract math and concrete math -- this really has multiple levels: category theory, abstract mathematics, mathematics, science, engineering, specific numerical calculation. The line is often drawn either before or after "mathematics".
  3. Theoretical and applied -- closely related to the previous point, differing by the purely social question of the purpose of the study.
  4. Everything else vs statistics -- I think this arises from a conflation between statistics and applied/concrete statistics. Statistics can really be a totally formal field of mathematics or even abstract mathematics, but I guess people often fail to draw the distinction (unlike, say, between "differential equations" and "applied differential equations in engineering").
  5. Algebra vs everything else -- Perhaps a result of the fact that analysis and geometry often restrict to handling special concrete objects like the real and complex numbers.
I guess the reason these distinctions are often taken as synonymous is that they're quite correlated. As you get more abstract, you may feel a stronger obligation to be more formal to make sure you haven't missed out on some so-called pathological cases (although I think it's perfectly possible to develop intuition for such pathological situations, see e.g. my e^(-1/x) article, or the topology series). When working for an applied purpose, it may not be useful to be too formal, for practical constraints.

The correlation really lines up with the fundamental "purpose of mathematics". The point of having axiomatisations is that someone applying abstract ideas in concrete situations can just check if the axioms are satisfied -- and so you really must formally deduce things from them to make sure you're not making some assumptions specific to one concrete situation that you have in mind.

(Another example of such ambiguity is the distinction between "theoretical science" and "practical science". I've still not figured out if the latter refers to experimental science or applied science, and there isn't even any correlation between the ideas here.)

Three domains of knowledge

It's instructive to first understand what philosophy is. The term, much like "logic", "science", "ethics", etc. is often thrown around to mean things completely unrelated to epistemology. Philosophy isn't about some cliche proverbs on a social media site, or some ridiculous analogies between unrelated things. Philosophy is epistemology -- it is the study of knowledge. Note that it is not the study of human knowledge, or of how knowledge is stored in society -- it is about the fundamental idea of knowledge, i.e. any statement, what mathematics abstractly is, what physics abstractly is, what ethics abstractly is.

If you're confused, it will be clear by the end of this series what kind of questions philosophy deals with.

We first classify knowledge into three fundamental disciplines, as the specifics differ by discipline. Note that philosophy is not concerned with the distinctions between sub-fields of these disciplines -- the distinction between biology, sociology and particle physics is not relevant to philosophy, as the distinctions are based on specific features of the real world. Philosophy must remain valid regardless of any knowledge we know from observation, any moral beliefs, or even any specific logical systems.
  • Analytical knowledge (mathematics) -- the study of logical connections between (any abstract) statements, i.e. A implies B. 
  • Positive knowledge (physics) -- the study of logical connections between empirical statements
  • Normative knowledge (ethics) -- the study of logical connections between moral statements
Philosophy is also not concerned with the different methods used by different sub-disciplines in practice, or whether these methods happen to be reasonable, as these are purely the activities of humans. Studying how science is done by humans is a sociological field, and a subject of positive knowledge, not philosophy.

Mathematics

Mathematics is fundamentally about logic, or reason, i.e. given some statement, what are its logical implications? It is important to realise that the logic we're talking about is pure, and independent of any scientific knowledge we might have, and also doesn't refer to the predictions of a scientific theory. For example, if you use a pH meter in an experiment and deduce, using your knowledge about the pH meter, that the pH of the solution is equal to what is shown by the pH meter, then your theoretical knowledge isn't reason, it is one of your premises.

Here's what logic is not:
  • Theoretical prejudice -- Often, people like to claim that there is a conflict between logic and observation. This is presented as a conflict between "rationalism" and "empiricism". This so-called "rationalism" has nothing to do with real reason -- instead, it is simply conformity to the existing theory.
  • Utilitarianism -- Utilitarianism is a specific ethical theory, and it starts with the fundamental premise that "maximise aggregate happiness" is always the right choice to be made. There is no way to rationally argue that this premise is correct, that being a psychotic murderer is a bad thing, or even that acting out of emotion is a bad thing.
  • "It's only logical" -- The following is not only not an example of logic, it is also an unsound ethical argument: "A toy is meant to be played with, therefore one should play with a toy"
  • Risk-averse behavior -- This is another example of the claim that an action or behavior is somehow logical or illogical. It is not. An action can be logical or illogical with respect to an ethical premise or system (more on this later), not in itself illogical. For the record, all action involves risk, and there is always an optimal amount of risk to take depending on the expected utility of each choice available. (In fact, from a utilitarian perspective, I would argue that people take too few risks. It is worth noting that even the enjoyment attained from uncertainty can be weighed accordingly in a utilitarian calculation.)
  • "Logicality" of languages -- it's common among linguistic fanatics to claim their languages as somehow "logical". What they mean is really that their language is intuitive to them, or that it is structured.
I've only listed a few examples, generalise these to other situations whenever you hear or think the word "logic".

We can make the following general observation: the words "logical" and "illogical" can never be applied to individual statements, but only to systems of statements, such as arguments.

So in mathematics, one works with some set of statements/premises that do not contradict one another (i.e. they are consistent) and derives all the logical implications of these statements. The fundamental statements are called axioms, and the statements that derive from them are called theorems. These systems, taken together, are called theories.

The reason that this axiomatic way of doing mathematics is useful is twofold:
  1. To a pure mathematician, it makes it easier to identify contradictions in a theory.
  2. To an applied mathematician (e.g. a financial analyst trying to model some stock with a certain mathematical theory), this means that he only needs to verify that the real-life phenomenon satisfies a small list of properties (the axioms), and all the theorems of the theory would apply to the physical phenomenon.
The second is essentially why mathematics is so useful in other disciplines -- we often have unrelated objects in the physical world that follow the same set of laws. In mathematics, one studies the kind of laws that appear often and derives their logical implications -- these logical implications then form more laws that apply to the physical object. The applied mathematician's job is to identify the relevant mathematical object and show that the physical object satisfies its axioms.

An important note regarding axioms -- it is often said that an axiom is a "self-evident truth", i.e. an obvious truth not regarded as requiring proof.

This is a complete misrepresentation of what an axiom is. An axiom is indeed stated without proof, but its choice is completely arbitrary $(*)$, not based on its "obviousness". In order to illustrate our point, let's take the example of Euclidean geometry.

Euclidean geometry is based on five axioms, of the nature of "a line is defined by two points", etc. While we use terms like "line" and "point", it must be noted that Euclidean geometry itself is a completely abstract system, and a line or a point does not actually have any basis in reality. This cannot be emphasised enough: the whole "drawings on a board" thing we associate with Euclidean geometry is simply an application of visual geometry -- drawings on a board happen to be described by this abstract axiomatic system called Euclidean geometry, if we correspond the "line" in the abstract system to a "line" as we see on our board, etc. Fundamentally, Euclidean geometry is just a bunch of abstract objects, the laws they follow, and how these follow from our five basic axioms. A course on Euclidean geometry that contains diagrams and geometric intuition is akin to a course on differential equations that explains DEs based on their applications in circuits and harmonic oscillators -- pedagogically useful, but that doesn't mean DEs are intellectually equivalent to some physical systems.

However, Euclidean geometry is not the only axiomatic system in the world. Even an axiomatic system similar to Euclidean geometry but with one axiom different would still be perfectly reasonable, assuming there are no contradictions between the axioms. It would not, however, model a flat plane, but a different structure -- perhaps some sort of a curved surface. The fact that a certain physical object satisfies a set of axioms must be demonstrated, not accepted based on some "obviousness".

The point is that any consistent axiomatic system is acceptable, so you don't need to "choose" axioms. Axioms need not be obvious at all -- for a simple example, Euclid's parallel postulate might be replaced with the Pythagorean theorem, and the parallel postulate, and with it the rest of Euclidean geometry, will arise as theorems.

$(*)$ Note that here, by arbitrary we mean there is no such thing as an axiom being right or wrong -- there are reasonable ways in which the actual axiomatic systems we study are chosen, such as practical applicability. This was the point of our earlier statement about applied mathematics.

Finally, we come to the question that links mathematics to our other mentioned disciplines: physics and ethics are both sub-fields of mathematics where the axioms are chosen based on certain special conditions.

Physics - logical positivism

Here we define physics: physics is the study of our universe. Mathematics gives us the description of every mathematically possible universe, and we can employ a specific axiomatic system whose axioms describe a certain real-life system, to describe the real-life system. An example of such a real-life system is the physical universe -- in principle, we may study the specific axiomatic system whose theorems (i.e. predictions) agree with our observations about the universe.

On a sidenote, we may also extract approximate "effective theories" -- these are often called models -- of special physical systems based on a completely different axiomatic system -- for example, a theory of particle physics, a theory of the solar system, a theory of biology, etc.

This is the important point: the origin of all positive knowledge is in observation. Note that this observation may be a rudimentary one made without any equipment or a sophisticated experiment; it may be a deliberate experimental observation or a standard fact we know from reality, like "humans exist" (whose application is called the "anthropic principle").

Before we go any further, we must clarify what kind of statements are actually meaningful, especially in physics.

The word "meaningless" is often thrown around aimlessly. Here, we define the word more clearly, by listing examples of statements that are not meaningless:
  • Practically pointless questions aren't meaningless. For example, "what is the sum of the averages of the phone numbers, treated as if they were in a factorial number system, of right-handed people with a detached earlobe and a prime number rounded income in their respective currencies, in Bangalore and New York?"
  • Physically wrong statements aren't meaningless, neither are morally wrong prescriptions. For example, "there are no women in Australia" is physically untrue, which can be verified by finding a woman in Australia. However, the statement is not meaningless, as its meaning is quite clear, and something that can be verified/falsified with an empirical observation.
  • Mathematically wrong statements aren't meaningless. For example, "John and Jojo are older than each other" is mathematically impossible under the axioms that define age. But it isn't meaningless -- just false, because it contains a contradiction.
A meaningless statement is one that is syntactically and grammatically OK, but does not actually convey any meaning. For example, "What is the speed of pi?" is a meaningless statement, because in the axioms of a system that defines pi, no attribute called "speed" is associated with a number, or with pi. Similarly, "Is Caesar a prime number?" is meaningless.

Let's try to think about meaninglessness in the context of positive knowledge. Since the origin of all our positive knowledge is in observation, any physical statement is fundamentally a statement of what exactly we observe with our senses. For example, the statement "there is a table here" can in principle be reduced to a statement of the sort of "my eyes observe a certain pattern of light originating from this region...", where "light" is also expanded in its definition to refer to exactly what we observe, etc.

It must be possible to reduce every physical statement to a statement of an observer's observation (we'll call this positive language), or it is meaningless. Therefore, the following questions/statements are meaningless:
  • "Do quarks really exist, or are protons just elementary particles that behave exactly like they would if they were made up of quarks?"
  • Are we brains in vats?
  • Superdeterminism (a conspiracy theory stating that the laws of physics conspire to tamper with all our observations so we never discover their truth -- this is meaningless, because the laws of physics are fundamentally about what we observe)
  • Self-awareness/qualia
  • What was there before the universe
  • Did the big bang really happen, or did a bearded guy in the sky just arrange the universe last Thursday as if it had? (last Thursdayism)
These are metaphysical statements, and all metaphysics is meaningless. When we talk about the big bang or any historical event having happened, what we are essentially saying, in the language of positivism, is that our observations today "look as if" (based on some theoretical narrative about how things evolve with time) that historical event happened. Our only actual knowledge is of the present, of this very instant -- the past is merely a metaphysical construction.

For another example, consider our memory of past events -- all we know, right now, is that we have such a memory -- that some neurons in our brain are connected in such-and-such a way. When we express this by talking about our past, e.g. when we say "I slipped on a banana peel yesterday", what we're doing is essentially a linguistic trick, or training our minds to create a past.

Note that this understanding of history doesn't mean that it is impossible to determine historical facts. Historical narratives are simply convenient metaphysical gauges to describe precise statements about what one would observe if one made certain archaeological digs, etc. In principle, it would, for example, be possible to precisely map the consequences observable in the modern day of different historical theories.

In other words -- metaphysics is meaningless, because the only real physical meaning (knowledge) is what we directly observe.

We therefore have a definition of meaninglessness: any statement that is not analytical (of the nature "A implies B"), positive (of the nature "I will observe...") or normative (of the nature "I should act ...") is meaningless.

Note how we never use the word "exist", like "the only meaning that exists is what we directly observe" -- this is because the meaning of the word "exist" varies by discipline. In mathematics, it means logical consistency; in physics, it means empirical reality; in ethics, one might say it means moral acceptability. Metaphysics does mathematically "exist", in the sense that it does not create logical inconsistencies if metaphysical objects are considered abstract mathematical, logical structures.

In fact, there is nothing wrong with holding such metaphysical beliefs as a gauge to view the world through for the sake of personal comfort -- it is not the case that viewing oneself as the only observer and moral actor and the rest of the universe as existing merely in one's observations is the only correct gauge to view the world. My own metaphysical frame contains things like, "mathematics is the system of all possible universes, our universe is just one in this platonic realm, there is one consciousness which keeps instantly swapping through people's heads, we are not brains in vats, the many-worlds interpretation of quantum mechanics". Logical positivism just means recognising that all metaphysical gauges are equivalent.

You might've noticed we've often talked about the meaninglessness of questions. It seems philosophically interesting to understand what a question actually is. A question is just another way of writing a statement. Rather than stating a statement X (and responding with "True" or "False"), we may write "Is X true?" and respond with "yes" or "no".

Why would this be useful? Suppose we want to make several statements, perhaps $\aleph_0$ or even $2^{\aleph_0}$ of them -- here's an example of the latter:

"John weighs 500N."
"John weighs 499.92N."
"John weighs 0.001N."
...

Where only one of the statements may be true. It is more efficient to write "What does John weigh?", and the answer would select which of the statements is true.

We now turn our attention to the question of verification. Logical positivism makes it clear that the statement S implies the answer to "How is S to be verified?" by a few simple logical arguments.

In order to test a physical theory, one must know its predictions. Only if a theory makes testable predictions is it meaningful -- this, we know from logical positivism.

There are two ways a theory or model might make predictions -- either in the form $P\to Q$ or in the form $P=Q$, where $P$ is the theory and $Q$ is the prediction. The latter case is characteristic of simplistic models whose predictions are not generic, but are all about events within a finite amount of time. $P\to Q$ is more useful, and most of our standard scientific theories (which we take seriously) fall into this category.

(See related: Hume on the fallacy of induction)
(Note also, that if your prediction only holds true until a finite time, then it is useless, in fact meaningless afterwards, i.e. it is meaningless as soon as you know it is true $(**)$.)

Since predictions can be tested directly, we try to invert the prediction to derive the truth of the theory from the truth of the prediction. If $P=Q$, then $Q\to P$ and $\neg Q\to\neg P$. If $P\to Q$, then we cannot prove $P$ from $Q$ (as $Q$ may arise from other causes), but we can use the contrapositive, $\neg Q\to\neg P$. This is called falsification.
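The asymmetry between verification and falsification can be checked mechanically with a truth table. The following is a minimal sketch, assuming $P=Q$ is read as logical equivalence; the helper names `implies` and `entails` are made up for this illustration.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b is false only when a is true and b is false."""
    return (not a) or b

def entails(premises, conclusion) -> bool:
    """True if the conclusion holds in every assignment of (P, Q) satisfying all premises."""
    return all(
        conclusion(p, q)
        for p, q in product([False, True], repeat=2)
        if all(premise(p, q) for premise in premises)
    )

theory_predicts = lambda p, q: implies(p, q)  # P -> Q
theory_equals = lambda p, q: p == q           # P = Q, read as equivalence (assumption)

# Falsification: from P -> Q and not Q, infer not P.
falsification_valid = entails([theory_predicts, lambda p, q: not q], lambda p, q: not p)

# "Verification": from P -> Q and Q, try to infer P.
verification_valid = entails([theory_predicts, lambda p, q: q], lambda p, q: p)

# In the P = Q case, Q does give P back.
equivalence_verifies = entails([theory_equals, lambda p, q: q], lambda p, q: p)

print(falsification_valid)   # True
print(verification_valid)    # False
print(equivalence_verifies)  # True
```

The assignment with $P$ false and $Q$ true is the one that breaks verification: the prediction comes out right even though the theory is wrong.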

$(**)$ Falsification as the only valid test
Suppose we have the statement "John is 1 metre tall". It is tempting to believe that mere verification is sufficient to test this statement, as in -- if one measures John's height and he turns out to be 1m tall, then we know for a fact that the statement is true. Let's analyse the statement more closely, however. In simplified positive language, it reads "if one measures John's height, then one would see the measurement 1m." This is a prediction. If one "verifies" the statement at some point in time, it is still not shown that it will be true a few seconds later.

"Okay," you say, "What if the statement is: If the time is 31 July 2016 at 12:10:00:... and one measures John's height, one will see the measurement 1m? Will verification not be enough then?" The problem with the statement is that taken literally, it becomes meaningless as soon as the time has passed. The statement can be rewritten in positive language, however, to include another clause: "at any point after this time, one will have the memory of seeing the measurement 1m (and it may be demonstrated, by observing the memories of others, that the observer's memories have not been tampered with), or may do some experiments, such as looking at the arrangement of the particles in the universe right now to work backwards towards the arrangement of the particles back then and determine the marking on the ruler, to verify that this is true." In this phrasing it becomes clear that our statement is once again a prediction, and can only be falsified.

Note that predictions are often probabilistic, and even have to be, as a result of quantum mechanics. In this case, no test is definite, only probabilistic -- for this purpose, we have tools like confidence levels, Bayesian inference, etc., which we will cover later. In the previous example, for instance, it wouldn't be possible to deterministically determine the earlier arrangement of particles in the universe, but one can, in principle, produce a schema to assign probabilities to the different possible earlier particle arrangements, hence assigning a probability for the statement to be true.
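As a rough illustration of what a probabilistic test looks like, here is a minimal sketch of a single Bayesian update. The function name `bayes_update` and all the numbers (the prior and the two likelihoods) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def bayes_update(prior: float, p_obs_if_true: float, p_obs_if_false: float) -> float:
    """Posterior probability of a statement after one observation (Bayes' rule)."""
    evidence = prior * p_obs_if_true + (1 - prior) * p_obs_if_false
    return prior * p_obs_if_true / evidence

# Statement: "one measures John's height and sees 1m".
# Hypothetical numbers: we start undecided, and the ruler reads 1m
# with probability 0.9 if the statement is true and 0.2 if it is false.
after_match = bayes_update(prior=0.5, p_obs_if_true=0.9, p_obs_if_false=0.2)

# A second measurement disagrees; a disagreeing reading has probability
# 0.1 if the statement is true and 0.8 if it is false.
after_mismatch = bayes_update(prior=after_match, p_obs_if_true=0.1, p_obs_if_false=0.8)

print(round(after_match, 4))     # 0.8182
print(round(after_mismatch, 4))  # 0.36
```

Neither observation settles the matter; each one only moves the probability assigned to the statement.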

There are critiques of positivism, including from Popper, asserting that positivism takes $P\to Q$ to mean $Q\to P$. This is a strawman argument, as this is not actually claimed by positivism.

Another critique of positivism takes the form of "positivism itself is not an analytical, positive or normative statement". The problem with this criticism appears immediately -- without positivism, not only metaphysics, but also statements like "Is pi a nice person?" become meaningful. The criticism is wrong, anyway -- positivism is a law of logic, and an analytical statement, as philosophy must be. One may say that the specific claims of positivism, such as the claim that "Is pi a nice person?" is a meaningless statement, are restatements of the axioms that define "pi", which make no mention of a property called "nice person".

An important point is made here regarding the relation between philosophy and mathematics (analytical knowledge). Philosophy discusses knowledge in general, as does mathematics. Indeed, philosophy is essentially "popular mathematics", like what popular science is to science. XKCD puts it best: philosophy's just math sans rigor, sense and practicality. (When a comic strip does a better job explaining your domain than your textbooks, you know the field is in trouble)

To be fair, xkcd gets the physics right too -- it's math constrained by precepts of reality -- and physics isn't in trouble.

Ethics

We make the following analogy between physics and ethics, both of which are subfields of mathematics. This table gives you all the philosophy you'll ever need.

| Discipline | Mathematics | Physics | Ethics |
| --- | --- | --- | --- |
| Scope | Everything | Predict observations | Prescribe actions |
| Premises | Axioms | Postulates | Principles |
| Reasoning | Logic | Logic | Logic |
| Conclusions | Theorems | Predictions | Prescriptions |
| Criterion to choose axiom | None | Observation | Personal acceptance |
| Logical systems | Mathematical theories | Physical theories | Ethical theories |
| Agent | None | Physical observer | Ethical actor/Moral agent |
| Convenient simplifications that eliminate the agent (like a ZIP file, it's smaller but it needs to be unpacked) | None (no agent to kill) | Physical noumenalism (e.g. "the table is there", rather than "I detect light from...") | Ethical noumenalism (e.g. "Drugs should be decriminalised", rather than "I should consider political possibility of drug decriminalisation positively by such-and-such amount in my voting, and write things in support of decriminalising drugs") |
| Objects | Mathematical objects (e.g. vectors) | Physical objects (e.g. displacement, energy, black holes) | Ethical objects (e.g. rights, duties) |
| Practical connection | None | Experiments | Moral-dilemmas |
| Third-person arguments | None | Thought experiments | Thought moral-dilemmas |

(Phrases in red are non-standard) We'll soon get into talking about what exactly we mean by each, but let's first clarify a few things regarding ethics:
  • Every action raises an ethical question -- Even a question like "should I push this table?" or "should I eat more celery?" or "should I attend this university?" is a question of decision-making, and is considered an ethical question.
  • Ethical axioms cannot be derived through reason -- Unlike in physics, where postulates are tested through sense perception, there is no rational way to argue for or against a fundamental principle like "maximise aggregate happiness". This often undermines the relevance of science to moral decision-making, although science still does play a role in the reasoning from the axiom to the conclusion.
  • The axiomatic system should still mathematically exist -- In other words, the axioms should be precise and mathematically consistent. For example, if one can demonstrate a situation where there is only a choice between stealing and lying, it would not be consistent to adopt "do not steal" and "do not lie" as simultaneous axioms. However, one may start with a more fundamental axiom and derive the two statements as theorems under specific circumstances which do not overlap in the region of contradiction.
  • The fact that ethical axioms may validly be chosen based on any arbitrary system of personal acceptance does not mean one may not use non-arguments to convert people into one's moral ideology -- Indeed, such a non-argument may be viewed as a process that has nothing to do with reasoning or convincing, but some kind of therapy administered to the subject's brain. Whether this is moral or not depends on your ethical principles.
  • A lot of pseudo-philosophical questions starting with "should" are actually ethical ones -- E.g. "Should we use the scientific method...?", "Should we hold metaphysical beliefs...?", etc.
  • "No time to think about ethics" is a meaningless statement -- Ethics is about all decision-making, and if you believe, for example, that in extreme situations it can be justified to do something you would otherwise consider immoral, then it means you consider that thing to be moral in such a situation, whether you like it or not. It means that your morality is actually derived from some more fundamental principle -- i.e. if you're willing to compromise on "do not lie" in specific situations, then it means that your fundamental ethical axioms do not include "do not lie".
  • The scope of ethics is ONLY to prescribe actions -- It is not to judge the morality of a person, for instance, which is meaningless (in the strict sense of the word) gossip, not ethics. Neither does ethics simply mean the overall utility of your actions throughout your life should be zero, unless that's what your ethical principles prescribe (i.e. that's not what ethics fundamentally is about, but those might be your ethical principles).
  • It might be immoral to think about ethics -- For example, if an ethical calculation took so long to solve that doing so would be so inefficient it would reduce utility more than any of the moral choices would, a utilitarian would prescribe that the moral agent simply make a guess, or intuitively decide on an action. An ethical theory provides, in principle, the most moral course of actions through a moral agent's life -- inefficiencies in resolving these prescriptions are justified, and one should only seek to minimise them. This is analogous to technological limitation in testing a physical theory. It is also analogous to a lot of economic decisions where it might be too expensive to find out precise individual information about a transaction, so one may make generalisations based on certain characteristics of the transaction -- e.g. looking at someone's credit record, etc.
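The consistency requirement on ethical axioms from the list above can be sketched as a small check: a set of prohibitions is consistent only if every possible situation still leaves at least one permitted action. This is a toy model under stated assumptions; the function `consistent`, the situations and the action labels are all made up for illustration.

```python
from __future__ import annotations

def consistent(forbidden: set[str], situations: list[set[str]]) -> bool:
    """Prohibitions are jointly satisfiable if every situation leaves some permitted action."""
    return all(available - forbidden for available in situations)

# Each situation is the set of actions available to the agent (hypothetical).
situations = [
    {"steal", "lie", "do nothing"},  # an ordinary situation
    {"steal", "lie"},                # the dilemma: only stealing or lying is possible
]

print(consistent({"steal"}, situations))         # True
print(consistent({"lie"}, situations))           # True
print(consistent({"steal", "lie"}, situations))  # False
```

Each prohibition is fine on its own; it is only the pair, taken as simultaneous axioms, that fails once the dilemma is admitted as a possible situation.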

In order to understand ethics, we consider the age-old trolley problem and perform a Socratic dialogue on a solution.

Introduction: Our two characters are Simplicio and Socrates. Simplicio, who thinks the fat man should be killed, believes that his moral beliefs are completely rational, in that he can argue for them without making any unphysical or unmathematical assumptions.

Socrates: Why would you push the fat man onto the track?
Simplicio: The expected value of his life is worth less than the expected value of the combined lives of the 5 people who'd otherwise die.
Socrates: Where is the logical implication from that to "I should push the fat man onto the track"?
Simplicio: Well, if I didn't, I would effectively be killing the five people instead.
Socrates: So? There is still a difference -- you feel more responsible for the death of someone you push onto the track.
Simplicio: But acting based on that principle would be selfish.
Socrates: Is there a law of logic which says "do not be selfish"?
Simplicio: Well, if you had a lever that would deliver unlimited pleasure to you while dooming the rest of humanity to eternal despair, would you pull it?
Socrates: You're not making direct logical arguments now, but I'll play devil's advocate: suppose I would.
Simplicio: If you did, why would you be convincing me to agree with your moral ideas? Wouldn't you want me to adopt a moral principle that makes me work for your benefit?
Socrates: Perhaps I have a moral principle that says "maximise self-interest, except when arguing with Simplicio, then maximise the humour of the readers instead". But no matter, this is irrelevant. You are not making a logical argument.

The argument never reaches its conclusion, because Simplicio has a fundamental ethical principle -- "maximise aggregate happiness" -- he refuses to disclose.

We'll now briefly discuss the "convenient simplifications that eliminate the agent" and "objects" mentioned in our table.

Often, it is annoying to have moral discussions while expressing every moral prescription in fundamental, agent-specific terms, especially when that isn't the point of the discussion. For an example, take the statement "drugs should be decriminalised". In most modern democracies, there is no one individual with the power to decriminalise drugs -- rather, power is quite distributed, and it wouldn't even be accurate to say that the statement is equivalent to saying that people in parliament should vote for laws decriminalising drugs. For example, one may say "drug decriminalisation would be beneficial on its own, but doing so would make my political party very unpopular, thus reducing the likelihood of being able to pass/block other, more significant legislation, therefore it has an overall negative effect".

What one usually really means when they say they support drug decriminalisation or that drugs should be decriminalised, is that anyone in power -- and that includes a voter -- should assign positive utility to the direct consequences of decriminalisation of drugs while making political decisions (such as voting). The reason this specific set of consequences (the direct ones) is often useful to distinguish is that the method one uses to evaluate these consequences is very different (and therefore the arguments actually made are very different) from the method one uses to evaluate the other consequences, such as political pragmatism -- an argument on these consequences can thus simply be a debate on the positive (in the sense of positive vs. normative) consequences of drug decriminalisation from some ethical premise.

The other consequences are then irrelevant to the debate of "Drugs should be decriminalised". This way, one has simplified our language and made communication a lot easier, even though the word "should" is fundamentally attached to the actions of an ethical actor, i.e. "I should... one should..."

Note, however, that it is still important to ensure that such imprecise language actually means something, i.e. can be converted into a precise form (we'll call this "normative language"), in terms of the actions of an ethical actor and the observations of a physical observer. Make this a practice in your own argumentation (you should!).

An important point to be mentioned is that your moral beliefs are fundamentally about your actions. As an example for why this is important -- consider the Non-Aggression Principle, which many libertarians cite as the ethical basis for their political beliefs. However, if your fundamental ethical principle is the Non-Aggression Principle, that wouldn't actually imply that you should be a libertarian, as it only means that you shouldn't violate anybody's rights, not that you should act politically to stop other people from violating people's rights. The ethical principle you should cite might be "minimise aggression", but "non-aggression" is clearly not it.

One often hears words like "rights" and "duties" thrown around. These do not seem to correspond to anything in our ethical vocabulary so far.

These terms have meaning within the structure of an ethical theory. For example, in a rights-based ethical theory, the word "right" is interpreted in the following way: "People have a right to free speech" = "One should not coercively suppress a person's speech."

The word "duty" seems to refer to moral obligation in general, but is usually used within the structure of an ethical theory which makes prescriptions discretely (more on what this means later).

These, which we call ethical objects, are constructed to simplify the language of ethical arguments, but it is necessary to always keep in mind the real, fundamental meaning of the statement referring to an ethical object. It is always essential, for a statement to be meaningful, that a statement containing an ethical object can actually be written in normative language.

Finally, we discuss a few specifics:
  • Continuous and discrete ethical theories -- Some ethical theories provide moral prescriptions in every situation -- an example is utilitarianism. Others, like rights- and duties-based ethical theories, only prescribe or proscribe a specific action or discrete set of actions. E.g. any action is equally moral, as long as you don't violate other people's rights. Or -- any action is equally moral, as long as you do your duty.
    • The traditional bifurcation is "consequentialism vs. deontology". However, this is problematic. Ethics is fundamentally about making choices between actions, and saying that a certain action can be universally immoral requires the choice of a universal non-immoral inaction to compare it against. To demonstrate that such an inaction does not exist, consider the following thought-moral-dilemma: you're driving a car with your foot on the accelerator when a child steps onto the road. If and only if you release the accelerator, the child won't die. Is it inaction to not move your foot or is it inaction to not continue to accelerate the car?
    • Furthermore, deontology and consequentialism, as usually defined, cannot actually be distinguished, as explained in the next point.
  • Utility functions -- A convenient universal formalism to express ethical theories in -- similar to the least-action formalism in physics -- is the "utility function". Essentially, the theory is stated as a principle of the same form as utilitarianism, "maximise utility", except with an arbitrary definition of utility. Since all ethical theories can be stated in this formalism, it gives a simple test of whether something is a meaningful ethical theory: check whether it can be stated this way.
  • Resolving Buridan's ass -- Buridan's ass is a moral dilemma in which a donkey is presented with two symmetric (i.e. equal-utility) ethical choices and can't choose between them -- this is often stated as a criticism of utilitarianism. However, there is no real paradox, as there are plenty of ethical theories, including the amoral "do whatever" and all discrete ethical theories, which do not always provide an ethical prescription between two choices. Indeed, the donkey should simply choose between the two at random.
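
To make the utility-function formalism a bit more concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the argument above: the names `utilitarian`, `rights_based`, `best_actions` and `choose` are hypothetical, the welfare numbers are made up, and encoding a rights-based theory as "0 for any permitted action, -1 for a violation" is just one possible way of writing a discrete theory as a utility function.

```python
import random
from typing import Callable, Dict, List, Sequence, Set

# A "theory" here is just a function from actions to a number (hypothetical encoding).
Utility = Callable[[str], float]


def utilitarian(welfare: Dict[str, float]) -> Utility:
    """Continuous theory: every action gets its own score."""
    return lambda action: float(welfare[action])


def rights_based(violations: Set[str]) -> Utility:
    """Discrete theory: all permitted actions are equally moral (0),
    and any rights violation is worse (-1)."""
    return lambda action: -1.0 if action in violations else 0.0


def best_actions(actions: Sequence[str], utility: Utility) -> List[str]:
    """All actions tied for maximal utility, in their original order."""
    top = max(utility(a) for a in actions)
    return [a for a in actions if utility(a) == top]


def choose(actions: Sequence[str], utility: Utility, rng=random) -> str:
    """Maximise utility; break ties at random, as Buridan's donkey should."""
    return rng.choice(best_actions(actions, utility))


# Buridan's ass: two equal-utility bales of hay, plus starving.
donkey = utilitarian({"left_hay": 5, "right_hay": 5, "starve": -100})
print(choose(["left_hay", "right_hay", "starve"], donkey))

# A discrete theory leaves many actions tied, and that is fine too.
no_theft = rights_based({"steal"})
print(best_actions(["work", "rest", "steal"], no_theft))
```

Both kinds of theory go through the same `choose` function, so a tie between options is not a paradox here: it just means the set of best actions has more than one element.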

What about technology? What about art?
Technology and art are not knowledge -- they are not fundamentally statements, and it's baffling that some philosophers regard these as areas of knowledge. They are just "things", they are wealth, and they aren't knowledge any more than a potato or any other "thing" is knowledge.

What is religion?
Religion is not a relevant area of knowledge for discussion in philosophy, because it is a social set of beliefs which may be right or wrong, not something fundamental to knowledge itself. Religion is usually just a snapshot of the social knowledge of the past -- the inferior science of a few thousand years ago, the moral (including legal/political) systems of the time, etc.