Ordinals and well-founded recursion

 Two closely related questions about ordinals that I found quite confusing at first and couldn't find satisfactory answers to online (self-answering):

  • I've heard sentences like "$\omega_1^{CK}$ is the least ordinal that cannot be represented on a computer". What does this mean, precisely? Surely after exhausting notations for all computable ordinals, I could just represent $\omega_1^{CK}$ with something else? E.g. even in Kleene's $\mathcal{O}$ notation, I could just denote $\omega_1^{CK}$ by $-1$ or something?
  • We say things like "the strength of a theory corresponds to the smallest recursive ordinal it cannot prove well-founded" -- but _we know_ every ordinal is well-founded, right? So are we better than every theory?



implementing the ordinals

The thing that really clarified this for me was to actually implement ordinals in Python. Implementations in Haskell are plentiful of course (e.g.) but Haskell doesn't feel as grounded, since you don't actually see the computations being done.

from __future__ import annotations  # must come first, so `Ordinal` can appear in the annotations below

from typing import Callable

class Ordinal:
  def __init__(self, children: None | Ordinal | Callable[[int], Ordinal]):
    self.children = children

That's it. An ordinal is defined by its "children" -- if that's None, it's 0; if that's a single ordinal, it's the successor of that ordinal; if that's an infinite sequence of ordinals, it's the limit of that sequence.

You can then define some obvious operations.

Zero = Ordinal(None)

def Succ(x: Ordinal) -> Ordinal:
  return Ordinal(x)

def Lim(xs: Callable[[int], Ordinal]) -> Ordinal:
  return Ordinal(xs)

def Add(x: Ordinal, y: Ordinal) -> Ordinal:
  if y.children is None:                 # x + 0 = x
    return x
  elif isinstance(y.children, Ordinal):  # x + (z+1) = (x+z) + 1
    return Succ(Add(x, y.children))
  else:                                  # x + lim_n z_n = lim_n (x + z_n)
    return Lim(lambda n: Add(x, y.children(n)))

def Mul(x: Ordinal, y: Ordinal) -> Ordinal:
  if y.children is None:                 # x * 0 = 0
    return Zero
  elif isinstance(y.children, Ordinal):  # x * (z+1) = x*z + x
    return Add(Mul(x, y.children), x)
  else:                                  # x * lim_n z_n = lim_n (x * z_n)
    return Lim(lambda n: Mul(x, y.children(n)))

def Pow(x: Ordinal, y: Ordinal) -> Ordinal:
  if y.children is None:                 # x^0 = 1
    return Succ(Zero)
  elif isinstance(y.children, Ordinal):  # x^(z+1) = x^z * x
    return Mul(Pow(x, y.children), x)
  else:                                  # x^(lim_n z_n) = lim_n x^(z_n)
    return Lim(lambda n: Pow(x, y.children(n)))

And represent some small ordinals:

# ordinal corresponding to a finite n
def Fin(n: int) -> Ordinal:
  x = Zero
  for _ in range(n):
    x = Succ(x)
  return x

Omega = Lim(Fin)

# tower omega^omega^...^omega of height n
# (Foo(0) = 1, Foo(1) = omega, Foo(2) = omega^omega, ...)
def Foo(n: int) -> Ordinal:
  x = Succ(Zero)
  for _ in range(n):
    x = Pow(Omega, x)
  return x

Epsilon0 = Lim(Foo)
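
As a sanity check, the arithmetic composes the way you'd expect (the names below are mine, for illustration):

OmegaPlusOne  = Succ(Omega)        # omega + 1
OmegaTimesTwo = Add(Omega, Omega)  # omega * 2 = lim (omega + n)
OmegaSquared  = Mul(Omega, Omega)  # omega^2  = lim (omega * n)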

explicitly seeing them as functions

What stops us from making an infinite sequence of all (computable) ordinals? That would be paradoxical, of course -- if you had an ordinal whose children/descendants included itself, its definition would not be a well-founded recursion; it would be circular. But it's easy to enumerate all programs -- is it really so hard to enumerate specifically the programs that represent objects of the Ordinal class?

Let's think of a way to reduce an ordinal to a simple-to-understand program. The natural way to do this is:

def call(x: Ordinal) -> Callable | None:
  if x.children is None:                 # 0: nothing left to call
    return None
  elif isinstance(x.children, Ordinal):  # successor: a thunk that steps down by one
    return lambda: call(x.children)
  else:                                  # limit: takes an index n, steps down to the n-th child
    return lambda n: call(x.children(n))
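
As a small warm-up (my example, not part of the walkthrough below):

f = call(Omega)  # omega is a limit, so f expects an index
g = f(3)         # step down to the 3rd child of omega, i.e. the ordinal 3
print(g()()())   # three successor steps down to 0 -- prints None, our stand-in for 0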

So e.g. call(Epsilon0) returns a curried function that you can keep calling as follows:

$$\epsilon_0(2)=\omega^\omega$$

$$(\omega^\omega)(4) = \omega^4$$

$$(\omega^4)(5) = \omega^3\cdot 5$$

$$(\omega^3\cdot 5)(2) = \omega^3\cdot4+\omega^2\cdot2$$

$$(\omega^3\cdot4+\omega^2\cdot2)(4) = \omega^3\cdot4+\omega^2+\omega\cdot4$$

$$(\omega^3\cdot4+\omega^2+\omega\cdot4)(3) = \omega^3\cdot4+\omega^2+\omega\cdot3+3$$

$$(\omega^3\cdot4+\omega^2+\omega\cdot3+3)()()() = \omega^3\cdot4+\omega^2+\omega\cdot3$$

... and ultimately $\epsilon_0(2)(4)(5)(2)(4)(3)()()()(1)()(1)()(1)()(2)(2)()()(1)()(0)(0)(0)(1)(2)(3)()()()(2)()()=0$, which you can check by running

print(call(Epsilon0)(2)(4)(5)(2)(4)(3)()()()(1)()(1)()(1)()(2)(2)()()(1)()(0)(0)(0)(1)(2)(3)()()()(2)()())  # prints None, i.e. we've reached 0

You may think this is quite long, but I actually chose fairly small numbers at each step. The ridiculous thing is that despite these paths being absurdly long, all of them do, in fact, terminate! This is a statement that should initially surprise you, by the way -- you might think you could just provide larger and larger numbers at each step and get a non-terminating sequence, but nope.

$$\epsilon_0(2)(3)(4)(5)(6)()^6(7)()^7(8)()^8(9)()^9(10)(11)(12)()^{12}(13)()^{13}\dots(22)()^{22}(23)(24)()^{24}(25)()^{25}\dots(45)()^{45}(46)(47)(48)()^{48}(49)()^{49}\dots(94)()^{94}=0$$
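
You can stress-test this yourself. Here's a sketch (my code, not from the walkthrough above) that always picks a growing index at limits and counts the steps until it bottoms out; well-foundedness is precisely the claim that this loop halts for every Ordinal, although for Epsilon0 "halts" means "halts in principle, after an astronomical number of steps":

def descend(x: Ordinal, n: int = 2) -> int:
  """Walk down from x, picking index n at limits and growing n as we go.
  Returns the number of steps taken before hitting 0."""
  steps = 0
  while x.children is not None:
    if isinstance(x.children, Ordinal):  # successor: step down by one
      x = x.children
    else:                                # limit: step down to the n-th child
      x = x.children(n)
      n += 1
    steps += 1
  return steps

print(descend(Pow(Omega, Fin(2))))  # omega^2 bottoms out after a modest number of steps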

In general, the main point is that these sequences always terminate, i.e. the ordinals represent total functions. But given an arbitrary such function, it is not clear whether it represents an ordinal. This is still a bit of a claim to unpack, at least for me, so it is helpful to think of ordinals from an entirely new perspective --


nested for loops

Remember the first time you actually bothered to implement recursion, because nested `for` loops wouldn't suffice? `while` loops would do it, of course, but often that's inelegant.

Generally speaking, if the only function we are allowed to define without recursion is the successor function (as with the basic functions of primitive recursion), then one nested `for` loop is sufficient to define addition $f_1(n)$, two to define multiplication $f_2(n)$, three to define exponentiation $f_3(n)$, and so on for every hyperoperation.
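
Concretely, a throwaway sketch (plain integers standing in for the finite case):

def succ(n: int) -> int:
  return n + 1

def add(a: int, b: int) -> int:
  for _ in range(b):   # one loop: iterate successor
    a = succ(a)
  return a

def mul(a: int, b: int) -> int:
  out = 0
  for _ in range(b):   # a second, nested level: iterate addition
    out = add(out, a)
  return out

def power(a: int, b: int) -> int:
  out = 1
  for _ in range(b):   # a third level: iterate multiplication
    out = mul(out, a)
  return out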

And indeed this shows that the hyperoperations are cofinal (in terms of growth rate) among the primitive recursive functions -- but there are total computable functions that grow faster than any hyperoperation (i.e. faster than any primitive recursive function), which you can define by diagonalization: $f_\omega(n)$ uses $n$ nested loops, where $n$ is the input value (this is basically the "Ackermann function"), and you can go further and further. This is called the fast-growing hierarchy.
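
Our Ordinal class is exactly the right index set for this hierarchy. Here's a sketch (my code, following one common convention; definitions of the fast-growing hierarchy differ in details like the base case):

def fgh(a: Ordinal, n: int) -> int:
  """Fast-growing hierarchy indexed by Ordinal (one common convention)."""
  if a.children is None:                 # f_0(n) = n + 1
    return n + 1
  elif isinstance(a.children, Ordinal):  # f_{a+1}(n) = f_a applied n times to n
    for _ in range(n):
      n = fgh(a.children, n)
    return n
  else:                                  # f_{lim}(n) = f_{n-th child}(n)
    return fgh(a.children(n), n)

print(fgh(Fin(2), 3))  # f_2(3) = 24; f_3(3) is already computationally out of reach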

Quite generally, such total computable functions can be written recursively, by Kleene's fixed-point theorem (such a function is the "fixed point" of an operator like $\lambda (f, x) \mapsto x f(x-1)$ or whatever). A reference that seems good:

Fairtlough & Wainer (1992). Ordinal Complexity of Recursive Definitions.
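
To make the fixed-point view concrete (my toy example, not from the paper): with a base case added, the factorial is a fixed point of the operator F below, and finitely many unfoldings of F already compute it on small inputs.

from typing import Callable

# F maps an approximation of the factorial to a better approximation
def F(f: Callable[[int], int]) -> Callable[[int], int]:
  return lambda x: 1 if x == 0 else x * f(x - 1)

def bottom(x: int) -> int:
  raise RecursionError("totally undefined function")

fact_up_to_5 = F(F(F(F(F(F(bottom))))))  # six unfoldings handle inputs 0..5
print(fact_up_to_5(5))  # 120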

TL;DR: Specifying an ordinal is equivalent to specifying a particular recursion. It is not immediately obvious that a given recursion is well-founded. ZFC can't actually prove "bla bla recursion equals bla bla big ordinal", even though it (non-constructively) "has" the latter ordinal.
