Computable functions I: mu-recursion

When you were first introduced to functions, you were presented with the idea of functions as processing machines, which took in input and processed it to give out output.

On a tangential note: even those who are educated in the "maps between sets" or "special subsets of the product set" definitions often still think of functions in a way that reveals an implicit belief in "functions as machines". E.g. noobs who are dismayed that there is no "closed-form expression" for the inverse of $y = x^x$ but are perfectly satisfied with the inverse of $y = x+7$; children and ancients who are uncomfortable with non-terminating decimals $0.\dot{3}$, $3.141\dots$ It is irrelevant to mathematics that you were taught an algorism [not sic] to compute "subtraction by 7" but not one to compute "the Lambert W function", or that you are more familiar with representation in base 10 than in base 3 or base $\pi$.

This is the wrong way to think of a function -- a machine is not a function, it's an algorithm. There are two "ways" in which this distinction manifests:
  • A function may have multiple algorithms to compute it.
  • A function may have no algorithms to compute it.
The second might sound really unintuitive, but it should be completely expected.

In general, defining a function $\mathbb{N}\to\mathbb{N}$ means storing the value of $f(x)$ for every single natural number $x$. Yes, some functions can be specified without this -- by just giving a rule that determines $f(n)$ for any given $n$ (i.e. an algorithm for $f$) -- but it should be obvious that this isn't possible for every function.

Why?

Because ... because ... if we could, then we could store an infinite amount of data in finite memory!

(This suggests that there are functions we just can't write down or imagine -- exactly what this means will be formalized later, when we have the notion of models and languages.)
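To put a rough count on it (a sketch of the formal argument to come): an algorithm is a finite string over some finite alphabet $\Sigma$, and

$$\Big|\bigcup_{\ell\in\mathbb{N}}\Sigma^\ell\Big|=\aleph_0<2^{\aleph_0}=\big|\mathbb{N}^{\mathbb{N}}\big|$$

i.e. there are strictly more functions $\mathbb{N}\to\mathbb{N}$ than there are finite descriptions to give them.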



To formalize this more clearly, we need a precise notion of what exactly an algorithm is -- what exactly computation is.

Here's an example of a function we can actually write down: $f(x)=x^2$, which represents the function $(0, 1, 4, 9, \dots)$. Given any number $x$, we know what $f(x)$ is despite not having a table containing the values of $f(x)$ for every value of $x$.

(Take a moment to think about how amazing that is.)

How, though? Let's "unfold" the definition of $x^2$.

def square(x):
  return multiply(x, x)

def multiply(x, y):
  out = 0
  for i in range(y):   # multiplication is repeated addition
    out = add(out, x)
  return out

def add(x, y):
  out = x
  for i in range(y):   # addition is repeated succession
    out = successor(out)
  return out

def successor(x):
  return x + 1         # the one primitive we take for granted
Hm, interesting -- we have some basic functions (co-ordinate functions, constant functions, the successor function) and some operators that act on them (successive lines of code, for loops, while loops), and the free system generated by these is the set of all computable functions.

What about if conditionals, i.e. piecewise functions? Show that these can be written in terms of for loops. Hint: start with a basic function like $f(0)=0,\ f(s(n))=1$ and see how you can transform other conditionals into something similar.
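If you want to attempt the exercise, stop reading here. One possible solution, sketched in the same Python style as above (the names is_zero, positive and if_zero are my own):

def is_zero(x):
  # 1 if x == 0, else 0 -- the hinted function with its outputs flipped
  out = 1
  for i in range(x):   # the loop body runs at all iff x > 0
    out = 0
  return out

def positive(x):
  # the hinted function: f(0) = 0, f(s(n)) = 1
  return is_zero(is_zero(x))

def if_zero(c, a, b):
  # "a if c == 0 else b", with no if statement in sight
  return add(multiply(a, is_zero(c)), multiply(b, positive(c)))

The trick generalizes: any predicate computed as a 0/1 value can select between branches via multiplication and addition (note that both branches get evaluated, which is harmless for total functions).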

More formally, we have the following primitive functions which are computable:

  1. Constant functions -- $c(\mathbf{x}):=c$
  2. Projection functions -- $p_i(\mathbf{x}):=x_i$
  3. Successor function -- $s(x):=x+1$

And the following operators:

  1. Composition, i.e. successive lines of code.
    Given computable functions
    $$f_1:\mathbb{N}^n\to\mathbb{N}^m$$$$f_2:\mathbb{N}^m\to\mathbb{N}$$
    We have a computable function
    $$f:\mathbb{N}^n\to\mathbb{N}$$ Defined as: $$f:=f_2\circ f_1$$
  2. Bounded recursion (also known as primitive recursion), i.e. for loops.
    Given computable functions
    $$\text{“base”}\ \ f_0:\mathbb{N}^n\to\mathbb{N}$$$$\text{“recursor”}\ \ h:\mathbb{N}^n\times\mathbb{N}\times\mathbb{N}\to\mathbb{N}$$
    ($h(\mathbf{x},y,z)$ takes as input the iterate $y$ and the previous step's output $z$)
    We have a computable function
    $$[h:f_0]:\mathbb{N}^n\times\mathbb{N}\to\mathbb{N}$$
    Defined by recursion:
    $$[h:f_0](\mathbf{x}, 0)=f_0(\mathbf{x})$$
    $$[h:f_0](\mathbf{x}, s(y))=h(\mathbf{x}, y, [h:f_0](\mathbf{x}, y))$$
  3. Unbounded recursion (also known as minimization), i.e. while loops.
    Given a computable function
    $$c:\mathbb{N}^n\times\mathbb{N}\to\mathbb{N}$$
    We have a computable function
    $$\mu(c):\mathbb{N}^n\to\mathbb{N}$$
    Defined so that $\mu(c)(\mathbf{x})$ is the least $y$ for which $c(\mathbf{x},y)=0$ (undefined if no such $y$ exists).

This is known as the structure of general recursive functions, or $\mu$-recursive functions, and is one of the formulations of the notion of a computable function.
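To make the three operators concrete, here is a minimal sketch in the same Python style as before (the names compose, primrec and mu are my own illustrative choices, not standard library functions):

def compose(f2, f1):
  # composition: f2 after f1 (assuming f1 returns a tuple of its m outputs)
  return lambda *xs: f2(*f1(*xs))

def primrec(f0, h):
  # bounded recursion [h : f0] -- the recursion unrolls into a for loop
  def f(*args):
    *xs, y = args
    out = f0(*xs)
    for i in range(y):
      out = h(*xs, i, out)
    return out
  return f

def mu(c):
  # unbounded recursion -- a while loop searching for the least root of c
  def f(*xs):
    y = 0
    while c(*xs, y) != 0:
      y += 1             # may never halt -- this is where partiality enters
    return y
  return f

For instance, add = primrec(lambda x: x, lambda x, i, z: z + 1) and multiply = primrec(lambda x: 0, lambda x, i, z: add(x, z)) recover the earlier add and multiply.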

We immediately see the existence of non-computable functions: each computable function is specified by a finite expression in the primitives and operators above, so there are only countably many of them, while there are uncountably many functions $\mathbb{N}\to\mathbb{N}$. However, this doesn't really capture the idea of non-computable functions requiring an infinite amount of data to describe.
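In fact we can exhibit the mismatch directly with a diagonal argument (sketched): enumerate the total computable functions as $f_0, f_1, f_2, \dots$ -- possible, since each is a finite expression in the primitives and operators above -- and define

$$g(n):=f_n(n)+1$$

Then $g$ disagrees with every $f_n$ at input $n$, so it appears nowhere in the enumeration: $g$ is a perfectly well-defined total function that is not computable.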

An abstracted way to think about computer science -- and about machines and engineering in general -- is as a theory of automation. The mathematical abstraction of the machine is the algorithm, which is intuitively thought of as a set of instructions or a computational procedure. Pondering some examples of real-world machines helps us pin down another characteristic of importance to an algorithm: machines are often not completely autonomous -- they need to take inputs -- so algorithms can be considered procedures to compute functions.

Historically, the theory of machines was often formulated in terms of simple machines. The Ancient Greeks formulated the theory in terms of the "six simple machines" (the inclined plane, lever, wedge, wheel and axle, pulley, and screw); the Ancient Indians formulated the theory in terms of mechanisms that operate on "pressure, weight and rotation". Note that these were mechanistic descriptions (and not very fundamental ones, due to the lack of a serious theory of mechanics).

In general, these areas of machine theory fall into the general subject of control theory.

If I had to distinguish computer science from engineering/machine theory -- I would say that computer science is interested in the generality of machines. Computers are different from jackhammers, because the architecture of a computer allows the simulation of "any possible algorithm" that the jackhammer -- or any other machine -- could follow.

The computer is therefore fundamentally about control -- whatever actions you may want to encode in a machine, a computer is literally the best thing you can ask for, because it is the device that can produce any physically-possible-to-encode sequence of actions in the machine.



The Church-Turing thesis should therefore be seen as the fundamental thesis of computer science -- the precise specification of what machines/algorithms can do. It leads to the notion of Turing completeness, i.e. a particular machine actually acting as a universal computer (and a particular such instance is the universal Turing machine).

The Church-Turing thesis is the statement that the Turing machine -- or equivalently, mu-recursion, or lambda calculus -- can compute anything that can be computed at all. That it's the "best we can do", the ultimate limit (for which reason the thesis is sometimes known as the maximality thesis).

(Take a moment to appreciate how this means we've actually achieved optimization here. The computers we use are literally universal -- given enough memory and time. They are truly the pinnacle of human invention, to date.)

Note that this is a physical limitation, one that depends very specifically on the universe we live in. For example, a universe with time travel (the ability to send signals back in time) would be able to trivially solve the halting problem. The Church-Turing thesis is not a mathematical fact; it is a physical one. It can be considered an axiom of computer science, and if one day we do find a system that computes a strictly larger class of functions -- called a hypercomputer -- then we'd find that some significant portion of computer science, such as the very notion of a Turing machine, is not relevant to our universe.

A particular result that will be helpful in convincing us of the thesis is that the mu, lambda and Turing notions of computability are indeed equivalent. This is important, because at least to me, lambda calculus and mu recursion are far more intuitive than the Turing machine; I can really see that mu recursion is what I'm doing when I'm executing an algorithm, and I can really see that lambda calculus is the "natural" sense in which one would try to describe functions if they can be described at all.

We will see this result in the next article.
