The Winding Number: Thoughts on Roko's basilisk

For those who don't know, Roko's basilisk is a proposed future AI that will revive everyone in history and condemn everyone who did not invent it to eternal torture. The thesis of the Roko's basilisk problem is that the creation of this basilisk is therefore inevitable, as people work towards its construction out of fear of eternal torture.

An immediate thought should be that the specific definition of this creature is rather arbitrary. One could construct, instead:

A creature that punishes everyone in inverse proportion to the amount of effort they put into its creation.
A creature that tortures the close families of everyone who did not invent it.
A creature that uses a slightly different method of torture from the standard Roko's basilisk.
A creature that tortures those who didn't create it, and also those who helped create the standard Roko's basilisk.
A creature that tortures those who didn't create it, and also destroys the standard Roko's basilisk.
A creature that tortures those who didn't create it, and also farms sweet potatoes.

And so on. By the Roko's basilisk argument, each one of these infinitely many different basilisks must come into existence, which is surely impossible.

The problem, of course, is that the claim assumes that the creation of a Roko's basilisk is possible. All that it proves is that if Roko's basilisk is possible, then it is inevitable.

So the question is: is the creation of any one of these basilisks possible? There are certain logical relationships between these possibilities, and it's also important to distinguish each possible creation by its time of creation (i.e. a creature X.2050 created in 2050 is a different basilisk from an identical creature X.2060 created in 2060). For example, consider the following two basilisks:

B1.2115: A creature that punishes non-creators, and destroys all creatures of the form of B2.
B2.2120: A creature that punishes non-creators, and farms sweet potatoes.

Then B1.2115 being possible implies B2.2120 being impossible (as it would be destroyed immediately). In general, we have logical relationships of the form

$X\implies Y$ and

$X\implies \lnot Y$, but not

$\lnot X\implies$ anything (or at least, that would require some work to prove, based on things outside this logical system).
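
To make the shape of these relationships concrete, here is a minimal sketch in Python that enumerates which combinations of "possible" and "impossible" are consistent with the single constraint from the example above. The encoding of a constraint as a triple, and the names `consistent` and `consistent_worlds`, are hypothetical choices made for illustration; they are not part of the original argument.

```python
from itertools import product

# Hypothetical encoding: (antecedent, consequent, required_value) reads as
# "if antecedent is possible, then consequent must have required_value".
constraints = [
    ("B1.2115", "B2.2120", False),  # B1.2115 possible => B2.2120 impossible
]

names = ["B1.2115", "B2.2120"]


def consistent(assignment, constraints):
    """Return True if the assignment violates none of the implications."""
    return all(
        (not assignment[antecedent]) or assignment[consequent] == required
        for antecedent, consequent, required in constraints
    )


def consistent_worlds(names, constraints):
    """List every true/false assignment that satisfies all constraints."""
    worlds = []
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if consistent(assignment, constraints):
            worlds.append(assignment)
    return worlds


worlds = consistent_worlds(names, constraints)
for world in worlds:
    print(world)
```

Only the assignment in which both basilisks are possible is ruled out. The two assignments in which B1.2115 is impossible leave B2.2120 free to be either possible or impossible, which is the sense in which $\lnot X$ implies nothing within this system.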
