Unnameable real numbers

14th July 2010

I was a bit puzzled by one section of Gregory Chaitin's book Meta Math!, the one entitled "Reals are un-nameable with probability one!" (pp 113-4). Well, to tell the truth, I'm still puzzled by quite a lot of the book, but it seems to me that Chaitin might have made a lot more of the concept of "nameability" introduced in that part than he actually does.

Perhaps I should just set out an argument of my own. It looks sound to me, but I'm rather uncertain - if you find any logical problem in it, please let me know!

Let's begin though with Cantor's diagonal argument, not used by Chaitin, but discussed here in Wikipedia.

Cantor's diagonal argument shows that there are infinite sets which are not countable, that is, cannot be put into one-to-one correspondence with the set of natural numbers. First we imagine an infinite sequence of infinite sequences of 0's and 1's. Here is an example, a little different from the one given in the Wikipedia article:

s(1) = 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...

s(2) = 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...

s(3) = 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, ...

s(4) = 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, ...

s(5) = 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, ...

s(6) = 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, ...

s(7) = 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, ...

s(8) = 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, ...

s(9) = 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, ...

The principle behind the sequences is first to repeat the two possible strings of length one in 0 and 1, that is, s(1) and s(2), then the strings of length two, apart from 00 and 11, that is, 01 and 10 in s(3) and s(4), then the possible strings of length three apart from 000 and 111, namely 001, 010, 011, 100, 101, 110 and so on. The first element of s(1), the second element of s(2) and so on have been underlined above to illustrate the diagonal. We form a new sequence s(0) by making its first element different from the first element of s(1), its second different from the second element of s(2) and so on. In the example above, s(0) will thus begin 1, 0, 1, 1, 1, 1, 1, 1, 0. Now s(0) is different from any sequence in the list so far: it can't be the same as s(1), for its first element is different, it can't be the same as s(2) because its second element is different, and so on. The sequence of sequences may be infinitely long, but s(0) is nowhere on it.

Of course that is not very surprising, given the example. The regular sequences we've arranged will go on for ever and a day: it's easy to imagine any number of possible sequences that won't be on the list. Think of the sequence, for instance, that consists of 1 followed by two 0's, followed by another 1 followed by three 0's, and which carries on with 1's interspersed by prime numbers of 0's. That certainly won't be there, and it's not s(0) either!

As far as I'm aware, popular proofs at least, like the one in Wikipedia, just go on to suggest that no matter how you organise your list, no matter how cunningly you arrange it so that every imaginable sequence gets in, it still makes no difference. You can still form the diagonal, and that will still be a sequence not on the list. The sequence of infinite sequences of 0's and 1's cannot be put into a countable list. No matter how you do it, there will be a sequence left off, so there is the proof by reductio ad absurdum.

That should be totally convincing, yet - I don't know about you, but it's always left some tiny fragment of doubt in my mind. How on earth could you set about ensuring that every possible pattern or lack of pattern of 0's and 1's that you can think of finds its place in the list? That of course is why Chaitin disliked the lack of constructiveness in such proofs.

But maybe there is a way to alleviate the doubt without being particularly constructive. This idea is based on Cantor's way of enumerating fractions. Suppose instead of just one sequence of sequences, we have a whole lot of them like this:

s(1,1)    s(2,1)    s(3,1)     s(4,1)  . . . . .

s(1,2)    s(2,2)    s(3,2)     s(4,2)  . . . . .

s(1,3)    s(2,3)    s(3,3)     s(4,3)  . . . . .

s(1,4)    s(2,4)    s(3,4)     s(4,4)  . . . . .

Here we could imagine the first column, s(1,1) to s(1,4) and beyond as being like s(1) etc in the example before. The next column, s(2,1) and so on, could represent a series of sequences based on a completely different pattern, so that they were all different from those in the first column. The third column would again be a quite different sequence of sequences, and so on. There is space for an infinite number of different ways of organising sequences of 0's an 1's, with never a repetition. Plenty of room for the imagination!

Now that we have all those sequences, we arrange them for counting as follows. First comes s(1,1), then s(1,2). Those are followed by s(2,1) and s(3,1). After that, s(4,1), s(3,2), s(2,3) and s(1,4). The pattern is plain enough, and clearly all the sequences in each column and row will eventually be counted. As for the diagonal, its first element will be the first element of s(1,1), its second the second element of s(1,2), its third the third element of s(2,1) and so on. The reductio ad absurdum will follow just as before, but more convincingly, I hope. It certainly convinces me rather more.

It is pretty obvious that the argument so far is not restricted to binary sequences. If there are n different elements instead of just two, then there are just more ways of forming diagonals which are not the same as any sequence already on the list.

Real numbers, of course, are just sequences of digits, and if we just consider those between 0 and 1, we can just imagine sequences of digits with a dot to the left. Real numbers that we just normally think of as stopping, like .5 for 1/2, can be thought of as followed by an infinite series of 0's, so that we still have infinite sequences of elements. Just as we can think of 1/7 as .142857142857... and so on for ever, so 1/2 can be .5000000... and so on for ever. The only problem is that in terms of infinite sequences, .5 can also be expressed as .4999999..., with an infinite number of 9's. Clearly enough, any number ending in an infinite number of 0's has an equal number ending in an infinite number of 9's. That seems trivial enough, but it can cause a difficulty with the diagonal argument. It is essential for that argument that the diagonal provides us with a number that is different from any one on the list that has been, or could be, enumerated. If there is more than one way of expressing a number, we can no longer be certain that the diagonal number is different. Maybe it is just a different way of expressing one of those already on the list. So if we really want to show that real numbers are uncountable, a somewhat more subtle argument will be needed. Such an argument is hinted at in the Wikipedia article, but not explained so clearly.

The Wikipedia argument involves using ternary numbers, but there seems to be no need for those. Sticking to familiar decimal numbers, we could avoid endless sequences of 9's simply by avoiding the digit 9 altogether. We just try to enumerate sequences of infinite sequences of 0's up to 8's only. All the sequences are unique: there are still those ending in endless series of 0's, but there are none ending in 9's which could have the same value. The diagonal argument goes through, for however the enumeration is done, a diagonal sequence can be formed which is not equal to anything in the list. And of course the diagonal sequence does not have to include any 9's either. That proves that the real numbers are not countable, for even if the ones including 9's were added, that diagonal would still not be included, since it has no 9 in it.

Now the names of real numbers are certainly countable - they can easily be enumerated. The first step is to arrange the symbols used in writing one's language, including a space and any punctuation, in some kind of arbitrary order. There can only be a finite number of such symbols altogether, and there may well already be an alphabetical order for some of them. Mathematical symbols, at least enough for arithmetic and algebra, are also added to the list. So for English the list of single symbols might begin with [space] and end with, say, Σ:

[space], a, b, c, ... , z, 0, 1, 2, ... , 9, ... , ., ,, !, ... , +, -, ... , Σ
So then of course [space] is number 1, a is 2, and so on. Let us suppose that Σ is number N, rather than trying to count exactly. What to put in and what to leave out is a bit arbitrary anyway. I have included "!" in the list, and we might use "5!" to indicate "5 x 4 x 3 x 2 x 1" and so on, but it is not really necessary. On the other hand, an exclamation mark is certainly vital for writing mathematics in Chaitin style! Anyway, after all the single symbols, we can continue with all possible pairs, from N+1 to N2+N, then all the triples from N2+N+1 to N3+N2+N and so on ad infinitum:
[space][space], [space]a, ... , [space]Σ, a[space], aa, ... , ΣΣ, [space][space][space], [space][space]a, ... , ΣΣΣ, ...
Clearly all possible strings of letters and symbols will appear somewhere in this enumeration, and some of the strings will constitute proper grammatical English names for certain real numbers. It does not matter that we have not made special arrangements for suffixes and superfixes to appear together with symbols: we do not need "x4" because the string "x to the power of 4" will be available, and so on. The symbol Σ itself is not really necessary, since we can always look out for the string of symbols "the sum of" and so on instead. The same will go for other mathematical symbols that we might have added: they could be replaced by strings of English words instead.

But for sure all possible English real number names will be there somewhere. In fact some numbers will have several, maybe even infinitely many, different names on the list: "the square root of 2", "the square root of an even prime number", "√2", if we have included that square root symbol, "the reciprocal of the sine of forty five degrees", and so on. Some apparent names will not actually work for one reason or another: for instance, "the root of the equation x squared minus 5x plus 6 equals 0" is not a name because there are two roots, 2 and 3: but "the larger root of the equation x squared minus 5x plus 6 equals 0" would be all right as it uniquely identifies only the real number 3.

There will be all kinds of other mysterious things among this countable series of strings: real number names in code, names of real numbers as yet unknown to mathematicians, in the unlikely event that we try to sift through them all. But in spite of multiple names for the same number, mystery names which do refer to a number but which we can't understand, every intelligible string of symbols in the English language will have its place, including all recognisable number names as well as the complete works of Shakespeare and Agatha Christie.

And yet, as Chaitin points out (op. cit., p109, p113), this vast array of names could still pick out only an infinitesimal few out of the whole expanse of real numbers between 0 and 1. For suppose that we go through our list of strings of letters and symbols until we find the first one that names a real number between 0 and 1 in proper grammatical English. Perhaps it will be the symbol ".1". Then imagine the real numbers between 0 and 1 as a line, the interval between 0 and 1 on a metre rule perhaps. We put a tiny mark on the position of .1, or whatever our first number is, just ε/2 of a metre wide, where ε is as small as we can make it. Then we continue with the list until we find the second name of a real number between 0 and 1, then the third name and so on and on. The second number receives a mark just ε/4 of a metre wide, the third ε/8 and so on. If one of our marks goes over another, or beyond the 0 or 1 mark on the rule, we just have to start again with an even smaller ε until everything is squeezed in properly. No matter how many marks we make, however, their total width can never exceed ε(1/2 + 1/4 + 1/8 + ...) or just ε. So if ε is infinitesimally small, the total width of the marks is infinitesimal, even though the rule is a whole metre long. That can be interpreted as showing that all the names account for no more than an infinitesimal few of all the real numbers: only an infinitesimal few have names of any kind. That is what Chaitin means by "Reals are un-nameable with probability one!", though it might be more accurate to say that the chance of coming across a nameable real is infinitesimally small, which seems slightly different.

I'm still puzzled why Chaitin didn't put forward as bald an argument as that. Is there something wrong with it that I have missed? Or does Chaitin's writing style make him seem more cautious than he actually is?