How Do We Know the Answer Is Not 43?

Note: This is a guest post at recomputation.org by two hyper-intelligent pandimensional little white mice,
Benjy Mouse and Frankie Mouse.

In all of time and space, the most famous abstract to a scientific paper is that written by
Fook, Loonquawl, Lunkwill and Phouchg:

“In this paper we show that The Answer to the ultimate question of life, the universe, and everything is 42. The Answer was found through
computational experimentation requiring 7.5×10⁶ years (wallclock time) of the Deep Thought computer. As an interesting avenue for future research we propose investigations into what the ultimate question of life, the universe and everything actually is. We propose designs for a computer called Earth to calculate this question and estimate the required wallclock time for its computation at 10⁷ years.”

In this post we ask the following simple question: How do we know The Answer is not 43?
To our great surprise, we have concluded that we have no way of knowing, due to lax practice by the original experimenters. We suggest that Fook, Loonquawl, Lunkwill and Phouchg’s paper should be retracted. And we report on actions we are taking to terminate an ongoing computational experiment that we now regard as worthless.

Saying that The Answer is not known is radical. More than 300,000 generations of schoolchildren have been taught by computer teachers that The Answer is 42. But we feel we must be open about this.
Without confidence that the answer is 42, we cannot understand The Question when it is found.

Some background is necessary on the elementary principles of reproducibility in science, and especially computational science.
We are in a fortunate position to study these questions. The computer Earth was indeed built following the designs laid out by Deep Thought. (Not the least of the scientific misdemeanours of Fook et al was not crediting Deep Thought with the design but claiming it as their own.) The Earth has been computing The Question for almost 10 million years, and we are the current directors of the research project. As is well known, organic beings form part of the Earth’s computational matrix, so we programmed a small sample of them to consider issues of reproducibility in computational science. For convenience we will refer to these entities by the variable names we used in our programming: by convention we use camelCase, named for the Arcturan megaCamel.

First we programmed karlPopper to investigate the general question of reproducibility. Amongst his conclusions was: “non-reproducible single occurrences are of no significance to science”. Clearly the 7.5 million year experiment producing The Answer was a single occurrence. The question is not whether it has been reproduced (it hasn’t); the question is whether or not it is reproducible in principle. Since it was a computational experiment we next programmed some agents to consider any special issues in that domain.

We found some interesting points in the logs of caroleGoble. She clarifies the meaning of
different words in this area, crediting also chrisDrummond and rogerDPeng.
She picks out the words repeat, replicate, reproduce and reuse. Finally,
a variable ianGent suggests the word recompute. We will examine each of these words in turn and how
it relates to The Answer.

  • Repeat: To repeat an experiment is to do the same experiment again using the same equipment. We have no report that Deep Thought was asked to calculate The Answer again: it seems reasonable to assume it wasn’t. Nor do we have any idea where Deep Thought is now, or whether it still exists. The Answer experiment cannot be repeated.

  • Replicate: To replicate an experiment is to do the same experiment in a different lab to the original, but usually following the methodology of the original as closely as possible. This does mean that methodological mistakes will be duplicated,
    leading to criticism of this method. Sadly,
    the methodological description of the original designers of Deep Thought 1.0 is somewhat underspecified.
    We quote it in full: “We turned it on and asked it to compute the answer to the ultimate question of life, the universe and everything.”
    This is somewhat short of the ideal, in which a second group of scientists could use archived blueprints to build a new Deep Thought
    and follow detailed methodological notes on running the experiment.
    One might argue that the scientific standards of an earlier age were laxer than now. However, this contention does not stand up when one considers that Deep Thought 1.0 was the second most powerful computer in time and space. One can only conclude that a certain lack of time pressure hit the authors: after all, with millions of years before the paper could be submitted, it might not have seemed very urgent to write that section on any particular day. It is a great pity that this was the case. But the result is that The Answer experiment can never be replicated.

  • Reproduce: To reproduce an experiment is to do the same experiment, but using a new experimental design. The more independent the experimental methodology is from the original, the better. This is a rich form of reproducibility because, ideally, it tests the underlying scientific question, rather than whether an experiment did what it said it did. The absence of the original blueprints is therefore not an obstacle to reproduction. Yet again, we are unable to find any record of The Answer experiment being reproduced. It may be possible to reproduce this experiment, and we propose to do so. Unfortunately, the lack of detailed reporting of the original experiment means that we will have to reinvent many techniques ourselves. This will cause us a slight delay, which we estimate at 2.5 million years.

  • Reuse: The most valuable element of reproducibility is reuse. In a sense this may not be reproducibility at all, because the goal is not to reproduce the same experiment. The idea is to reuse the scientific insights and results behind a given experiment. For example, if the Ultimate Answer is 42, perhaps we can create an experiment to test whether or not the Penultimate Answer is 41. Reuse is a rich form of reproducibility because it is the most likely to extend knowledge in new directions. Its only downside is that it is the least likely form to find mistakes in previous experiments. We are not aware of reuse of the Deep Thought experiment, except in the construction of the Earth.

  • Recompute: The word recompute has been used by ianGent to mean an even stricter form of replication, in which a virtualisation of an experiment is stored and can then be rerun at a later date. The advantage is that, in principle, the hard work of replication has been eliminated by virtualisation, making recomputation almost trivial (if time-consuming in the case of The Answer).
    Recomputations on Earth are so easy to describe that instructions can be given in less than 140 characters.
    The disadvantage is that if one simply does a recomputation, all one learns is that the result stated can be obtained again, or not.
    Naturally in the case of The Answer, this on its own would be a massive advance on the current status, but is impossible since Deep Thought was never virtualised.
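
A bare recomputation of the kind described in the last bullet can be sketched in a few lines of shell. This is only an illustration under stated assumptions: run_experiment is a hypothetical stand-in for a virtualised experiment (no such image of Deep Thought exists), and recorded_answer is simply the result quoted in the abstract.

```shell
# Hypothetical stand-in for rerunning a stored, virtualised experiment.
# A real recomputation would boot the archived image instead.
run_experiment() {
    echo "Forty-two"
}

# The published result we want to check (taken from the abstract).
recorded_answer="Forty-two"

# Rerun the experiment and capture what it prints.
recomputed_answer="$(run_experiment)"

# A bare recomputation yields exactly one bit of information:
# the stated result was obtained again, or it was not.
if [ "$recomputed_answer" = "$recorded_answer" ]; then
    verdict="recomputed"
else
    verdict="not recomputed"
fi
echo "$verdict"
```

Note that the check says nothing about whether the answer is right, only whether the same output comes out again.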

However, the apparent poverty of recomputation is misleading. It has the potential for much greater riches. With a virtualised experiment one can also experiment from within. For example,
we have no idea what the code that Deep Thought ran was. One’s confidence in The Answer would be shattered if one was able to log in and see this code:

sleep 236677140000000
echo "The Answer to the Great Question ..."
echo "Of Life, the Universe and Everything ..." 
echo "Is ... " 
echo "Is ... " 
echo "Forty-two"

One can argue that Deep Thought reported that it had checked the answer very thoroughly. Unfortunately this gives us little added value,
since it remains possible that the program being run was

sleep 236677140000000
echo "The Answer to the Great Question ..."
echo "Of Life, the Universe and Everything ..." 
echo "Is ... " 
echo "Is ... " 
echo "Forty-two"
sleep 30
echo "I checked it very thoroughly and that quite definitely is the answer"
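
For readers wondering about the argument to sleep in both listings: it appears to be 7.5 million years expressed in seconds. The sketch below checks that arithmetic, assuming (our assumption, not something the listings state) a mean Gregorian year of 365.2425 days and a shell with 64-bit integer arithmetic.

```shell
# Assumption: one year is the mean Gregorian year of 365.2425 days.
seconds_per_year=31556952     # 365.2425 days * 86400 seconds per day
years=7500000                 # 7.5 million years, from the abstract

# Requires 64-bit shell arithmetic (the product exceeds 2^31).
seconds=$(( years * seconds_per_year ))
echo "$seconds"
```

Under those assumptions the script prints 236677140000000, matching the sleep duration above.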

So we can see that recomputation, like the other 4 Rs, has advantages and would lead to greatly improved confidence in The Answer if it was available. Sadly, no form of reproduction has ever been performed on The Answer, and none is possible for (at our estimate) 2.5 million years. Given the extraordinary level of confidence one should require in a result of this magnitude, we have reluctantly come to the conclusion that the classic Fook et al paper should be retracted.

We now turn to the follow-up experiment of which we have the privilege of being the current directors: the Earth experiment to find The Ultimate Question. A major concern is that it was designed as the output of an irreproducible experiment. If Deep Thought was flawed in some way, perhaps the Earth is too? Further problems have arisen during execution.
For example, there have been random crashes. There was a known crash of the Golgafrincham Ark fleet ship B. Apart from causing minor surface damage, this also infected the Earth with rogue computational elements - the so-called “Telephone Sanitisers” virus - and this may have
affected the results.
The evidence for this is the strange intermediate result obtained at that point, that the question was “What do you get when you multiply six by nine?” However, as this was an intermediate result,
it may be that there was no error.

The original study has not been reproduced. The follow-up study was designed based on the output of the first study, and suffered from the telephone sanitiser intervention, and possibly other faults.
We can conclude that the experiment currently being run on “Earth” will not produce a meaningful scientific result. Although the experiment has been running for a little time
(just under 10 million years elapsed),
we feel there is no point wasting extra resources on the additional computation. Accordingly we have arranged with the Vogon Destructor Fleet
for the termination of the experiment five minutes before its completion.

The situation is greatly improved with the follow-up experiment.
Due to the excellent digital curation practices of the Magrathean planet construction industry, we
are able to reconstruct a duplicate computer of Earth and have commissioned a recomputation of the Earth experiment. While this is a risky
approach (due to the unknown quality of the Earth designs), the longer computation time means that we conveniently have the 2.5 million years we require to reproduce the Deep Thought experiment
and still have it produce results at the same time.
There may be some minor variants in the computer design: for example there will be fjords in Africa instead of Norway.

We had thought that an experiment on the scale of Deep Thought or Earth would be impossible to run in an emulator. Fortunately, we have discovered that
Megadodo Publications has an emulator capable of
simulating the entire universe within a single office, so we hope to work with them to virtualise future large scale experiments.

We therefore can confirm that we plan to lodge our new Earth experiment
at recomputation.org. Of course we mean the intergalactic equivalent, since the version on Earth will be destroyed in a few minutes.