# Boltzmann RAM

Posted by Chris Warburton

Simulating every computable universe (the constructivist/intuitionistic version of Tegmark’s level 4) is actually remarkably easy. Each of those universes can be simulated by running some computer program (by definition of constructivism), so we can simulate all of them by “just” running every possible computer program!

## A simple existence proof

There’s actually a single, very simple program which will simulate all others, by diagonalising over program lengths and running times:

• Run all programs of length $1$ (i.e. programs `0` and `1`) for $1$ step each.
• Re-run all programs of length $1$, this time for $2$ steps. Then run programs of length $2$ (`00`, `01`, `10` and `11`) for $1$ step each.
• Re-run programs of length $1$ for $3$ steps; programs of length $2$ for $2$ steps; and run all programs of length $3$ (`000`, `001`, `010`, etc.) for $1$ step each.
• Keep going in this manner forever: each time re-running all previous programs for an extra step, then running all one-bit-longer programs for $1$ step.
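The loop above can be sketched directly. Here `dovetail` only produces the schedule of `(program, steps)` pairs; actually executing each bit-string would require a universal interpreter, which is assumed rather than implemented:

```python
from itertools import product

def dovetail(iterations):
    """Diagonalise over program lengths and running times: in
    iteration i, every bit-string of length l <= i is (re)started
    and run for i - l + 1 steps. Yields (program, steps) pairs;
    feeding each pair to a universal interpreter is assumed."""
    for i in range(1, iterations + 1):
        for length in range(1, i + 1):
            for bits in product("01", repeat=length):
                yield "".join(bits), i - length + 1
```

Iteration $1$ runs `0` and `1` for one step; iteration $2$ re-runs them for two steps, then runs the four two-bit programs for one step; and so on, exactly as listed above.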

If we wait long enough, this loop will eventually run any program for any number of steps: including simulations of our universe being run for long enough to reach the present day!

## A remarkable linear-time solution

The above algorithm is exponentially slow: each iteration runs twice as many programs as the last (since there are ${2}^{l}$ programs of length $l$), so reaching step $n+1$ of any program will take at least twice as long as reaching step $n$. That’s a lot of overhead, which will soon grind everything to a crawl, and makes this whole endeavour seem far-fetched.

However, Schmidhuber suggests we can avoid this in two ways:

• Firstly we can ‘spread out’ the programs by writing them in a prefix-free code. This requires more bits to encode our programs, reducing the number of programs of each length; i.e. there are fewer than ${2}^{l}$ programs of length $l$ when encoded in a prefix-free way.
• Then, instead of re-running with an extra step each time, we instead re-run with double the number of steps, i.e. running for $1$ step, then $2$ steps, then $4$ steps, then $8$ steps, etc.
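This doubling scheme can be sketched in a few lines, assuming some prefix-free set of programs is given (again, only the schedule is produced; running the programs is left to a hypothetical interpreter):

```python
def fast_schedule(programs, phases):
    """Schmidhuber-style scheduling: in phase i, each program p from a
    prefix-free set is restarted and run for 2**(i - len(p)) steps,
    so run lengths double from one phase to the next."""
    for i in range(1, phases + 1):
        for p in programs:
            if i >= len(p):
                yield p, 2 ** (i - len(p))
```

By the Kraft inequality, the total work in phase $i$ is at most ${2}^{i}$ steps, while a length-$1$ program has just been run for ${2}^{i-1}$ steps: a constant fraction, no matter how many phases have passed.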

This doubling has an interesting effect: the original algorithm took twice as long to reach step $n+1$ as to reach step $n$; yet this algorithm takes only twice as long to reach step $2n$ as to reach step $n$: that’s linear time! In other words, nothing will slow down over time: it’s as if we were running each program on a dedicated computer, yet we’re actually running all programs on a single machine!

The reason this works is that our changes (prefix-free encoding and exponentially-distributed running times) cause the allocation of runtime between programs to obey the Kraft inequality. Each program gets a constant fraction of our runtime (equivalent to running on a dedicated, albeit slower, machine). The downside is that those fractions get very small for larger programs: those of length $1$ will get $\frac{1}{2}$ of the runtime, programs of length $2$ will get $\frac{1}{4}$, and in general programs of length $l$ will get $\frac{1}{{2}^{l}}$. The use of a prefix-free code ensures that there are few enough programs of each length to avoid these fractions adding up to more than $1$.
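The Kraft inequality is easy to check numerically: for a prefix-free set of codewords (where no codeword is a prefix of another), the fractions ${2}^{-l}$ sum to at most $1$; taking every bit-string of every length, as the naive algorithm does, blows past that bound:

```python
def kraft_sum(codewords):
    """Sum of 2**-len(w): at most 1 for any prefix-free code."""
    return sum(2.0 ** -len(w) for w in codewords)

# Prefix-free: no codeword is a prefix of another.
assert kraft_sum(["0", "10", "110", "111"]) == 1.0
# Using *every* string of lengths 1 and 2 already exceeds 1,
# which is why the naive scheme cannot give constant fractions.
assert kraft_sum(["0", "1", "00", "01", "10", "11"]) == 2.0
```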

Finally, the overall speed of each program is only half of its allocated runtime: a program of (prefix-free encoded) length $10$ gets $\frac{1}{{2}^{10}}=\frac{1}{1024}$ of the runtime, but runs at only $\frac{1}{2048}$ of its normal speed. The reason is that we keep restarting everything. For example, for a program to reach step $100$ of its execution it must be restarted many times, and each restart wastes some of its allocated runtime: first it wastes $1$ step, then is restarted and run for $2$ steps (making $3$ wasted steps so far); then $4$ steps ($7$ wasted so far); and so on. Eventually it is run for $128$ steps, which is enough to reach the step $100$ we wanted; at that point it has wasted $1+2+4+8+16+32+64=127$ steps. In general, when we re-run a program for $S$ steps, we have already wasted (one fewer than) $S$ steps on it. Since we thus spend (just less than) twice as many steps to reach each point of a program’s execution, the restarting causes every program to run at around half the speed it otherwise would.
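The arithmetic of wasted steps can be verified with a few lines (the halving-of-speed claim follows because the total spent never exceeds twice the final run length):

```python
def steps_spent(target):
    """Total steps spent on one program, across restarts with doubling
    run lengths, by the time it first reaches step `target`."""
    run_length, spent = 1, 0
    while run_length < target:
        spent += run_length       # this run falls short: all wasted
        run_length *= 2           # next restart doubles the run length
    return spent + target         # the final run reaches the target

assert steps_spent(100) == 127 + 100  # wastes 1+2+...+64 = 127 first
```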

Whilst the slowdown of each program is exponential in its length, it is nevertheless constant over time. Hence if we start this program running, it will (gradually) spin up a simulation of every possible universe simultaneously (albeit with most running at an extremely slow rate).

## Implications

The “simulation hypothesis” has recently entered popular culture as a serious idea (building on previous, more fanciful incarnations like The Matrix). It asks whether our Universe was purposefully created as a simulation, inside some other “host” Universe (perhaps to investigate historical or counterfactual scenarios). That’s a nice philosophical idea, akin to thought experiments like the classic brain in a vat, but it is just as self-defeating: such scenarios are predicated on our inability to discern the difference between their postulated worlds, so they make no discernible difference to anything either way!

In contrast, any simulations run by the above algorithms are not purposeful: if run for long enough, they will simulate our Universe, and countless others; not because we were specially chosen, but precisely because we are not fundamentally different from any other computer program! Since there’s no postulated “host” to select and manipulate the rules, we can treat this setup in a scientific way; more akin to a Boltzmann brain.

The idea of a Boltzmann brain is that random movements in some collection of matter, like a gas cloud, may occasionally bring its constituents together in a way which is capable of thought. The usual argument against this as an explanation for our observations is that smaller arrangements of matter are exponentially more likely to occur by chance than larger ones. Hence, if we were such a Boltzmann brain, we would expect to see very little structure or order around us; yet we observe a vast Universe filled with structure.

### Boltzmann RAM

Rather than applying the Boltzmann brain idea to the structure that constitutes us, we can instead apply it to a structure that computes us; for example, by flipping random bits on a Turing machine tape. Again, it’s unlikely that an algorithm will arise by chance for simulating precisely us, or our Universe, due to how complex it is to describe: it’s exponentially more likely for smaller, simpler programs to arise.

However, we’ve seen above that there are small, simple programs that do simulate our Universe (albeit not precisely, since they also run every other Universe alongside). Such programs may well arise by chance, given a reasonable number of bit flips. In that case, we need to ask how likely our observations about the Universe would be, if there were so many simulations running at once. Since “disorder” takes more bits to encode, and the above algorithms allocate less runtime to longer programs, we would expect most minds to observe an ordered, structured Universe (so long as there’s enough complexity for such minds to arise at all)!

## Conclusion

Note that this hypothesis does not postulate some ‘higher level’ Universe that runs the computation of ours; it could instead be the case that such computation is just the underlying nature of reality. As an analogy, General Relativity assumes curved spacetime is the underlying nature of reality: that’s why Earth orbits the Sun, and why the motion can be calculated using tensor algebra. Yet that does not imply there must be some intelligent beings performing those calculations in order for the motion to occur: it just happens, all on its own. Likewise, it may be the case that computations “just happen”, and we can use complexity arguments to make falsifiable predictions about what’s more or less likely to happen.

To his credit, Schmidhuber followed this line of reasoning to arrive at a prediction that large-scale quantum computation will not be possible: precisely because the computational effort required to describe that behaviour would make Universes containing them very unlikely, compared to Universes which constrain quantum effects to more easily-calculated realms (perhaps taking shortcuts to arrive at classical solutions on the macro-scale). Whilst I doubt that will be the case, it’s certainly a novel perspective on cosmology; and yet another justification for trying to scale up our quantum computers!