Bootstrapping

Crystal is a cool new language that I like. One thing that’s neat about it is that even the compiler (as opposed to the runtime libraries) is written in Crystal. That is, the compiler is written in the same high level language that the compiler is intended to compile! As an aside, this is really powerful and important for the future of the language, because of the way that it enables OOP programmers to contribute to the language core. Compare with MRI Ruby, which is written in C, so that you have a divide between core developers who can follow what’s going on in the C code and everyone else.

Anyway, I intend to write more about Crystal later, but here’s the quick point. A version of Crystal that has the compiler written in Crystal has to be compiled… by Crystal. So the way this works is that Crystal version 0.9.0 is compiled by a precompiled binary of Crystal version 0.8.0. If you install crystal-lang from Brew you will see this happening: it pulls down a tar file of 0.8.0 in order to compile 0.9.0.

Obviously this leads to a chicken and egg regress. What compiled 0.8.x? 0.7.x., and so on. So where did 0.1.0 come from then? It came from Ruby! That is, Crystal’s compiler used to be written in Ruby, before Crystal learned how to bootstrap itself.

It might seem that the regress stops there, at the point before the Crystal bootstrap, but MRI Ruby’s C code has to be compiled by something as well, for example the gcc compiler. And how is a compiler like gcc compiled? It is bootstrapped in another regress that goes back to an assembler. Honestly I get fuzzy at this point, but my understanding is that at some point in the past you arrive at humans physically etching circuitry or manually feeding in punch cards to machines. That is, we have to start the sequence before you can get to programs that generate other programs and machines that build other machines. And of course humans bootstrap each other, down through the generations.

Anyways, I know I just recapitulated the concept of the historical regress, from which Aristotle inferred the existence of a Prime Mover. What’s striking for me though is how this account of bootstrapping conflicts with the dominant sense in computing of infinite, cheap, meaningless reproducibility. Every program and document is capable of being reproduced millions of times across the Internet. No particular bit or line of code residing on a particular computer is special. But bootstrapping shows how all code is indebted to a certain lineage, even as it is also potentially free from that legacy to develop in new directions. Well, all the parallels and analogies flow from there.