Two Kinds of Bootstrapping

Summary: I like languages with a small core that is extensible. The languages tend to be weird and require less code to bootstrap.

I know of two ways to bootstrap a language.

The first way is probably more traditional. I'll call the first way Type

  1. In Type 1, you write a bare-minimum compiler for your language in a host language. So maybe you write a Lua compiler in C. Then you write a Lua compiler in Lua. Then you compile your compiler. Now you have a compiler, written in Lua. You can add to it and modify it without ever having to touch the C code again. You have the advantage of writing the features of your language (Lua) in a higher-level language (Lua). And finally, as you add features to your compiler, you can use those to add more features. There's some leverage.

I like the second way better. I'll call it Type 2. In Type 2, you write a small, powerful set of abstractions in the host language. For instance, you write an object system in C, a stack and dictionary in assembler, or lexical closures in Java. Then you write a compiler that targets those abstractions. If the abstractions are chosen correctly, your compiler is done. You can begin building abstraction on top of abstraction without touching the compiler.

There are a few things to note:

  1. Type 2 languages (Lisp, Smalltalk, FORTH) tend to be weird because they were birthed in a different way. The abstractions, though powerful, are often raw.

  2. Type 2 languages can be bootstrapped faster. The core is often much smaller than a full-featured compiler.

  3. Type 2 languages tend to require less code in general. I guess it's because you're writing most of it in a language that is compounding leverage.

  4. Type 2 languages are more easily ported, since all you have to do is rewrit e the core. Type 1 languages, depending on how they are built, can require you to re-bootstrap or write a cross-compiler.

In the end, I believe that both Type 1 and Type 2 are viable options for language-building. I prefer Type 2. If Type 2 intrigues you, you should learn Lisp (or FORTH or Smalltalk). I recommend the LispCast Introduction to Clojure videos course.