Hacker Read top | best | new | newcomments | leaders | about | bookmarklet login

I'm writing a chapter for AOSA on something like this: a self-hosting compiler of a subset of Python to Python bytecode. It'll present the full code (about the same length) and try to explain it well for people not into compilers yet, but in the meantime I recommend the Wirth book.

What's your favorite shorter intro? I'd especially like to reference other educational compilers that are self-hosting.



sort by: page size:

I second the recommendation. Wirth's compiler book is a wonderful, concise and very readable introduction.

My favorite introductory compiler book is Niklaus Wirth's "Compiler Construction", which implements a compiler end-to-end with full source code, in a bit more than 100 pages: https://inf.ethz.ch/personal/wirth/CompilerConstruction/inde...

'Compiler Construction' by Niklaus Wirth[1] is a pretty good book too. It's got the hands-on feel of Crenshaw's book with a quick but, not too superficial, introduction to theory. The book is little more than 100 pages long.

The compiler is written in Oberon however, which is also the source language (actually the source language is Oberon-0, a subset) but, Oberon is super simple and can be learned on the go.

[1] https://www.inf.ethz.ch/personal/wirth/CompilerConstruction/...


The Wirth book gives a fairly quick overview that other compiler books can expand on. If you started with (say) the dragon book, you'll be wading through 200 pages about automata theory and parsing algorithms before you even get to anything else. Wirth provides a twenty-page overview, recommends starting with recursive descent parsing, and just moves on to the next issue.

I recommend starting with the Wirth book, then Andrew Appel's _Modern Compiler Implementation in ML_. And don't forget about the exercises!


I wrote a simple self-hosting compiler to showcase Niklaus Wirth's approach to writing compilers: https://github.com/lboasso/oberonc

Single pass compilers have their advantages.


As I wrote elsewhere, I use this tiny compiler as an introduction to the course, which goes into much more details and of course covers how to compile realistic programming languages (with multiple types of data, variables, conditional and loop control structures, functions, pointers, list and arbitrary structures, etc.) down to assembly code.

Thanks for all your interesting links :).


This is one thing I find funny about this book. Despite the name, it's a great intro to compilers in general.

If you're interested in more Wirth I recommend his book, Compiler Construction. It's a great small piece on building a language from scratch.

My first introduction to this subject was “Let’s Build a Compiler” by Jack Crenshaw. Even though it’s quite old now, I still highly recommend if you’re looking for a gentle, no-frills introduction: https://compilers.iecc.com/crenshaw/

I enjoyed Kyle James' "The Super Tiny Compiler."

https://github.com/jamiebuilds/the-super-tiny-compiler


Writing a toy compiler, the entire pipeline end-to-end, for a language no matter how small. I can't think of another exercise that covers so many good issues to solve, and good practices for solving them.

The two I like to recommend are:

- PLAI: https://cs.brown.edu/courses/cs173/2012/book/

- Essentials of Compilation: https://wphomes.soic.indiana.edu/jsiek/


Selected Excerpts:

"Write Your Own Compiler

The tiniest compiler book I have ever written. It covers all steps in the compilation process from high level language (T3X) to executable (ELF). Lots of diagrams, lots of code, zero theory. The tour through the compiler itself spans less than 100 pages. All code in the book is provided under the CC0 license (public domain). It can be downloaded below."

Practical Compiler Construction

This book offers a tour through the full compiler for a clean and sane subset of the C programming language (C89), covering lexical analysis, parsing, semantics, code generation, optimization, and runtime support, including lots of clarifying annotations and diagrams. The SubC compiler discussed in the book is in the public domain and can be downloaded below.

Kilo LISP, a kilo byte-sized LISP system

Kilo LISP is a purely symbolic LISP system that runs in 64K bytes of memory. It is written in C89 and compiles fine using SubC (below), Turbo C, or any modern C compiler. The code is about 25K bytes of comprehensible C and LISP.

The SubC Compiler

SubC is a clean, fast, and simple compiler for a subset of C89 that can compile itself on various BSDs, Linux, Windows, and other systems. It also cross-compiles to DOS. Its code is in the public domain.

The T3X Compilers

T3X is a small, portable, procedural, block-structured, recursive, almost typeless, and to some degree object-oriented programming language. It targets the Tcode machine, the 8086 under DOS, and the 386 and Alpha processors under Unix. T3X9 is a tiny, block-structured, procedural language. Its compiler can compile itself in the blink of an eye (0.05s on a 750MHz notebook). It currently generates ELF executables for FreeBSD-x86. T3X9 is a subset of T3X.

The XT3X Compiler and Virtual Machine

A T3X9 compiler and Tcode machine extended with functions for X11 graphics and OSS sound. Intended for writing simple video games as they were ubiquituos in the 1970's and 80's."

Disclaimer: I am slightly biased towards small compilers -- but, that being said, there's a ton of other fascinating things on this page...


On the related note, what would you guys recommend as an introduction to compilers, say for beginners?

This looks great! I’m taking a different approach than this compiler, but this looks like a good resource. Thanks!

Great article. I've been interested in compilers since my college class and this scratches the itch.

It's great to see this available and updated. It was one of my favorite compiler books growing up because it's so small and straight-forward. I feel it's a much better starting point for someone actually interested in building a compiler than say, the Dragon book.

Even more immediate and hands-on is Abdulaziz Ghuloum's "An Incremental Approach to Compiler Construction" paper: http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf (which chides Wirth for oversimplifying)


I'm very interested in learning to write compilers, starting with just a simple made up language.

Can anyone share resources? I prefer bound books, but anything is helpful! I've been reading the source of TempleOS for tips as well.


As very much a non-expert in compiler design, I found two books to be helpful introductory reads (neither are free though). I'm not sure whether someone proficient in the field would recommend them as my knowledge in the area is small, but I enjoyed them.

- Writing An Interpreter In Go by Thorsten Ball (https://interpreterbook.com)

- Compiling to Assembly from Scratch by Vladimir Keleshev (https://keleshev.com/compiling-to-assembly-from-scratch)

The first produces an interpreter in Go for a custom language, but as simscitizen says there's a lot of crossover with compilers.

The second creates a compiler written in Typescript that compiles from a minimal form of Typescript into ARM assembly. It's not cheap but is good.


It's very simple: Simplicity.

A Wirth-style compiler for a Wirth-style language can be written fairly rapidly by a single person, and the languages, compilers and runtime systems can be easily understood. The Oberon-07 language report, including the grammar, is 17 pages.

If simplicity doesn't matter to you, then this won't appeal to you, and that's fine. I mostly use Ruby these days, whose parser (in MRI anyway) is likely longer than the whole Oberon-07 language report alone (I haven't measured, but ca. Ruby 1.8.6 the parser alone was 6k lines+), so I've surrendered the simplicity, but it still stands out to me as an ideal and inspiration.

Wirth was perhaps too ascetic in his insistence on keeping the compilers minimal, but you can fine less extreme versions carried on in extensions done in the PhD dissertations of students in his group.

next

Legal | privacy