Crafting Interpreters

(craftinginterpreters.com)

240 points | by tosh 23 days ago

15 comments

azhenley 23 days ago
The two most popular discussions of this fantastic book:
2020 with 777 points: https://news.ycombinator.com/item?id=22788738
2024 with 607 points: https://news.ycombinator.com/item?id=40950235
goodthink 23 days ago
Reading this book brought me a better understanding of "the expression problem" and the use of the visitor pattern as its solution. This led me to (finally) grok the use of Class _Hierarchy_ Inheritance[0] as a solution not requiring visitors. In Newspeak[1], classes can contain nested classes, so when you subclass a class, you inherit the nested classes as well. This blog post discusses the same feature affording Free Object Algebras [2].
[0] https://blog.bracha.org/primordialsoup.html?snapshot=Amplefo...
[1] https://newspeaklanguage.org
[2] https://blog.bracha.org/primordialsoup.html?snapshot=Amplefo...
chrysoprace 23 days ago
I've found this book to be a good way to learn a new language, because it forces you to do a bit of reading about various language features and patterns to create equivalent implementations. For languages that lack some of the features in Java, it can be tricky to learn how to apply similar patterns, but that's half the fun (for me).
incognito124 22 days ago
I just went through this book during the winter holidays. I just love the author's casual writing style and all the tiny jokes and puns they made.
I hope we get to see an "Add a type checker to Lox" sequel.
keyle 23 days ago
It's a great book, I bought the paper version first, but man it was too big and heavy for my liking, ended up buying a digital copy; much more practical for notes and search...
although I keep getting lost somewhere in the mountain :)
I also recommend munificent's other book about game programming patterns. Both are fun to read.
- flir 23 days ago
  Sometimes I get the spine guillotined off and replaced with a ring binding. Any print shop can do it for you, and you just lose the gutter plus a little margin. Easier to work with at a desk, and you can even split into two "books" if you feel it necessary.
  But that's only for books I don't want to keep, and Crafting Interpreters is definitely a keeper...
  - keyle 23 days ago
    Interesting idea. Thanks.
acedTrex 23 days ago
I have bought the print version of this 3 separate times to give as a gift; it's excellent.
- munificent 22 days ago
  Thank you for buying copies! :D
  - acedTrex 18 days ago
    Thank you for writing an awesome book!
Nora23 23 days ago
One of the best resources for learning compiler design. The web version being free is incredibly generous.
- fuzztester 23 days ago
  Compiler doesn't match the title of the book.
  - roetlich 23 days ago
    Well, most interpreters use a compiler internally, for example to compile to byte code. The book explains that as well, so I'd recommend just reading it: https://craftinginterpreters.com/a-map-of-the-territory.html...
    - fuzztester 22 days ago
      Thanks.
  - Me1000 22 days ago
    You should read the book! :)
    - fuzztester 22 days ago
      I had started but not come to that point yet. Mea culpa.
  - fuzztester 22 days ago
    Ignorant HN downvoting monkeyboys at it again. Caint call em men, hee hee, ho ho.
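A minimal sketch of roetlich's point above that interpreters often compile internally, for example to bytecode. It assumes a toy nested-tuple AST and a made-up stack instruction set; the names `compile_expr`, `run`, `PUSH`, `ADD` and `MUL` are illustrative and not taken from the book.

```python
def compile_expr(node, code):
    """Flatten a nested-tuple AST into stack-machine instructions (post-order)."""
    if isinstance(node, (int, float)):
        code.append(("PUSH", node))
    else:
        op, left, right = node          # e.g. ("ADD", 1, 2)
        compile_expr(left, code)        # operands first...
        compile_expr(right, code)
        code.append((op, None))         # ...then the operator
    return code


def run(code):
    """A tiny virtual machine: execute the instructions on a value stack."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
            continue
        right = stack.pop()
        left = stack.pop()
        if op == "ADD":
            stack.append(left + right)
        elif op == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# 1 + 2 * 3, written as a hypothetical tuple AST
bytecode = compile_expr(("ADD", 1, ("MUL", 2, 3)), [])
result = run(bytecode)
```

Here the "compiler" is only the flattening step and the "interpreter" is the loop in `run`, which is roughly the split the second half of the book makes at a much larger scale.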
papercrane 22 days ago
I love this book! I do wish there was a new edition that updated the version of Java used in the tree-walk interpreter. There have been some additions to the language, like sealed classes and exhaustive switches, that could really benefit the implementation.
- bbaron63 22 days ago
  It's a fun little exercise left to the reader to upgrade to current Java. It pretty much eliminates the need for his ad-hoc code generation tool.
  - wduquette 22 days ago
    Been there, did that, very much enjoy the result.
stevefan1999 23 days ago
Really I would love to know how to parse context-sensitive stuff like typedef, which will have "switched" syntax for some tokens. I would also like to know about things like "hoisting" in C++, where you can use the class and struct after the code inside the function too, but I just find it hard to describe them in a rigorous formal language and grammar.
Hacky solutions for PEG, such as adding a context stack, require careful management of the entry/exit points, but the more fundamental problem is that you still can't "switch" syntax, or you have to add all possible syntax combinations depending on the number of such stacks. I believe persistent data structures and transactional data structures would help, but I just couldn't find a formalism for that.
- remexre 23 days ago
  https://en.wikipedia.org/wiki/Lexer_hack
  Make your parser call back into your lexer, so it can pass state to it; make the set of type names available to it.
- luksenburg 22 days ago
  Another possible solution is the usage of functional parsers (e.g.: [0]) and making use of some form of the ‘do’ notation. Each step makes its result available to all subsequent parsers.
  [0] https://hackage.haskell.org/package/parsec
- torginus 22 days ago
  C/C++ has one of the worst-designed syntaxes; it's such a shame that entire families of the most popular languages ended up copying the same mistakes.
  I know it's no solace to you, but Rust and Go don't even have this problem AFAIK, and it's avoidable by careful consideration.
  - suspended_state 22 days ago
    I don't really know what you mean by "worst-designed syntax". Do you mean that the design process was bad, or that the result is bad?
    - torginus 22 days ago
      I meant exactly what the parent-comment pointed out - that C can't be parsed without a symbol table. Like the example on wikipedia:
      A * B;
      Which represents either a multiplication or a declaration of B as a pointer to A, depending on what the symbol table looks like. That means parsing C is impossible without these hacks, and you need to basically parse the whole file to build up this information.
      A lot of text editors that only support grammars can't parse this properly, and there are a ton of buggy C parsers in the wild.
      The issues that led to this were completely avoidable, and many languages like Pascal (which was more or less its contemporary), Go or Rust did avoid them. They don't lead to the language being more powerful or expressive.
      Calling it the 'worst' might be hyperbole, but given how influential C-style syntax has become, and how much C code is out there, these issues have affected a ton of other languages downstream.
      - suspended_state 20 days ago
        So you were criticizing the C language syntax, without considering the context which it was designed in.
        Just to give this context a little bit more substance, Pascal was designed to work on a mainframe which could address up to 4MB of RAM, with a typical setup of around 1MB (these aren't the actual figures: for the CDC-6600 the value is 128K words, but it had 60-bit words). These machines were beasts designed for scientific computation.
        The first C compiler was implemented on a PDP-11, which could handle up to 64KB of RAM, and had 16-bit words.
        I assume that these constraints had a heavy influence on how each language was designed and implemented.
        Note that I wasn't aware of all these details before writing this comment, I had to check.
        See: http://pascal.hansotten.com/niklaus-wirth/zurich-pascal-comp...
        Regarding the C compiler, it is likely that the first version was written in assembly language, which was later "translated" to C.
        An early version of the compiler can be found here: https://github.com/theunafraid/first_c_compiler and does look like assembly hand-converted to (early) C.
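A minimal sketch of the lexer hack remexre links to above, applied to the `A * B;` example from this subthread. It assumes a toy C subset with only `typedef int NAME;` and `X * Y;` statements and does no error handling; `Lexer`, `parse_program` and the token kinds are hypothetical names, and a real C front end is far more involved.

```python
import re

# One identifier/keyword or one punctuation character, after optional whitespace.
TOKEN_RE = re.compile(r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<punct>[*;]))")


class Lexer:
    """Hands out tokens on demand so it can consult state owned by the parser."""

    def __init__(self, source, type_names):
        self.source = source
        self.pos = 0
        self.type_names = type_names    # shared, mutable set of typedef names

    def next_token(self):
        match = TOKEN_RE.match(self.source, self.pos)
        if match is None:
            return ("EOF", "")
        self.pos = match.end()
        if match.group("punct"):
            return ("PUNCT", match.group("punct"))
        word = match.group("word")
        if word in ("typedef", "int"):
            return ("KEYWORD", word)
        # The "hack": the same spelling lexes differently depending on
        # which typedefs the parser has seen so far.
        kind = "TYPE_NAME" if word in self.type_names else "IDENT"
        return (kind, word)


def parse_program(source):
    type_names = set()
    lexer = Lexer(source, type_names)
    results = []
    while True:
        kind, text = lexer.next_token()
        if kind == "EOF":
            return results
        if (kind, text) == ("KEYWORD", "typedef"):
            lexer.next_token()               # base type, e.g. `int`
            _, alias = lexer.next_token()    # the new type name
            lexer.next_token()               # ';'
            type_names.add(alias)            # feedback from parser to lexer
            results.append(f"typedef {alias}")
        else:
            lexer.next_token()               # '*'
            _, right = lexer.next_token()
            lexer.next_token()               # ';'
            if kind == "TYPE_NAME":
                results.append(f"declare {right} as pointer to {text}")
            else:
                results.append(f"multiply {text} by {right}")


parsed = parse_program("A * B; typedef int A; A * B;")
# parsed == ["multiply A by B", "typedef A", "declare B as pointer to A"]
```

The lexer has to be pulled lazily, one token at a time. If the whole input were tokenized up front, the second `A` would already have been classified before the parser had seen the `typedef`.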
wduquette 22 days ago
Simply my favorite programming text of all time.
codr7 22 days ago
I'll just drop this here for those looking to get started on interpreters:
https://github.com/codr7/shi
And perhaps this for those leaning towards C:
https://github.com/codr7/hacktical-c
kunley 23 days ago
Part of the 2nd half of this book, translated to Go, became a skeleton for the BCL configuration language: https://github.com/wkhere/bcl
jokoon 23 days ago
I stopped reading when he started using the visitor pattern
- ceronman 23 days ago
  The visitor pattern is very common in programming language implementations. I've seen it in the Rust compiler, in the Java Compiler, in the Go compiler and in the Roslyn C# compiler. Also used extensively in JetBrains' IDEs.
  What do you have against this pattern? Or what is a better alternative?
  - high_na_euv 23 days ago
    Visitor is a code-heavy pattern that can be replaced by an elegant, readable switch with an exhaustive check, so all operations available in the "Kind" enum are covered.
    - wiseowise 22 days ago
      This wasn't available in Java at the time. You're free to rewrite it with pattern matching (like the book, quite literally, leaves as an exercise for the reader).
    - ceronman 22 days ago
      A switch or pattern matching approach is useful, but not practical for some cases. For example, there are cases where you are interested in only a single kind of node in the tree. For those cases the Visitor pattern is very helpful, while pattern matching is cumbersome because you have to match and check almost every node kind. That's why, for example, the Rust compiler still uses the visitor pattern for certain things, and pattern matching for others.
    - torginus 22 days ago
      Roslyn has the visitor pattern combined with the 'Kind' enumeration you mentioned. You can either choose to visit a SyntaxNode of a certain type, or override the generic version and decide what you want to do based on that enumeration.
      - high_na_euv 22 days ago
        C# doesn't have exhaustive switch over enums.
        It needs to get the "closed enum" lang. feature.
        - torginus 22 days ago
          Exhaustive enums (or type switches) are not a requirement, and are in fact harmful - imagine if they add a new kind of syntax node to the language: now your analyzer no longer compiles unless you add a default case, which is very easy to add in C# as well.
          - high_na_euv 22 days ago
            Unless you add default... or handle such a case, as expected.
            Ofc you can use this feature wrong and abuse the default case, but in general this is very good since it prevents you from missing places to add handling and screams at you at compile time instead of runtime.
        - neonsunset 22 days ago
          [dead]
    - wffurr 22 days ago
      Exhaustive switch with tail-calling makes for a very fast and readable interpreter.
- kevthecoder 23 days ago
  The bytecode interpreter in the second half of the book doesn't use the visitor pattern.
  - HarHarVeryFunny 23 days ago
    No, but his first "Tree-walk Interpreter" does - he builds an AST then uses the visitor pattern to interpret it.
    https://craftinginterpreters.com/representing-code.html#work...
    - etyp 22 days ago
      To quote the very first paragraph of the bytecode interpreter section[1]:
      > The style of interpretation it uses—walking the AST directly—is good enough for some real-world uses, but leaves a lot to be desired for a general-purpose scripting language.
      Sometimes it's useful to teach progressively, using techniques that were used more often and aren't as much anymore, rather than firehosing a low-level bytecode at people.
      [1] https://craftinginterpreters.com/a-bytecode-virtual-machine....
      - HarHarVeryFunny 22 days ago
        Sure, I'm not criticizing it.
        He doesn't actually build on this though, but rather goes back to a single-pass compiler (no AST, no visitor) for his bytecode compiler.
  - jokoon 23 days ago
    the parser does
    - ceronman 23 days ago
      The parsers in Crafting Interpreters do not use the visitor pattern. The visitor pattern is used when you already have a tree structure or similar. The parser is what gives you such a tree structure, the AST. When you have this structure, you typically use the visitor pattern to process it for semantic analysis, code generation, etc.
    - tonyedgecombe 23 days ago
      I’ve only glanced at the second part but I don’t remember that being the case.
- volemo 23 days ago
  What’s bad about the visitor pattern? /gen
  - cfors 23 days ago
    https://grugbrain.dev/
    grug very elated find big brain developer Bob Nystrom redeem the big brain tribe and write excellent book on recursive descent: Crafting Interpreters
    book available online free, but grug highly recommend all interested grugs purchase book on general principle, provide much big brain advice and grug love book very much except visitor pattern (trap!)
    Grug says bad.
    In all seriousness, the rough argument is that it's a "big brain" way of thinking. It sounds great on paper, but is often times not the easiest machinery to have to manage when there are simpler options (e.g. just add a method).
    - not-a-juggler 22 days ago
      https://news.ycombinator.com/item?id=44304648
      Grug doesn't elaborate much, but here's the author's take in slightly more detail.
- fuzztester 23 days ago
  Why?
rohitpaulk 23 days ago
In case anyone finds it useful, we (CodeCrafters) built a coding challenge as a companion to this book. The official repository for the book made this very easy to do since it has tests for each individual chapter.
Link: https://app.codecrafters.io/courses/interpreter/overview
- mi_lk 23 days ago
  Not sure why this ad (access needs paid membership) is the top comment
raymond_goo 23 days ago
Crafting Interpreters is the one thing that LLMs can do really really well, because it is so easy to define and test.
Here is a new Lua interpreter implemented in Python:
https://github.com/rhulha/MoonPie
And here is a new language:
https://github.com/rhulha/EasyScript
- nicoburns 23 days ago
  > Because it is so easy to define and test
  Probably also because there are 100+ implementations for it to copy from
- wiseowise 22 days ago
  LLMs can write much better comments than you do, but for some reason you continue to write them. Why?
- ramon156 23 days ago
  What even is the point of that? The whole point of the book is to get a sense and mindset of crafting compilers.
- HarHarVeryFunny 23 days ago
  Is MoonPie your project? Have you written up anything about your experience and process of creating it?