Implementing a clear room Z80 / ZX Spectrum emulator with Claude Code

(antirez.com)

32 points | by antirez 2 days ago

16 comments

ralferoo 3 hours ago
The problem is that it will have been trained on multiple open source spectrum emulators. Even "don't access the internet" isn't going to help much if it can parrot someone else's emulator verbatim just from training.
Maybe a more sensible challenge would be to describe a system that hasn't been emulated before (or hasn't had emulator source released publicly, as far as you can tell from the internet) and then try it.
For fun, try using obscure CPUs giving it the same level of specification as you needed for this, or even try an imagined Z80-like but swapping the order of the bits in the encodings and different orderings for the ALU instructions and see how it manages it.
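One hypothetical way to read "swapping the order of the bits in the encodings" is to mirror each opcode byte before it reaches an otherwise standard decoder. The sketch below assumes that interpretation; `reverse_bits8` is an illustrative name, not something from the article.

```c
#include <stdint.h>

/* Mirror the bit order of one opcode byte: bit 7 <-> bit 0, bit 6 <-> bit 1, ...
 * Hypothetical helper for building a "Z80-like" encoding variant. */
static uint8_t reverse_bits8(uint8_t b) {
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4)); /* swap nibbles */
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2)); /* swap bit pairs */
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1)); /* swap adjacent bits */
    return b;
}
```

On a real Z80, 0x78 encodes LD A,B (01 111 000); in the mirrored variant the same instruction would be fetched as 0x1E.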
- throwa356262 10 minutes ago
  I think you are onto something here.
  I tried creating an emulator for a CPU that is very well known but lacks working open source emulators.
  Claude, Codex and Gemini were very good at starting something that looked great, but all failed to reach a working product. They all ended up in a loop where fixing one issue caused something else to break, and they could never get out of it.
  - antirez 2 minutes ago
    Please tell me what CPU it is. I would give it a try. I strongly doubt that a very well documented CPU can't be emulated by writing the code with modern AIs.
- abainbridge 1 hour ago
  > try using obscure CPUs
  I tried asking Gemini and ChatGPT, "What opcode has the value 0x3c on the Intel 8048?"
  They were both wrong. The datasheet with the correct encodings is easily found online. And there are several correct open source emulators, eg MAME.
  - yomismoaqui 29 minutes ago
    If the LLM doesn't have a websearch tool your test doesn't make any sense.
    An LLM by itself is like a lossy image of all the text on the internet.
    - deniska 20 minutes ago
      Just some more parameters, and it would overfit that specific PDF too.
- jsheard 19 minutes ago
  Related: https://georggrab.net/content/opus46retrieval.html
  Memorization of existing solutions can be the difference between 90% and 0% success rates.
- PontifexMinimus 1 hour ago
  > try using obscure CPUs
  Better still invent a CPU instruction set, and get it to write an emulator for that instruction set in C.
  Then invent a C-like HLL and get it to write a compiler from your HLL to your instruction set.
- dist-epoch 3 hours ago
  If you did that, comments would be "it's just a bit shuffle of the encodings, of course it can manage that, but how about we do totally random encodings..."
  - ralferoo 2 hours ago
    That's true, but I still think it'd be an interesting experiment to see how much it actually follows the specification vs how much it hallucinates by plagiarising from existing code.
    Probably bonus points for telling it that you're emulating the well-known ZX Spectrum, then describing something entirely different, and seeing whether it just treats that name as an arbitrary label or whether it significantly influences its code generation.
    But you're right of course, instruction decoding is such a relatively small portion of a CPU that the differences would be quite limited if all the other details remained the same. That's why a completely hypothetical system is better.
le-mark 3 minutes ago
Who else had AI implement an emulator? Raises hand. A 6502 emulator in JavaScript was my first Gemini experiment.
stevekemp 28 minutes ago
I grew up with the Spectrum, and wrote a CP/M emulator a while back. I'd be curious to see how complete it would get.
I struggled a lot with some complex software, which worked on some emulators and failed on others (and mine).
For example one bug I had, which is still outstanding, relates to the Hisoft C compiler:
https://github.com/skx/cpmulator/issues/250
But I see that my cpm-dist repository is referenced in the download script so that made me happy!
It's great to see people still using CP/M, writing software for it, and sharing the knowledge. Though I do think the choice to implement the CCP in C, rather than using a genuine one, is an interesting one, and a bit of a cheat. It means that you cannot use "SUBMIT" and other common-place binaries/utilities.
- antirez 1 minute ago
  Thank you for your work on CP/M, Steve!
jaen 1 hour ago
There isn't any attempt to falsify the "clean room" claim in the article - a rational approach would be to not provide any documents about the Z80 and the Spectrum, and just ask it to one-shot an emulator and compare the outputs...
If the one-shot output resembles anything working (and I am betting it will), then obviously this isn't clean room at all.
- antirez 1 hour ago
  You didn't read the full article. The last paragraph talks about this specifically.
itomato 38 minutes ago
All the design hints required for this or any other type of agentic "set it and forget it" development are interesting to me, because they enable the result but also lock in less-than-desirable results that exhibit a miss "like simulating a 2Mhz clock".
What if Agents were hip enough to recognize that they have navigated into a specialized area and need additional hinting? "I'm set up for CP/M development, but what I really need now is Z80 memory management technique. Let me swap my tool head for the low-level Z80 unit..."
We can throw RAGs on the pile and hope the context window includes the relevant tokens, but what if there were pointers instead?
xcf_seetan 1 hour ago
I had Claude make a quad-core 32-bit Z80 just for fun.
<https://pastebin.com/Z2b82LHG>
- klelatti 22 minutes ago
  Fascinating, but I'm not sure how these are consistent?
  - Based on classic Z80 architecture by Zilog - Inspired by modern RISC designs (ARM, RISC-V, MIPS)
  - throwa356262 6 minutes ago
    Z80 is CISC. This looks like a MIPS.
    Funny enough, there is a 32-bit version of Z80 called Z380.
avadodin 3 hours ago
So what you're saying is that it's not just the machine-readable documentation built over decades of the officially undocumented behavior of Z80 opcodes—often provided under restrictive licenses—it's also the "known techniques and patterns" of emulator code—often provided under restrictive licenses.
airza 2 hours ago
You use clean room everywhere in the article and clear room in the title. Is this on purpose?
- lazide 2 hours ago
  Literally nothing about it is either, either.
  - rustyhancock 2 hours ago
    Yes for a moment I thought clear room might mean something else for LLMs.
    Essentially they can't do clean room anything!
    You might as well hire the entire former mid level of a business's programming team and claim it's clean room work.
    - steve1977 2 hours ago
      Windows NT is not VMS! Trust me!
      - rustyhancock 1 hour ago
        Had to Google this but I do love a deep cut reference!
        https://www.itprotoday.com/server-virtualization/windows-nt-...
jlarcombe 39 minutes ago
How on earth does this count as "clean room" in any way, when many open-source Z80 emulators will without doubt have been part of its training data?
UltraSane 17 minutes ago
It is "clean room"
themafia 1 hour ago
in spectrum.c
> Address bits for pixel (x, y):
> * 010 Y7 Y6 Y2 Y1 Y0 | Y5 Y4 Y3 X7 X6 X5 X4 X3
Which is wrong: it's X4-X0. The comment does not match the code below.
> static inline uint16_t zx_pixel_addr(int y, int col) {
It computes a pixel address with 0x4000 added to it only to always subtract 0x4000 from it later. The ZX apparently has ROM at 0x0000..0x3fff necessitating the shift in general but not in this case in particular.
This and the other inline function next to it for attributes are only ever used once.
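For reference, a minimal sketch of the two address computations under discussion, assuming the standard 48K layout with the bitmap at 0x4000 and the attributes at 0x5800. The function names are hypothetical and this is not the code from spectrum.c.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute address of the bitmap byte for scanline y (0..191), byte column col (0..31).
 * Address bits: 010 Y7 Y6 Y2 Y1 Y0 | Y5 Y4 Y3 C4 C3 C2 C1 C0 */
static uint16_t pixel_addr_sketch(int y, int col) {
    assert(y >= 0 && y < 192 && col >= 0 && col < 32);
    return (uint16_t)(0x4000
        | ((y & 0xC0) << 5)   /* Y7..Y6 -> address bits 12..11 */
        | ((y & 0x07) << 8)   /* Y2..Y0 -> address bits 10..8  */
        | ((y & 0x38) << 2)   /* Y5..Y3 -> address bits 7..5   */
        | col);               /* column -> address bits 4..0   */
}

/* Absolute address of the attribute byte covering the same 8x8 cell. */
static uint16_t attr_addr_sketch(int y, int col) {
    assert(y >= 0 && y < 192 && col >= 0 && col < 32);
    return (uint16_t)(0x5800 + (y >> 3) * 32 + col);
}
```

The byte column is the pixel x divided by 8, so the low five address bits can be labelled either as bits 7..3 of the pixel x or as bits 4..0 of the column, depending on which argument the function takes.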
> During the 192 display scanlines, the ULA fetches screen data for 128 T-states per line.
Yep.. but..
> Instead of a 69,888-byte lookup table
How does that follow? The description completely forgets to mention that it's (192 display lines + 64 + 56 border lines) * 224 T-states = 69,888.
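The 69,888 figure is the frame length in T-states, and a per-T-state delay can be computed with a division instead of a table of that size. A minimal sketch, assuming 48K timing in which contention starts at T-state 14335 and follows a repeating 6,5,4,3,2,1,0,0 pattern; those two values and the name `contend_delay_sketch` are assumptions, not taken from the repository.

```c
#include <assert.h>
#include <stdint.h>

enum {
    DISPLAY_LINES    = 192,
    TOP_LINES        = 64,    /* lines before the display area */
    BOTTOM_LINES     = 56,    /* lines after the display area */
    TSTATES_PER_LINE = 224,
    FRAME_TSTATES    = (DISPLAY_LINES + TOP_LINES + BOTTOM_LINES) * TSTATES_PER_LINE,
    CONTEND_START    = 14335, /* assumed first contended T-state (48K) */
    FETCH_TSTATES    = 128    /* ULA fetch window per display line */
};

/* Extra T-states added to a contended memory access at frame T-state t. */
static int contend_delay_sketch(uint32_t t) {
    static const uint8_t pattern[8] = {6, 5, 4, 3, 2, 1, 0, 0}; /* assumed */
    assert(t < FRAME_TSTATES);
    if (t < CONTEND_START) return 0;
    uint32_t rel  = t - CONTEND_START;
    uint32_t line = rel / TSTATES_PER_LINE;   /* display line index */
    uint32_t pos  = rel % TSTATES_PER_LINE;   /* position within that line */
    if (line >= DISPLAY_LINES || pos >= FETCH_TSTATES) return 0;
    return pattern[pos & 7];
}
```

This trades one division and one modulo per access for the memory a full per-T-state table would need.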
I'm bored. This is a pretty muddy implementation. It reminds me of the way children play with Duplo blocks.
- antirez 1 hour ago
  What happened with the wrong pixel layout is that the specification was wrong (the problem is that the sub-agents Claude Code has been spawning recently are Haiku sessions, their weakest model; you can see the broken specification under spectrum-specs). It entered the code and caused a bug that Claude later fixed without updating the comment. This actually somewhat shows that even with adversarial documentation it can fix the problem.
  IMHO zx_pixel_addr() is not bad, it makes sense in this case. I'm a lot more unhappy with the actual implementation of the screen -> RGB conversion that uses this function, which is not as fast as it could be. For instance, my own zx2040 emulator's video RAM to ST77xx display conversion (written by hand, also on GitHub) is more optimized in this case. But providing the absolute address in video memory, instead of the offset, is OK. It's just a design choice.
  > This and the other inline function next to it for attributes are only ever used once.
  I agree with that, but honestly 90% of developers work this way, and LLMs have this style for that reason. A style I dislike as well...
  About the lookup table, the code that it uses in the end was a hint I provided to it, in zx_contend_delay(). The old code was correct but extremely memory wasteful (there are emulators really taking this path of the huge lookup table, maybe to avoid the division for maximum speed), and there was the full comment about the T-states, but after the code was changed this half-comment is bad and totally useless indeed. In the Spectrum emulator I provided a few hints. In the Z80, no hint at all.
  If you check the code in general, the Z80 implementation for instance, it is solid work on average. Normally after using automatic programming in this way, I would ask the agent (and likely Codex as well) to check that the comments match the documentation. Here, since it is an experiment, I did zero refinements, to show what is the actual raw output you get. And it is not bad, I believe.
  P.S. I see your comment greyed out, I didn't downvote you.
rjh29 3 hours ago
No Carmack or Stallman. Just the right person at the right time.
dist-epoch 3 hours ago
> I believe automatic programming to be already super-human, not in the sense it is currently capable of producing code that humans can’t produce, but in the concurrent usage of different programming languages, system programming techniques, DSP stuff, operating system tricks, math, and everything needed to reach the result in the most immediate way.
As HN likes to say, only an amateur vibe-coder could believe this.
- Zafira 2 hours ago
  It is really quite something how many people that have earned credibility designing well-loved tools seem to be true believers in the AI codswallop.
  - jlarcombe 35 minutes ago
    it's fascinating / astonishing
ggaughan 38 minutes ago
Wow
lloydatkinson 1 hour ago
Did the AI slop write the title too? Clear Room?
- rithdmc 28 minutes ago
  That doesn't seem like the kind of mistake an LLM would make. It smacks of a simple mistake to me. (Yesterday, I submitted an article whose title read Axios, the news organisation, rather than Axiom, the cryptocurrency exchange the submission was about.)
marcus_lam 3 hours ago
[dead]
- lazide 2 hours ago
  Except the AI was trained - by looking at the implementations? Or do you think Claude never saw the implementations in its training set?
  Because that is exceptionally unlikely.