My holiday break project this year was to build an NES emulator in Python, sort of. Somewhere between recently beating the NandGame, feeling nostalgia after seeing the Pokemon Red Experiments, and getting a new graphics card, I decided I wanted an NES emulator in Python that I could use to train an AI to play games.
So, I humbly introduce my latest project…
I started by building a multithreaded prototype in pure Python, using pygame to draw frames; unsurprisingly, it was slow. I ported it to Cython, but it spent all its time thrashing on the Python GIL, with the main thread starving the emulator threads. I then refactored that implementation to be single-threaded, since synchronizing clocks across threads is complex, but the performance still wasn't good enough.
Instead, I decided to rewrite it from scratch in pure C and expose it to Python as an extension module. It uses a single-threaded, tick-based design that fits cleanly into render loops, and it has no globals, so multiple emulators can run in separate threads. It links against NumPy's C API to expose rendered frames as byte arrays holding the RGB values for each pixel. The pytendo C APIs are wrapped in a Pythonic API, which can then be consumed by a user-playable app, a debugger, reinforcement learning, anything really.
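To make the tick idea concrete, here is a minimal Python sketch of how a tick-based emulator can slot into a render loop. It is not pytendo's code: `TickEmulator`, `tick()`, and `run_frame()` are hypothetical names made up for illustration. The only hard facts it leans on are the NTSC timings (341 PPU dots per scanline, 262 scanlines per frame, 3 PPU dots per CPU cycle), and it ignores the dot that real hardware skips on odd frames when rendering is enabled.

```python
from dataclasses import dataclass

# NTSC timing: 341 PPU dots per scanline, 262 scanlines per frame,
# and 3 PPU dots for every CPU cycle.
DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262
PPU_DOTS_PER_CPU_CYCLE = 3


@dataclass
class TickEmulator:
    """Hypothetical stand-in for one emulator instance.

    All state lives on the object (no globals), so several instances
    can be advanced independently.
    """
    dot: int = 0
    scanline: int = 0
    frames: int = 0
    cpu_cycles: int = 0

    def tick(self) -> bool:
        """Advance one CPU cycle (three PPU dots). Return True if a frame completed."""
        self.cpu_cycles += 1
        frame_done = False
        for _ in range(PPU_DOTS_PER_CPU_CYCLE):
            self.dot += 1
            if self.dot == DOTS_PER_SCANLINE:
                self.dot = 0
                self.scanline += 1
                if self.scanline == SCANLINES_PER_FRAME:
                    self.scanline = 0
                    self.frames += 1
                    frame_done = True
        return frame_done

    def run_frame(self) -> int:
        """Tick until the next frame boundary; return the CPU cycles consumed."""
        start = self.cpu_cycles
        while not self.tick():
            pass
        return self.cpu_cycles - start
```

A render loop would call something like `run_frame()` once per iteration and then draw the result. Because a frame is 89,342 dots, which is not a multiple of 3, the number of CPU cycles per frame varies slightly from one frame to the next.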
Here’s what the system looks like:
The core of the emulator (the CPU, the PPU, and eventually the APU) is implemented in C. The C API exposes a Python extension module that wraps the internal emulator functions for ticking, retrieving a rendered frame, and so on. The Python API is split between functions for normal use and special functions for introspecting an emulator. Consumers of the Python API can blit the NumPy array to a drawn canvas such as pygame or Dear PyGUI, or start normalizing it for training in PyTorch.
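As a rough sketch of what a consumer might do with a frame, the snippet below prepares one for the two uses mentioned above. It assumes the frame arrives as a (height, width, 3) uint8 RGB array at the NES's visible 256x240 resolution; that layout is my assumption for illustration, and `fake_frame`, `to_model_input`, and `to_pygame_layout` are hypothetical helpers, not part of pytendo.

```python
import numpy as np

NES_HEIGHT, NES_WIDTH = 240, 256  # visible NES resolution in pixels


def fake_frame() -> np.ndarray:
    """Stand-in for a rendered frame: (height, width, 3) uint8 RGB (assumed layout)."""
    frame = np.zeros((NES_HEIGHT, NES_WIDTH, 3), dtype=np.uint8)
    frame[:, : NES_WIDTH // 2] = (255, 0, 0)  # paint the left half red
    return frame


def to_model_input(frame: np.ndarray) -> np.ndarray:
    """Convert an HWC uint8 frame to a CHW float32 array scaled to [0, 1]."""
    scaled = frame.astype(np.float32) / 255.0
    # Channels-first is the layout PyTorch convolution layers expect.
    return np.ascontiguousarray(scaled.transpose(2, 0, 1))


def to_pygame_layout(frame: np.ndarray) -> np.ndarray:
    """pygame.surfarray indexes pixels as [x][y], so swap the first two axes."""
    return frame.swapaxes(0, 1)
```

The channels-first array could be handed to `torch.from_numpy`, and the swapped array to `pygame.surfarray.blit_array`, without further copies of the pixel data on the Python side.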
So far, I’ve implemented roughly half of the MOS 6502 instruction set, the very basics of background drawing (no sprites) and CHR ROM in the PPU, and a simple debugger. Audio and input are still a ways off.
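For a feel of what implementing the instruction set involves, here is a tiny table-driven interpreter for three 6502 instructions. It is written in Python purely for illustration (the real core is in C), and the `CPU` class, the handler functions, and `step` are hypothetical names rather than pytendo's. The opcodes, cycle counts, and flag behavior are the documented ones: LDA immediate is $A9, TAX is $AA, INX is $E8, each takes 2 cycles, and each updates the zero and negative flags.

```python
from dataclasses import dataclass

FLAG_ZERO = 0x02      # bit 1 of the status register
FLAG_NEGATIVE = 0x80  # bit 7 of the status register


@dataclass
class CPU:
    a: int = 0
    x: int = 0
    pc: int = 0
    status: int = 0
    cycles: int = 0

    def set_zn(self, value: int) -> None:
        """Update the zero and negative flags from an 8-bit result."""
        self.status &= ~(FLAG_ZERO | FLAG_NEGATIVE) & 0xFF
        if value == 0:
            self.status |= FLAG_ZERO
        if value & 0x80:
            self.status |= FLAG_NEGATIVE


def lda_immediate(cpu: CPU, memory: bytearray) -> None:
    cpu.a = memory[cpu.pc]           # operand is the byte after the opcode
    cpu.pc = (cpu.pc + 1) & 0xFFFF
    cpu.set_zn(cpu.a)


def tax(cpu: CPU, memory: bytearray) -> None:
    cpu.x = cpu.a
    cpu.set_zn(cpu.x)


def inx(cpu: CPU, memory: bytearray) -> None:
    cpu.x = (cpu.x + 1) & 0xFF       # registers are 8 bits wide and wrap
    cpu.set_zn(cpu.x)


# opcode -> (handler, cycle count)
OPCODES = {
    0xA9: (lda_immediate, 2),
    0xAA: (tax, 2),
    0xE8: (inx, 2),
}


def step(cpu: CPU, memory: bytearray) -> None:
    """Fetch, decode, and execute one instruction."""
    opcode = memory[cpu.pc]
    cpu.pc = (cpu.pc + 1) & 0xFFFF
    if opcode not in OPCODES:
        raise NotImplementedError(f"opcode ${opcode:02X} not implemented")
    handler, cycles = OPCODES[opcode]
    handler(cpu, memory)
    cpu.cycles += cycles
```

Failing loudly on an unknown opcode is handy while the table is only partly filled in: running a test ROM immediately points at the next instruction to implement.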
Using printf only goes so far when you’re trying to figure out what’s happening inside the emulator, so pretty early on I started building a debugger. It uses a Dear PyGUI window, draws rendered frames to a texture in the window, and has a handful of controls and panes to the side to inspect state during a running ROM.
Here’s a quick demo with a branch timing test ROM:
It loads the ROM in a paused state (and is rendering junk data). I set a breakpoint at a random instruction, and press the Run button. It ticks away at the emulator until the PC register points to the address of the breakpoint I set. It breaks, and I can step through some lines and watch the registers change. Press Run once more and it will proceed until it hits the breakpoint again. I remove the breakpoint, run it again, and it spins on an infinite loop that I can interrupt.
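The run-until-breakpoint behavior in the demo can be sketched in a few lines of Python. This is a guess at the general shape, not the debugger's actual code: `FakeCPU`, `Debugger`, and the `should_stop` callback are hypothetical, and the fake CPU just follows a table of jumps instead of executing real instructions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class FakeCPU:
    """Hypothetical stand-in: PC advances by one unless `jumps` redirects it."""
    pc: int = 0
    jumps: Dict[int, int] = field(default_factory=dict)

    def step(self) -> None:
        self.pc = self.jumps.get(self.pc, self.pc + 1)


@dataclass
class Debugger:
    cpu: FakeCPU
    breakpoints: Set[int] = field(default_factory=set)

    def run(self, should_stop: Callable[[], bool] = lambda: False) -> str:
        """Step until PC lands on a breakpoint or `should_stop()` returns True."""
        while True:
            # Step before checking, so pressing Run while already parked on a
            # breakpoint makes progress instead of breaking again immediately.
            self.cpu.step()
            if self.cpu.pc in self.breakpoints:
                return "breakpoint"
            # Polled once per step; a GUI could flip this from a Pause button
            # to escape a ROM that spins in an infinite loop.
            if should_stop():
                return "interrupted"
```

In a real GUI the loop would more likely run a bounded number of steps per rendered frame so the window stays responsive, but the breakpoint check itself would look much the same.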
There’s obviously a lot more to do, but the basic skeleton of the system is up and running. I need to round out the 6502 instruction set, run through more test ROMs to fill in functionality and emulate edge cases, expand the debugger so it’s genuinely useful, and plenty more.
You can find all the code on GitHub. I’ll be writing more on this project as time constraints allow. Finally, a special thanks to the folks at NesDev for compiling such a rich knowledge base on how the NES works!