New playground: memory spy
Hello! Today we’re releasing a new playground called “memory spy”. It lets you run C programs and see how their variables are represented in memory. It’s designed to be accessible to folks who don’t know C – it comes with bunch of extremely simple example C programs that you can poke at. Here’s the link:
This is a companion to the “how integers and floats work” zine we’ve been working on, so the goal is mostly to look at how number types (integers and floats) are represented.
why spy on memory?
How computers actually represent variables can seem kind of abstract, so I wanted to make it easy for folks to see how a real computer actually represents variables in memory.
why is it useful to look at C?
You might be wondering – I don’t write C! Why should I care how C programs represent variables in memory?
In this playground I’m mostly interested in showing people how integers and floats are represented. And low-level languages generally all represent integers and floats in the same way – a 32-bit unsigned int is going to be the same in C, C++, Rust, Go, Swift, etc. The exact name of the type is different, but the representation is the same.
In higher-level languages like Python it’s a little different, but under the
hood a float
in Python contains a C double
, so the C representation is
still pretty relevant.
you don’t have to know C
It uses C because C is the language where it’s the most straightforward to map between “the code in your program” and “what’s in your computer’s memory”.
But if you’re not comfortable with C, this playground is still for you! We put together a bunch of example programs where you can run them and look at each variable’s value.
None of the example programs use any fancy features of C – a lot of the code
is extremely simple, like char byte = 'a';
. So you should be mostly
able to understand what’s going on even if you don’t know C at all.
how does it work?
Behind the scenes, there’s a server that:
- compiles the program with
clang
- runs the program with the C debugger
lldb
(using a Python lldb script) - returns a JSON file with the values of the variable on every line, as an array of bytes
Then the frontend formats the array of bytes so you can look at it. The display logic isn’t very fancy – ultimately it’s a pretty thin wrapper around lldb.
some limitations
The two main limitations I can think of right now are:
- there’s no support for loops (it’ll run them, but it’ll only tell you the value of the variable the first time through the loop)
- it only supports defining one variable per line
There are probably more, it’s a very simple project.
the inspiration
Python Tutor by Philip Guo was a huge inspiration. It has a different focus – it also lets you step through programs in a debugger, but it’s more focused on helping the user build a mental model for how variables and control flow work.
what about security?
In general my approach to running arbitrary untrusted code is 20% sandboxing and 80% making sure that it’s an extremely low value attack target so it’s not worth trying to break in.
Programs are terminated after 1 second of runtime, they run in a container with no network access, and the machine they’re running on has no sensitive data on it and a very small CPU.
some notes on the tech stack
The backend is in Go, plus a Python script to script the interactions with lldb. (here’s the source for the lldb script and the source for the Go server right now). I’m using bubblewrap to sandbox lldb.
As always the frontend is using Vue. You can see the frontend source with “view source” if you want.
The main fancy thing that happens on the frontend is that I use tree sitter to figure out which lines of the code have variables defined on them.
some design notes
As usual these days, I built this project with Marie Claire LeBlanc Flanagan. I think the design decision I’m the happiest with is how we handled navigating the program you’re running. Instead of using next/previous arrows to step through the code one line at a time, you can just click on a line to view its variables.
This “click on a line” design wouldn’t make sense in a normal debugger context because usually you have loops and a line might be run more than once. But our focus here isn’t on control flow, and none of the example programs have loops.
The other thing I’m happy with is the decision to use regular links like (<a href="#example=hexadecimal">
) for all the navigation. There’s an
onhashchange
Javascript event that takes care of making sure we update the
page to match the new URL.
I think there were more design struggles but I forget what they were right now.
that’s all!
Here’s the link again:
Let me know on Twitter or Mastodon if you notice any problems.