Day 9: Bytecode is made of bytes! CPython isn't scary!
Today I paired with one of the fantastic Hacker School facilitators, Allison, on fixing some bugs in a bytecode interpreter. byterun is a pure Python interpreter for the bytecode that CPython generates, written for learning & fun times.
Allison has a great blog post about how to use the dis module to look at the bytecode for a function, which you should totally read.
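For a quick taste before reading that post, here is a minimal sketch of looking at a function's bytecode with dis. The function `subtract` is made up for illustration, and the exact opcode names depend on your Python version (newer CPython versions use a generic BINARY_OP where older ones had BINARY_SUBTRACT).

```python
import dis


def subtract(a, b):
    # Hypothetical example function, not something from byterun.
    return a - b


# Print a human-readable listing of the function's bytecode.
dis.dis(subtract)

# The same instructions as objects, in case you want to poke at them.
opnames = [instr.opname for instr in dis.get_instructions(subtract)]
print(opnames)
```

Somewhere in that list there should be a BINARY_something instruction, which is the subtraction.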
A few things I learned
The CPython interpreter is mostly in one 3,500-line file called ceval.c (see it on github!). The main part of this file is a 2,000-line switch statement: switch(opcode) {...}. Ack.
But! This file is surprisingly not-scary. Or Allison is just amazing at making things seem not scary. So for example there’s a BINARY_SUBTRACT opcode which, well, subtracts things.
Here’s the actual for serious C code that handles this:
    TARGET(BINARY_SUBTRACT) {
        PyObject *right = POP();
        PyObject *left = TOP();
        PyObject *diff = PyNumber_Subtract(left, right);
        Py_DECREF(right);
        Py_DECREF(left);
        SET_TOP(diff);
        if (diff == NULL)
            goto error;
        DISPATCH();
    }
So, what does this do?
- Get the arguments off the stack
- Subtract them by looking up left.__sub__(right)
- Decrease the number of references to left and right for garbage collection reasons
- Put the result on the stack
- If __sub__ doesn’t return anything (diff is NULL), throw an exception
- DISPATCH(), which basically just means “go to the next instruction”
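The steps above can be sketched in a few lines of Python. This is a toy illustration, not byterun's actual code: the function name `binary_subtract` and the use of a plain list as the stack are assumptions made for the example.

```python
def binary_subtract(stack):
    """Toy version of the BINARY_SUBTRACT handler, using a list as the stack."""
    right = stack.pop()   # like POP(): take the top item off the stack
    left = stack[-1]      # like TOP(): peek at the new top without removing it
    diff = left - right   # Python handles the __sub__ lookup for us
    stack[-1] = diff      # like SET_TOP(diff): overwrite the top with the result
    # There is no Py_DECREF here because Python tracks references itself,
    # and no NULL check because a failed subtraction raises an exception directly.


stack = [10, 3]           # 10 was pushed first, so it is the left operand
binary_subtract(stack)
print(stack)              # [7]
```

Note the order: the right operand comes off the stack first, because it was pushed last.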
I could TOTALLY WRITE THAT.
We spent some time reading the C code that deals with exception handling in Python. It was pretty confusing, but I learned that you can do raise ValueError from Exception to set the cause of an exception.
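Here is a small sketch of what `raise ... from ...` does in Python 3. The cause ends up on the new exception's `__cause__` attribute. The helper `parse_port` and its error message are hypothetical, just there to show a more typical use.

```python
# The literal form from above: the class after "from" is instantiated
# and stored as the cause of the ValueError.
try:
    raise ValueError from Exception
except ValueError as err:
    caught = err

print(isinstance(caught.__cause__, Exception))


def parse_port(text):
    # Hypothetical helper: wrap a low-level error in a more descriptive one.
    try:
        return int(text)
    except ValueError as err:
        raise RuntimeError("not a valid port: " + repr(text)) from err


try:
    parse_port("abc")
except RuntimeError as err:
    wrapped = err

# The original ValueError is still reachable through __cause__.
print(type(wrapped.__cause__).__name__)
```

In a traceback, the cause is printed first, followed by a line saying it was the direct cause of the later exception.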
Basically the lesson here is
- Allison is the best. Pairing with her on byterun is the most fun thing.
- It’s actually possible to read the C code that runs Python!
- Bytecode is made of bytes. Like, there are fewer than 256 instructions and each one is a byte. I did not realize this until today. Laugh all you want =D
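To see the "made of bytes" part directly, here is a minimal sketch that pulls the raw bytecode out of a function. The function `subtract` is made up for the example, and the actual byte values will differ between Python versions. As far as I know, on CPython 3.6 and later every instruction takes two bytes (the opcode plus an argument byte), so don't expect one byte per instruction there.

```python
def subtract(a, b):
    # Hypothetical example function.
    return a - b


raw = subtract.__code__.co_code   # the compiled bytecode for the function
print(type(raw))                  # <class 'bytes'>
print(list(raw))                  # plain integers, each one in range(256)
```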