This evening the fantastic Kamal and I sat down to learn a little more about databases than we did before.
I wanted to hack on SQLite, because I've used it before, it requires no configuration or separate server process, I'd been told that its source code is well-written and approachable, and all the data is stored in one file. Perfect!
To start out, I created a new database like this:
Just a primary key and a string! What could be simpler? I then wrote a
little Python script to put the contents of
Great! Now we have a 4MB database called
experimentation. My favorite first thing to do with binary files is to
cat them. That worked pretty well, but Kamal pointed out that of
hexdump is a better way to look at binary files. The output
hexdump -C fun.sqlite looks something like this:
I've pasted the first few thousand lines of the hexdump in this gist, so you can look more closely. You'll see that the file is alternately split into words and gibberish -- there will be a sequence of mostly words, and then unreadable nonsense.
Of course there's a rhyme to this reason! The wonderfully written File Format for SQLite Databases tells us that a SQLite database is split into pages, and that bytes 16 and 17 of our file are the page size.
fun.sqlite starts like this:
so our page size is
0x0400 bytes, or 1024 bytes, or 1k. So this
database is split into a bunch of 1k chunks called pages.
There's an index on the
id column of our
fun table, which lets us
run queries like
select * from fun where id = 100 quickly. To be a
bit more precise: to find row 100, we don't need to read every page,
we can just read a few pages. I've always understood indexes in a
pretty vague way -- I know that they're "some kind of tree", which
lets you access data in O(log n), and in particular that databases use
something called a btree. I still do not really know what a btree
is. Let's see if we can do any better!
Here's where it starts to get really fun! I downloaded the sqlite source code, and Kamal and I figured out how to get it to compile. (using nix, which is a totally other story)
Then I put in a print statement so that it would tell me every time it accesses a page. There's about 140,000 lines of SQLite source code, which is a bit intimidating!
It's also incredibly well commented, though, and includes adorable notes like this:
My next goal was to get SQLite to tell me how it was traversing the
pages. Some careful grepping of the 140,000 lines led us to this
btreePageFromDbPage. All page reads need to go through this
function, so we can just add some logging to it :)
Now it'll notify us every time it reads a page! NEAT! Let's experiment a little bit.
Those two rows (1 and 20) are in the same page, so it traverses the same path to get to both of them!
200 is pretty close in the tree, but it needs to go to
2818 instead at the end. And
80000 is much further away:
If we go back and inspect the file, we can see that pages 1, 5, 1198, 992, 2, and 1813 are interior nodes -- they have no data in them, just pointers to other pages. Pages 6, 2818, and 449 are leaf nodes, and they're where the data is.
I'm still not super clear on how exactly the interior pages are structured and how the pointers to their child pages work. It's time to sleep now, but perhaps that will happen another day!
Modifying open source programs to print out debug information to understand their internals better: SO FUN.