Julia Evans

How does SQLite work? Part 1: pages!

in databases

This evening the fantastic Kamal and I sat down to learn a little more about databases than we did before.

I wanted to hack on SQLite, because I’ve used it before, it requires no configuration or separate server process, I’d been told that its source code is well-written and approachable, and all the data is stored in one file. Perfect!

Strange Loop 2014

in conferences

I spent some of last week at Strange Loop. I met lots of new curious, friendly, wonderful people, gave a talk that people really liked, played board games with friends, and generally had a great time.

It was also kind of a big deal for me personally! A year and a half ago I gave my first lightning talk at a meetup, and was super scared. I’ve heard a lot about how great of a conference Strange Loop was, and I really wouldn’t have anticipated then that

  • I’d get to go
  • they’d accept a talk I submitted (“You can be a kernel hacker!”)
  • the room would be packed
  • people would say things like “that was my favorite talk of the conference”

I always imagine cool developer conferences like Strange Loop as being full of wizard developers who know everything about programming. I was worried that my talk would be too entry-level (I explain what a kernel is / what a system call is from the ground up, for instance).

It turned out tons of people there didn’t know a lot about systems programming / how kernels work, and my choice to make the talk accessible made a lot of people happy. So far I’ve given a bunch of talks and I’ve never regretted trying to make it easy to follow for as many people as possible.

You can be a kernel hacker!

in kernel

This blog post is adapted from a talk I gave at Strange Loop 2014 with the same title. Watch the video!

When I started Hacker School, I wanted to learn how the Linux kernel works. I’d been using Linux for ten years, but I still didn’t understand very well what my kernel did. While there, I found out that:

  • the Linux kernel source code isn’t all totally impossible to understand
  • kernel programming is not just for wizards, it can also be for me!
  • systems programming is REALLY INTERESTING
  • I could write toy kernel modules, for fun!
  • and, most surprisingly of all, all of this stuff was useful.

I hadn’t been doing low level programming at all – I’d written a little bit of C in university, and otherwise had been doing web development and machine learning. But it turned out that my newfound operating systems knowledge helped me solve regular programming tasks more easily.

I also now feel like if I were to be put on Survivor: fix a bug in my kernel’s USB driver, I’d stand a chance of not being immediately kicked off the island.

How is a binary executable organized? Let’s explore it!

I used to think that executables were totally impenetrable. I’d compile a C program, and then that was it! I had a Magical Binary Executable that I could no longer read.

It is not so! Executable file formats are regular file formats that you can understand. I’ll explain some simple tools to start! We’ll be working on Linux, with ELF binaries. (binaries are kind of the definition of platform-specific, so this is all platform-specific.) We’ll be using C, but you could just as easily look at output from any compiled language.

Let’s write a simple C program, hello.c:

1
2
3
4
5
#include <stdio.h>

int main() {
    printf("Penguin!\n");
}

Then we compile it (gcc -o hello hello.c), and we have a binary called hello. This originally seems impenetrable (how do we even binary?!), but let’s see how we can investigate it! We’re going to learn what symbols, sections, and segments are. At a high level:

  • symbols are like function names, and are used to answer “If I call printf and it’s defined somewhere else, how do I find it?”
  • symbols are organized into sections – code lives in one section (.text), and data in another (.data, .rodata)
  • sections are organized into segments

What happens if you write a TCP stack in Python?

in hackerschool, networking

During Hacker School, I wanted to understand networking better, and I decided to write a miniature TCP stack as part of that. I was much more comfortable with Python than C and I’d recently discovered the scapy networking library which made sending packets really easy.

So I started writing teeceepee!

The basic idea was

  1. open a raw network socket that lets me send TCP packets
  2. send a HTTP request to GET google.com
  3. get and parse a response
  4. celebrate!

I didn’t care much about proper error handling or anything; I just wanted to get one webpage and declare victory :)

Pair programming is amazing! Except… when it’s not.

in pairing

I wrote a blog post in March about why I find pair programming useful as a tool and why I enjoy it. There are entire companies like Pivotal that do pair programming 100% of the time, and they find it useful.

To get our terms straight, by “pair programming”, I mean “two people are trying to accomplish a task by sitting at a single computer together”.

Some people mentioned after I wrote that blog post that they disliked pair programming, sometimes strongly! Obviously these people aren’t wrong to not like it. So I asked people about their experiences:

People responded wonderfully. You can see about 160 thoughtful tweets about what people find hard or difficult in this Storify What do you find hard about pair programming?. I learned a ton, and my view that “pair programming is great and you totally should try it!!!” got tempered a little bit :)

Open sourced talks!

in talks

The wonderful Sumana Harihareshwara recently tweeted that she released her talk A few Python Tips as CC-BY. I thought this was a super cool idea!

After all, if you’ve put in a ton of work to put a talk or workshop together, it’s wonderful if other people can benefit from that as much as possible. And none of us have an unlimited amount of time to give talks.

Stephanie Sy, a developer in the Phillippines, emailed me recently to tell me that she used parts of my pandas cookbook to run a workshop. IN THE PHILIPPINES. How cool is that? She put her materials online, too!.

Ruby Rogues podcast: systems programming tricks!

If you listen to the Ruby Rogues podcast this week, you will find me! We talked about using systems programming tools (like strace) to debug your regular pedestrian code, building an operating system in Rust, but also other things I didn’t expect, like how asking stupid questions is an amazing way to learn.

Ruby Rogues also has a transcript of the entire episode, an index, and links to everything anyone referenced during the episode, including apparently 13 posts from this blog (!). I don’t even understand how this is possible, but apparently it is! It was a fun time, and apparently it is totally okay to spend a Ruby podcast discussing Rust, statistics, strace, and, well… not Ruby :)

Fun with stats: How big of a sample size do I need?

in statistics

[There’s a version of this post with calculations on nbviewer!]

I asked some people on Twitter what they wanted to understand about statistics, and someone asked:

“How do I decide how big of a sample size I need for an experiment?”

Flipping a coin

I’ll do my best to answer, but first let’s do an experiment! Let’s flip a coin ten times.

> flip_coin(10)
heads    7
tails    3

Oh man! 70% were heads! That’s a big difference.

NOPE. This was a random result! 10 as a sample size is way too small to decide that. What about 20?