Julia Evans

Hacker School alumna


Debug your programs like they’re closed source!

Until very recently, if I was debugging a program, I practically always did one of these three things:

  1. open a debugger
  2. look at the source code
  3. insert some print statements

I’ve started sometimes debugging a new way. With this method, I don’t look at the source code, don’t edit the source code, and don’t use a debugger. I don’t even need to have the program’s source available to me!

Can we repeat that again? I can look at the internal behavior of closed-source programs.

How?!?! AM I A WIZARD? Nope. SYSTEM CALLS! What is a system call? Operating systems know how to open files, display things to the screen, start processes, and all kinds of things. Programs can ask their operating system to do these things, using functions called system calls.

System calls are the API for your computer, so you don’t have to know how a network card works to send an HTTP request.

Here’s a list of the system calls Linux 2.2 provides, to give you a sense for what’s available. There’s exit, open, read, write, time, mount, kill, and all kinds of other things. System calls are basically the definition of platform specific (different operating systems have different system calls), so we’re only going to be talking about Linux here.

How can we use these to debug? Here are a few of my favorite system calls!

open

open opens files. Every time any program opens a file it needs to use the open system call (or its close sibling openat, which is what you’ll usually see in strace output on newer systems). There’s no other way.

So! Let’s say you have a backup program on your computer, and you want to know which files it’s working on, but it doesn’t show you a progress bar or have any options. Let’s say that it has PID 60.

We can spy on this program with a tool called strace and print out every file it opens! strace shows you which system calls a program calls. To spy on our backup program, we would run strace -e trace=open -p 60, to tell it to print all the open system calls from PID 60.

For example, I ran strace -e trace=open ssh and here were some of the things I found:

open("/etc/ssh/ssh_config", O_RDONLY)   = 3
open("/home/bork/.ssh/config", O_RDONLY) = 3
open("/home/bork/.ssh/id_dsa", O_RDONLY) = 4
open("/home/bork/.ssh/id_dsa.pub", O_RDONLY) = 4
open("/home/bork/.ssh/id_rsa", O_RDONLY) = 4
open("/home/bork/.ssh/id_rsa.pub", O_RDONLY) = 4
open("/home/bork/.ssh/known_hosts", O_RDONLY) = 4

This makes total sense! ssh needs to read my private and public keys, my local ssh config, and the global ssh config. Neat! open is super simple and super useful.
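If you save strace's output to a file, a few lines of Python can pull out just the paths. This is a rough sketch and not part of strace: `opened_paths` and `OPEN_LINE` are names made up for illustration, and the pattern assumes lines shaped like the sample above (it also accepts the openat form, but not the `[pid N]` prefix that -f adds).

```python
import re

# Matches lines like: open("/etc/ssh/ssh_config", O_RDONLY)   = 3
# and the openat form: openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
OPEN_LINE = re.compile(r'^open(?:at)?\((?:[^,"]+,\s*)?"([^"]*)".*\)\s*=\s*(-?\d+)')

def opened_paths(strace_output):
    """Return (path, result) pairs for every open/openat line in strace output."""
    pairs = []
    for line in strace_output.splitlines():
        match = OPEN_LINE.match(line.strip())
        if match:
            pairs.append((match.group(1), int(match.group(2))))
    return pairs

# Three lines copied from the ssh trace above.
sample = '''open("/etc/ssh/ssh_config", O_RDONLY)   = 3
open("/home/bork/.ssh/config", O_RDONLY) = 3
open("/home/bork/.ssh/id_rsa", O_RDONLY) = 4'''

for path, fd in opened_paths(sample):
    print(fd, path)
```

The number after the = is the file descriptor the kernel handed back, or -1 if the open failed.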

execve

execve starts programs. All programs. There’s no way to start a program except to use execve. We can use strace to spy on execve calls too!

For example! I was trying to understand a Ruby script that was basically just running some ssh commands. I could have read the Ruby code! But I really just wanted to know which damn command it was running! I did this by running strace -f -s3000 -e trace=execve and read zero code!

The -f option is super important here. It also tracks the system calls of every subprocess! I basically use -f all the time. Use -f. ([longer blog post about using strace + execve to poke at Ruby programs]).
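To see why -f matters, it helps to look at how a program usually starts another one: it forks a child, and the child calls execve. Here is a minimal sketch of that pattern in Python, assuming a Unix system. `run_with_execve` is a made-up helper name, and this is not what the Ruby script did, just the general shape.

```python
import os
import sys

def run_with_execve(argv):
    """Fork, replace the child with argv[0] via execve, return its stdout bytes."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point stdout at the pipe, then become the new program.
        try:
            os.close(read_end)
            os.dup2(write_end, 1)
            os.execve(argv[0], argv, dict(os.environ))
        finally:
            os._exit(127)  # only reached if execve failed
    # Parent: read everything the child prints, then reap it.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks)

output = run_with_execve([sys.executable, "-c", "print('hello from the child')"])
print(output)
```

The execve happens in the child process, so without -f strace would only watch the parent and never see it.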

write

write writes to files. I think there are ways to write to a file without using write (like by using mmap), but usually if a file is being written to, it’s using write.

If I strace -e trace=write on an ssh session, this is some of what I see:

write(3, "SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1.1\r\n", 41) = 41
[...]
write(5, "[jvns /home/public]$ ", 21)   = 21
write(3, "\242\227e\376\344\36\270\343\331\307\231\332\373\273\324\303X\n<\241p`\212\21\317\353`\1/\3629\273m\23\17\26\304\fJ\352z\210\2\210\211~7W", 48) = 48
write(5, "logout\r\n", 8)               = 8
write(3, "b\277\306\16!\6J\202\tF$\241\32\302\3\0\23\310\346f\241\233\263\254\325\351z\222\234\224\270\231", 32) = 32
write(3, "\311\372\353\273\233oU\226~\373N\227\323*S\263\307\272\204VzO \10\2\316\224\335X@Hj\26\366\271J:i6\311\240A\325\331\341\220\1%\233\240\23n\23\242\34\277\2139\376\31j\255\32h", 64) = 64
write(2, "Connection to ssh.phx.nearlyfreespeech.net closed.\r\n", 52) = 52

So it opens an SSH connection, writes a prompt to my terminal, sends some (encrypted!) data over the connection, and prints that the connection is closed! Neat! I understand a bit more about how ssh works now!
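The number after the = in each line is write's return value: how many bytes actually got written. A small sketch to check that against the first line of the trace, using a pipe as a stand-in for the network socket (the pipe is an assumption for the demo; ssh was writing to a real connection):

```python
import os

# The banner from the first write() line in the trace above.
banner = b"SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1.1\r\n"

# A pipe stands in for the socket (fd 3 in the trace).
read_end, write_end = os.pipe()
written = os.write(write_end, banner)  # one write system call
os.close(write_end)
received = os.read(read_end, 1024)
os.close(read_end)

print(written)
```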

/proc

I want to talk about one more Linux thing, and it isn’t a system call. It’s a directory called /proc! There are a million things that /proc does, but this is my favorite:

/proc tells you every file your process has open. All of them! For example, one of my Chrome processes has PID 3823. If I run ls -l /proc/3823/fd/*, it shows me all the files Chrome has open!

fd stands for “file descriptor”.

$ ls -l /proc/3823/fd/*
total 0
lr-x------ 1 bork bork 64 Apr 19 09:28 0 -> /dev/null
l-wx------ 1 bork bork 64 Apr 19 09:28 1 -> /dev/null
lrwx------ 1 bork bork 64 Apr 19 09:28 10 -> socket:[16583]
lr-x------ 1 bork bork 64 Apr 19 09:28 100 -> /opt/google/chrome/nacl_irt_x86_64.nexe
lrwx------ 1 bork bork 64 Apr 19 09:28 101 -> /home/bork/.config/google-chrome/Default/Application Cache/Cache/index
lrwx------ 1 bork bork 64 Apr 19 09:28 102 -> /home/bork/.config/google-chrome/Default/Application Cache/Cache/data_0
lrwx------ 1 bork bork 64 Apr 19 09:28 103 -> socket:[178726]
lrwx------ 1 bork bork 64 Apr 19 09:28 104 -> socket:[21064]
lrwx------ 1 bork bork 64 Apr 19 09:28 105 -> /home/bork/.config/google-chrome/Default/Application Cache/Cache/data_1
lrwx------ 1 bork bork 64 Apr 19 09:28 106 -> /home/bork/.config/google-chrome/Default/Application Cache/Cache/data_2
lrwx------ 1 bork bork 64 Apr 19 09:28 107 -> /home/bork/.config/google-chrome/Default/Application Cache/Cache/data_3

aaaand a million more. This is great. There are also a ton more things in /proc/3823. Look around! I wrote a bit more about /proc in Recovering files using /proc (and spying, too!).
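Since the entries in /proc/PID/fd are just symbolic links, you can read them from a program too. A minimal sketch, assuming Linux; `open_files` is a name made up for this example:

```python
import os

def open_files(pid="self"):
    """Map each file descriptor number of a process to what it points at."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir and readlink.
            continue
    return result

if os.path.isdir("/proc/self/fd"):  # Linux only
    for fd, target in sorted(open_files().items()):
        print(fd, "->", target)
```

Passing a PID instead of the default "self" looks at another process, as long as you have permission to read its /proc entry.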

ltrace: beyond system calls!

Lots of things happen outside of the kernel. Like string comparisons! I don’t need a network card for that! What if we wanted to know about those? strace won’t help us at all. But ltrace will!

Let’s try running ltrace killall firefox. We see a bunch of things like this:

fopen("/proc/10578/stat", "r")                               => 0x11984f0
free(0x011984d0)
fscanf(0x11984f0, 0x403fe7, 0x7fff09984980, 0x7f2fc7cd4728, 0)
fclose(0x11984f0)
strcmp("firefox", "kworker/u:0")

So! We’ve just learned that killall works by opening a file in /proc (wheeee!), finding what its name is, and seeing if it’s the same as “firefox”. That makes sense!
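The trace suggests a simple loop: read /proc/PID/stat for every process, pull out the name, compare. Here is a sketch of that idea in Python. It is a guess at the general approach based on the ltrace output, not killall's actual source, it only finds PIDs and never kills anything, and `comm_from_stat` and `pids_named` are made-up names.

```python
import os

def comm_from_stat(stat_text):
    """Pull the command name out of the text of /proc/<pid>/stat.

    The name sits in parentheses after the PID and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    return stat_text[stat_text.index("(") + 1 : stat_text.rindex(")")]

def pids_named(name):
    """Return the PIDs whose command name equals `name` (Linux only)."""
    matches = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat", errors="replace") as stat_file:
                if comm_from_stat(stat_file.read()) == name:
                    matches.append(int(entry))
        except OSError:
            continue  # the process exited while we were looking
    return matches

# A hand-written sample line, shortened to the first few fields.
print(comm_from_stat("10578 (kworker/u:0) S 2 0 0"))
```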

When are these tools useful?

These systems-level debugging tools are only appropriate sometimes. If you’re writing a graph traversal algorithm and it has a logical error, knowing which files it opened won’t help you at all!

Here are some examples of times when using systems tools might make your life easier:

  • Is your program running a command, but the wrong one? Look at execve!
  • Your program communicates with something on a network, but some of the information it’s sending is wrong? It’s probably sending it with write, sendto, or send.
  • Your program writes to a file, but you don’t know what file it’s writing to? Use /proc to see what files it has open, or look at its write calls. /proc doesn’t lie.

At first debugging this way is confusing, but once you’re familiar with the tools it can actually be faster, because you don’t have to worry about getting the wrong information! And you feel like a WIZARD.

Learn your operating system instead of a new debugger

There are all kinds of programming-language-specific debugging tools you can use. gdb! pry! pdb! And you should! But you probably switch languages more often than you switch OSes. So, learning your OS in depth and then using it as a debugging tool is likely a better investment of your time than learning a language-specific debugging tool.

If you want to know which files a process has open, it doesn’t matter if that program was originally written in C++ or Python or Java or Haskell. The only way for a program to open a file on Linux is with the open system call. If you learn your operating system, you acquire superpowers. You can debug programs that are binary-only and closed source. You can use the same tools to debug no matter which language you’re writing.

And my favorite thing about these methods is that your OS won’t lie to you. The only way to run a program is with the execve system call. There aren’t other ways. So if you really want to know what command got run, use strace. See exactly which parameters get passed to execve. You’ll know exactly what happened.

Further reading

Try Greg Price’s excellent blog post Strace – The Sysadmin’s Microscope. I have an ever-growing collection of blog posts about strace, too!

My favorite way to learn more, honestly, is to just strace random programs and see what I find out. It’s a great way to spend a rainy Sunday afternoon! =)

Thanks to Lindsey Kuper and Dan Luu for reading a draft of this :)

Have fun!

♥ PyCon

PyCon 2014 happened! It was my first time at PyCon, I expected to have a good time, and it was better than I expected. I spoke! People came up to me and said they enjoyed my talk! There were so many amazing talks! I met so many people whose work I’d been following! 1/3 of the talks were by women! It was wonderful.

A few talks that especially stood out to me, or that I missed and really want to see. I didn’t see anything like all the talks, but I liked these. In no particular order:

Not all of the videos are up yet, but I’ll come back to this later and put in video links when they are.

  • Titus Brown gave a wonderful talk about his work in computationally intensive biology. I found this particularly interesting because he offered roughly “I have harder data problems than your tech job does! Come do a PhD with me and I’ll pay you a fraction as much.” This was oddly compelling. Very much worth watching. Notes from his talk, [Video]
  • Julie Pagano gave advice about battling imposter syndrome. I liked that her advice was practical! We need better advice than “you shouldn’t have imposter syndrome!”, and this did well at that. [Video]
  • Jessica McKellar’s keynote on how we can help advance computer science education was amazing. It was amazing because she gave so many concrete suggestions and specific calls to action: there were easy things (for example: call your legislators and tell them CS should count for AP math/science credit!) and larger commitments. She challenged everyone to do just one thing in the upcoming year to try to make CS education in high school better. [Video]
  • Fernando Perez spoke about the state of Python for scientific computing and how scientists are using IPython to easily make their work reproducible. I’m so impressed with the community he’s building around this software. The tools are so good and getting better so quickly. [Video]
  • Naomi Ceder spoke about being a trans woman in the Python community. I saw so many positive comments about her talk on Twitter afterwards. I’m really interested to see what she has to say, and delighted that I work in a community where her perspective is valued. [Video]
  • Paul Tagliamonte’s talk about compiling Lisp to Python bytecode was the kind of excited “let’s see how far we can take this crazy idea!” talk that I really enjoy. Super enjoyable speaker. [Video]
  • Julie Lavoie talked about Analyzing Rap Lyrics with Python. I liked this because she clearly loves rap, and she gave some background on rap as an art form, including samples from different styles. Also it was a super fun introduction to natural language processing. [Video]
  • Tavish Armstrong talked about one of his favorite topics: how programmers can learn from software engineering research. His call to action: Try to measure something about your software engineering practice! Show it to your friends! Reproduce it! Give a talk at PyCon next year about it! [Video]
  • Allison Kaptur explained how import works in Python, from the ground up. I loved that she started with a naive version of import and kept incrementally improving it until we got to a version that resembles how import actually works. [Video]
  • I talked about why IPython Notebook and Pandas are my favorite tools for exploratory data analysis, and people said they enjoyed how enthusiastic I was afterwards. Yay! [Video], [Slides], [pandas cookbook]

A few more talks that I want to watch the videos for, but can’t comment on because, well, I haven’t yet.

I never stop being impressed with people I meet at PyCon. Conferences are so hard! I want to meet all the people and do all the things and be in 3 places at once. And the sprints haven’t even happened yet!

I’m so thankful to all the organizers for doing so much work to make this possible. The conference chair Diana Clarke got a standing ovation at the closing session, and more than deserved it.

♥ PyCon.

Becoming a better developer (it’s not just writing more programs)

I asked on Twitter today something I’d been talking to a lot of friends about – how does someone become a senior developer? How do I know what I should be practicing? What qualities should I be looking for in mentors?

The best answer I got was this blog post: On Being a Senior Engineer, by John Allspaw at Etsy. I am mostly writing this so I can remember to go back and read that repeatedly. It talks about

  • taking and seeking out criticism and
  • non-technical skills and
  • estimates (eeep! so hard!) and
  • doing tedious and boring work and
  • raising up the people around you (“generosity of spirit”) and
  • making tradeoffs explicit when making judgements and
  • empathy and
  • cognitive bias and
  • so much more

But you should just go read it.

I’ve also been thinking about this tweet by @seriouspony:

and how it’s easy to get stuck in a rut and keep doing the things you’re comfortable with. I’d like to not do that.

A few other things people linked me to that were interesting:

“Ask if you have questions” isn’t enough

I’ve helped out and taught at a few programming workshops for beginners now, and I’ve noticed something. There are always helpers who have tons of experience and are super willing to answer questions. And there’s always an announcement where someone says “Here are the helpers! They are here just to help you. Raise your hand if you have questions!”.

And it’s not enough. Inevitably people will not ask because they’re scared or they think it’s their fault and then they’ll get stuck.

Here are some things I try to do when I help out at workshops.

  • Circulate! Ask people “hey are you okay? Everything working well?”
  • If everybody’s saying yes, some of them are not okay and just not telling you. Ask more questions! “How are you finding it so far?”. If you talk to someone for a minute sometimes they’ll come out with “well everything is fine but this weird thing is happening… can you take a look?”
  • Watch out for confused faces and sad people! Go talk to people with confused faces and try to see how they’re doing!
  • Keep asking the same people if they’re okay over and over again. If you keep talking to them, they’ll be much more likely to ask you when they do have a problem.
  • When someone raises their hand I have a lot of dialogues like this:
    • Hey I have a ques-
    • GREAT. I LOVE QUESTIONS. WHAT IS YOUR QUESTION
    • question question
    • That’s definitely something we can solve! Oh that’s actually a mistake in the directions! That’s totally our fault! Sorry about that!
  • I really do like answering programming questions, so I act super excited about it. I think/hope this makes it easier for people to ask me because they don’t feel like it’s a burden.
  • A script: I love questions! We can solve that for sure! Lots of other people are having that problem! You’re doing great! I’m happy you’re here!

It feels weird to ask the same people if they’re doing okay over and over and over and over and over again, but I think it really does help.

This feels more effective and like a better use of my time as a helper than standing in the corner waiting for someone to flag me down. And more fun!

Reports from remote-land: remote pairing works great!

I’ve been working remote for 2 weeks now. The things that have surprised me the most are

  • I’m less lonely and disconnected than I expected (it turns out I’m an extrovert…)
  • how well remote pairing works!

I did a lot of pair programming while at Hacker School and I found it to be a really productive way to work. When I got a remote job (I live in Montreal and work with people mostly in SF), I thought it would be impossible to do pairing, or at the very least it would be a terrible experience.

This is not so! Here’s what I’ve been using:

  • tmate for sharing my terminal (I use vim in the terminal). The other person only needs an ssh client to use tmate.
  • Google Hangouts or Skype for talking, and sometimes for screensharing
  • an internal alias http://go/julia that redirects to a Google hangout to make it extra easy to talk to me
  • a lot of scheduling pairing dates with Google calendar

My experience so far with Google Hangouts / Skype is that neither one is really better, and both of them sometimes don’t work. I tried appear.in once and that didn’t work at all. One person’s sound and video was 2 minutes behind the entire time.

tmate mostly works beautifully. Sometimes it will work perfectly and it’s amazing and sometimes it will freeze for me 5 times in an hour. I’m not sure if it doesn’t like inconsistent internet connections or what. The other problem with tmate is that some people have security concerns with it, so I’m thinking of making an internal tmate that goes through our servers.

I’ve also been using screensharing to remote pair. Screensharing is great because it means you can use any editor you want (not just a terminal editor), you can use more than one window, and nobody needs any special software. Both Skype and Google Hangouts have it and they’re both a little wonky. A lot of people have retina Macbooks and their screen resolution is way higher than mine, so I have to ask them to zoom in a lot. The bigger you make the text, the less you’ll have trouble when the internet connection wavers.

If you’re screensharing, be proactive about asking your partner to zoom in :)

The worst thing about remote pairing is that it’s really disruptive to the flow of the pairing session when your internet connection keeps dropping or if one of your tools freezes, and that never happens with in-person pairing.

As long as the technology works, though, remote pairing has been great! I find that it feels just as productive as in-person pairing, which surprised me a lot. The only thing I wish for is an extra screen so I could see my pairing-partner’s face at the same time.

Are you doing remote pairing? I’d love ideas of how to mitigate any of the problems I’m having :) I’m @bork on Twitter.

Recovering files using /proc (and spying, too!)

I’ve had a vague idea for years that /proc was a way the Linux kernel exposed its internals, and that I could look there to find things.

Then I learned this:

Suddenly it was like /proc was turned into a magical unicorn! I can use it to recover my files?! ★★Amazing★★.

Let’s explain why this works. When a process opens a file (including sockets), it gets a file descriptor for that file, which is a number starting at 0.

File descriptors and investigations on std{in,out,err}

0, 1, and 2 are always the stdin, stdout, and stderr of the process. For example, if I look at the file descriptors for a Google Chrome process I have, I see:

$ ls /proc/4076/fd
0  10  12  14  16  18  2   21  23  26  28  3   31  34  36  38  4   41  43  5   6  72  8
1  11  13  15  17  19  20  22  25  27  29  30  32  35  37  39  40  42  44  53  7  74  9

That’s pretty opaque! Let’s take a closer look.

$ ls -l /proc/4076/fd/{0,1,2}
lr-x------ 1 bork bork 64 Mar 22 22:38 /proc/4076/fd/0 -> /dev/null
l-wx------ 1 bork bork 64 Mar 22 22:38 /proc/4076/fd/1 -> /dev/null
l-wx------ 1 bork bork 64 Mar 22 22:38 /proc/4076/fd/2 -> /home/bork/.xsession-errors

Neat, the numbers 0, 1, and 2 are just symbolic links! It looks like Chrome doesn’t have any stdin or stdout, which makes sense, but the stderr is /home/bork/.xsession-errors. I didn’t know that! It turns out this is also a great way to find out where a process that you didn’t start is redirecting its output.

Where else do my programs redirect their stderr? Let’s see! I looked at everything’s stderr, got awk to pull out just the file, and ran uniq to get the counts.

$ ls -l /proc/*/fd/2 | awk '{print $11}' | sort | uniq -c
      42 /dev/null
      2 /dev/pts/0
      1 /dev/pts/1
      3 /dev/pts/2
      2 /dev/pts/3
      2 /dev/pts/4
      5 /dev/pts/5
      1 /dev/pts/7
     25 /home/bork/.xsession-errors

So mostly /dev/null, some of them are running on terminals (/dev/pts/*), and the rest to ~/.xsession-errors. No huge surprises here.
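The same count can be done without parsing ls output. A rough Python equivalent of the pipeline above, assuming Linux; `stderr_targets` is a made-up name, and unlike the shell glob it skips non-numeric entries such as /proc/self:

```python
import glob
import os
from collections import Counter

def stderr_targets():
    """Count where fd 2 of every visible process points (Linux only)."""
    counts = Counter()
    for link in glob.glob("/proc/[0-9]*/fd/2"):
        try:
            counts[os.readlink(link)] += 1
        except OSError:
            continue  # not our process, or it exited in the meantime
    return counts

for target, count in stderr_targets().most_common():
    print(count, target)
```

Processes owned by other users are silently skipped unless you run it as root, which is also true of the ls version.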

What else could we use these file descriptors for? Someone on Twitter suggested this:

This works because when you open different files again and again in a loop, it will usually end up with the same file descriptor. You could also do the same thing by running strace -etrace=open -p$TARSNAP_PID to see which files Tarsnap is opening.
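The reason the descriptor stays the same is that the kernel always hands out the lowest free number, so a program that opens a file, closes it, and opens the next one keeps getting the same slot back. A small sketch to see this for yourself (`fds_seen` is a made-up name, and the temporary files only exist for the demo):

```python
import os
import tempfile

def fds_seen(count):
    """Open and close `count` different files in a row, recording each fd."""
    seen = []
    with tempfile.TemporaryDirectory() as directory:
        for i in range(count):
            path = os.path.join(directory, f"file{i}")
            fd = os.open(path, os.O_WRONLY | os.O_CREAT)
            seen.append(fd)
            os.close(fd)
    return seen

print(fds_seen(5))  # five different files, one repeated descriptor number
```

So if you watch where that one descriptor points in /proc/PID/fd, you see each file as the program gets to it.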

Okay, now we know that we can use /proc to learn about our processes’ files! What else?

Spy on your processes with /proc/$pid/status

If you look at the file /proc/$pid/status, you can find out all sorts of information about your processes! You can look at this for any process.

Here’s a sample of what’s in that file:

Name:   chrome
Groups: 4 20 24 27 30 46 104 109 124 1000 
VmPeak:   853984 kB
VmSize:   670392 kB
VmData:   323264 kB
VmExe:     96100 kB
Threads:        3
Cpus_allowed_list:      0-7

So we can see there’s some information about the memory, its name, its groups, its threads, and which CPUs it’s allowed to run on.
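The file is plain "Key: value" text, so it is easy to read from a script. A minimal sketch, with `parse_status` as a made-up name and a few lines of the sample above as input:

```python
def parse_status(text):
    """Turn the 'Key: value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key] = value.strip()
    return fields

# A few lines copied from the sample above.
sample = """Name:   chrome
VmPeak:   853984 kB
Threads:        3
Cpus_allowed_list:      0-7"""

status = parse_status(sample)
print(status["Name"], status["Threads"], status["VmPeak"])
```

On Linux, parse_status(open("/proc/self/status").read()) gives the same kind of dictionary for the running Python process.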

But wait! We could have found out a lot of this information with ps aux. How does ps do it? Let’s find out!

$ strace -f -etrace=open ps aux
...
open("/proc/30219/stat", O_RDONLY)      = 6
open("/proc/30219/status", O_RDONLY)    = 6
open("/proc/30219/cmdline", O_RDONLY)   = 6
...

So ps gets its information from /proc! Neat.
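Of those three files, cmdline is the easiest to decode by hand: it holds the process's arguments separated by NUL bytes. A small sketch of splitting it (`parse_cmdline` is a made-up name, and the sample bytes are what I would expect for ps aux itself, not captured output):

```python
def parse_cmdline(raw):
    """Split the bytes of /proc/<pid>/cmdline into a list of arguments."""
    # Arguments are separated (and normally terminated) by NUL bytes.
    parts = raw.split(b"\x00")
    if parts and parts[-1] == b"":
        parts.pop()
    return [part.decode(errors="replace") for part in parts]

print(parse_cmdline(b"ps\x00aux\x00"))
```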

I’m sold. What else is there?!!

I tweeted asking for suggestions of things to find in /proc/, and someone replied linking to the /proc man page. I thought they were trolling me, but then I clicked on it and it was actually useful!

A few more things I need to investigate:

  • the procps and sysstat utilities
  • a ton of wonderful suggestions by Keegan McAllister on the Ksplice blog (including how to force a program to take stdin if it doesn’t take stdin)
  • /sys replaces part of /proc’s functionality.
  • Plan 9 / Inferno took this “everything is a file” business even more seriously than Linux does
  • debugfs / ftrace. An example someone linked to.

I still feel like there are concrete uses for /proc/ that I don’t know about, though. What are they?

My Rust OS will never be finished (and it’s a success!)

In November/December last year, I spent 3 weeks working on a toy operating system in Rust. (for more, see Writing an OS in Rust in tiny steps, and more).

I wrote a ton of blog posts about it, and I gave a talk about the process at Mozilla last week (the video). At that talk, a few people asked me if I was going to finish the project. I said no, and here’s why.

There are lots of reasons for working on programming projects. Just a few:

  • to end up with useful code
  • to learn something
  • to explore a new concept (see: Bret Victor’s demos)

The reason I wrote an operating system in Rust wasn’t so that I could have an operating system written in Rust. I already have a kernel on my computer (Linux), and other people have already written Rust operating systems better than I have. Any code that I write in 3 weeks is at best a duplication of someone else’s work, and mimicking the state of the art 20 years ago.

I worked on that project to learn about how operating systems work, and that was a huge success. I read a 20-part essay about linkers, and learned about virtual memory, how executables are structured, how program execution works, how system calls work, the x86 boot process, interrupt handlers, keyboard drivers, and a ton of other things.

Another amazing example of a project like this is @kellabyte’s Haywire, an HTTP server in C she wrote to learn more about writing performant code. It actually compiles and you can benchmark it yourself, but her blog posts are more useful to me than her code – Hello haywire, HTTP response caching in Haywire, Further reducing memory allocations and use of string functions in Haywire.

So when people ask me why my code doesn’t compile, it’s because the code is basically a trivial output of the process. The blog posts I wrote are much more important, because they talk about what I learned. My code probably won’t be useful to you – it would be better to start with rustboot and take your own path.

Not finishing your project doesn’t mean it’s not a success. It depends what your goals are for the project! I wrote an operating system in Rust to learn, and I learned a ton. It’s not finished, and it won’t be. How could it ever be? I hope to not ever finish learning.

Writing an OS in Rust in tiny steps (Steps 1-5)

I’m giving a talk tomorrow on writing a kernel in Rust.

My experience of writing a kernel is that it was like jumping in puddles: it’s a lot of fun, and there are a lot of mishaps.

Here are a few of the tiny steps I took. There are more, but those will have to wait for the evening.

Step 1: copy some code from the internet

I didn’t know what I was doing, so I didn’t want to start from scratch! So I started with something that already existed! Behold rustboot, a tiny 32-bit kernel written in Rust.

Rustboot does only two things, but it does them well!

  1. Turn the screen red
  2. Hang

Of course what it actually does is a bit more complicated – there’s

  • a loader written in assembly
  • a Makefile that lets you run it with qemu
  • Some Rust code to clear the screen

Here’s the code that clears the screen:

unsafe fn clear_screen(background: Color) {
    range(0, 80*25, |i| {
        *((0xb8000 + i * 2) as *mut u16) = (background as u16) << 12;
    });
}

What does this mean? The key part here is that the address of the VGA buffer is 0xb8000, so we’re setting some bytes there. And there’s a loop.
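To make the arithmetic concrete, here is the same address and value calculation worked through in Python. It only computes numbers and doesn't touch any memory. The 80 by 25 grid and the 0xb8000 address come from the code above; RED = 4 is an assumption (the usual VGA palette number for red), since the Color enum isn't shown here.

```python
VGA_ADDRESS = 0xB8000   # start of the text buffer, from the code above
COLUMNS, ROWS = 80, 25
RED = 4                 # assumption: red's number in the standard VGA palette

def cell_address(i):
    """Address of the i-th character cell; each cell is 2 bytes wide."""
    return VGA_ADDRESS + i * 2

def blank_cell(background):
    """The u16 that clear_screen stores: background colour in the top 4 bits."""
    return (background << 12) & 0xFFFF

print(hex(cell_address(0)))                   # first cell: 0xb8000
print(hex(cell_address(COLUMNS * ROWS - 1)))  # last cell: 0xb8f9e
print(hex(blank_cell(RED)))                   # 0x4000
```

So the loop writes 2000 two-byte cells, covering 4000 bytes starting at 0xb8000.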

Step 2: Turn the screen blue instead.

The first thing I did was:

  1. Make sure I could run rustboot.
  2. Change ‘red’ to ‘blue’ and run it again

This sounds silly, but psychologically it’s an important step! It forced me to look at the code and understand how it worked, and it was really exciting that it worked right away.

Step 3: Start writing I/O functions

The next obvious step now that I had a blue screen was to try to write a print function.

Here’s what it looked like!

pub fn putchar(x: u16, y: u16, c: u8) {
    let idx : uint =  (y * VGA_WIDTH * 2 + x * 2) as uint;
    unsafe {
        *((VGA_ADDRESS + idx) as *mut u16) = make_vgaentry(c, fg_color, bg_color);
    }
}

I didn’t explain the unsafe block before. Everything inside unsafe{} is unsafe code. This particular code is unsafe because it accesses a memory address directly. Wrapping it in an unsafe block tells Rust “okay, I checked and I promise this code is actually doing the right thing and won’t blow anything up”.

We can also look at make_vgaentry:

fn make_vgaentry(c: u8, fg: Color, bg: Color) -> u16 {
    let color = fg as u16 | (bg as u16 << 4);
    return c as u16 | (color << 8);
}

In the VGA buffer, each character is represented by 2 bytes (so a u16). The lower 8 bits are the ASCII character, and the upper 8 bits are the foreground and background colour (4 bits each). Color here is an enum so that I can refer to Red or Green directly.
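Here is a worked example of that bit layout, as a Python port of make_vgaentry plus the offset calculation from putchar. The colour numbers (white = 15, blue = 1) and a VGA_WIDTH of 80 are assumptions based on the standard VGA text mode; the original constants aren't shown in this post.

```python
def make_vgaentry(c, fg, bg):
    """Python port of the Rust function above: character low, colours high."""
    color = fg | (bg << 4)
    return c | (color << 8)

def cell_offset(x, y, width=80):
    """Byte offset of column x, row y, as computed in putchar."""
    return y * width * 2 + x * 2

WHITE, BLUE = 15, 1  # assumption: standard VGA palette numbers

entry = make_vgaentry(ord("A"), WHITE, BLUE)
print(hex(entry))         # 0x1f41: colour byte 0x1f, then 'A' (0x41)
print(cell_offset(0, 1))  # 160: the second row starts 80 cells * 2 bytes in
```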

I found this part pretty approachable and it didn’t take too long. Which isn’t to say that I didn’t have problems! I had SO MANY PROBLEMS. Most of my problems were to do with arrays and strings and iterating over strings. Here’s some code that caused me much pain:

pub fn write(s: &str) {
    let bytes : &[u8] = as_bytes(s);
    for b in super::core::slice::iter(bytes) {
        putc(*b);
    }
}

This code looks simple! It is a lie. Friends. Here were some questions that I needed to ask to write this code.

  • How do I turn a string into a byte array? (as_bytes())
  • What is the type of a byte array? (&[u8])
  • How do I iterate over a byte array? (+ “it still doesn’t work!”, 4 times)

Also, what is this super::core::slice::iter business? This brings us to a fairly long digression, and an important point.

Why you can’t write a kernel in Python

So you want to write an operating system, let’s say for x86. You need to write this in a programming language!

Can you write your operating system in Python (using CPython, say)? You cannot. This is not being curmudgeonly! It is actually just not possible.

What happens when you write print "Hello!" in Python?

Well, many things happen. But the last thing that happens is that the CPython interpreter will do something like printf("Hello"). And you might think, well, maybe I could link against the code for printf somehow!

But what printf does is it calls the write() system call. The write() system call is implemented IN YOUR KERNEL.

OH WAIT YOU DON’T HAVE A KERNEL YET. YOU ARE WRITING ONE.

This also means that you can’t write a kernel as a “normal” C program which includes C libraries. Any C libraries. All C libraries for Linux are built on top of some version of libc, which makes calls to the Linux kernel! So if you’re writing a kernel, this doesn’t work.

Why you can write a kernel in Rust

Writing Rust code has many of the same problems, of course! By default, if you compile a Rust program with a print statement, it will call your kernel’s equivalent to write.

But! Unlike with Python, you can put #[no_std] at the beginning of your Rust program.

You lose a lot! You can no longer

  • allocate memory
  • do threading
  • print anything
  • many many more things

It’s still totally fine to define functions and make calculations, though. And you can of course define your own functions to allocate memory.

You also lose things like Rust’s iterators, which is sad!

rust-core

rust-core is “a standard library for Rust with freestanding support”. What this means is that if you’re writing an OS, rust-core will provide you with all kinds of helpful data structures and functions that you lost when you wrote #[no_std].

I found using this library pretty confusing, but the author hangs out in IRC all the time and was really friendly to me, so it wasn’t a huge problem.

So back to super::core::slice::iter! This says “iterate over this using an iteration function from rust-core”.

Step 4: keyboard interrupts!

So it took me a few days to learn how to print because I needed to learn about freestanding mode and get confused about rust-core and at the same time I didn’t really understand Rust’s types very well.

Once that was done, I wanted to be able to do the following:

  1. Press a key (‘j’ for example)
  2. Have that letter appear on the screen.

I thought this wouldn’t be too hard. I was pretty wrong.

I wrote about what went wrong in After 5 days, my OS doesn’t crash when I press a key.

It lists all my traumas in excruciating detail and I won’t repeat them here. Go read it. It’s kinda worth it. I’ll wait.

Step 5: malloc!

After I’d done that, I thought it might be fun to be able to allocate memory.

You may be surprised at this point. We have printed strings! We have made our keyboard work! Didn’t we need to allocate memory? Isn’t that… important?

It turns out that you can get away without doing it pretty easily! Rust would automatically create variables on the stack for me, so I could use local variables. And for anything else I could use global variables, and the space for those was laid out at compile time.

But allocating memory seemed like a fun exercise. To allocate something on the heap in Rust, you can do

let a = ~2

This creates a pointer to a 2 on the heap. Of course, we talked before about how there is no malloc! So I wrote one, and then made sure that Rust knew about it.

You can see the malloc function I wrote in Writing malloc wrong, for fun.
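If you have never seen a tiny allocator, here is a rough sketch of one common approach, a "bump" allocator. To be clear, this is not the malloc from that post: it is an illustration in present-day Rust, and `BumpAllocator`, `HEAP_SIZE`, and the 1024-byte size are all assumptions of mine. It hands out chunks of a fixed array and never takes them back.

```rust
// A hypothetical bump allocator: hand out memory from a fixed array and never
// take it back. Names and sizes here are made up for illustration.
pub const HEAP_SIZE: usize = 1024;

pub struct BumpAllocator {
    heap: [u8; HEAP_SIZE],
    next: usize, // offset of the first unused byte in `heap`
}

impl BumpAllocator {
    pub const fn new() -> Self {
        BumpAllocator { heap: [0; HEAP_SIZE], next: 0 }
    }

    /// Returns a pointer to `size` fresh bytes aligned to `align`
    /// (which must be a nonzero power of two), or None when the heap is used up.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<*mut u8> {
        let base = self.heap.as_mut_ptr();
        // Round the next free address up to the requested alignment.
        let addr = base as usize + self.next;
        let aligned = addr.checked_add(align - 1)? & !(align - 1);
        let offset = aligned - base as usize;
        let end = offset.checked_add(size)?;
        if end > HEAP_SIZE {
            return None;
        }
        self.next = end;
        // wrapping_add keeps this function free of `unsafe`; the bounds check
        // above guarantees the pointer does not run past the end of `heap`.
        Some(base.wrapping_add(offset))
    }
}
```

The pointers are only valid while the `BumpAllocator` stays where it is, so in a kernel it would probably live in a global. There is no `free` at all, which is the whole trick: allocation is just moving one number forward.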

The hardest parts of this were not writing the function, but

  • getting the type right
  • understanding how Rust’s language features can be turned on and off.

WHAT DO YOU MEAN TURNED ON AND OFF, you may ask!

So in rust-core, if you go to heap.rs, you’ll see this code:

#[lang = "exchange_malloc"]
pub unsafe fn alloc(size: uint) -> *mut u8 {
    if size == 0 {
        0 as *mut u8
    } else {
        let ptr = malloc(size);
        if ptr == 0 as *mut u8 {
            out_of_memory()
        }
        ptr
    }
}

This weird-looking #[lang = "exchange_malloc"] bit means “Code like let x = ~2 is now allowed to work”. It requires there to be an implementation of malloc, which I wrote. It also needs implementations of realloc and free, but I left those blank :)

Until the compiler has seen that, Rust will not compile code that allocates memory.

I think this language feature gating is really cool: it means that you can write Rust programs that can allocate memory, but not do threading. Or that can do hardly anything at all!
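The #[lang = "exchange_malloc"] attribute above is from the Rust of this post. As far as I know, the closest thing in present-day Rust is registering an allocator with `#[global_allocator]`, so here is a small sketch of that idea. It assumes an ordinary hosted program (so it just forwards to the system allocator instead of a hand-written malloc), and `CountingAlloc` and `ALLOCATIONS` are hypothetical names of mine.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical allocator that counts allocations and otherwise defers to the
// system allocator. In a kernel, the calls to `System` would be replaced by
// your own malloc and free.
pub struct CountingAlloc;

pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This line is what tells Rust "heap allocations go through this allocator".
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```

With that in place, something like `Box::new(2)` ends up calling `CountingAlloc::alloc`, which is roughly the role `alloc` plays for `~2` in the rust-core snippet.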

I need to get up now.

Next up: running problems! AND SOMETHING IS ERASING MY PROGRAM WHILE IT IS RUNNING.

Debugging shared library problems with strace

It’s official. I have a love affair with strace.

So strace is this Linux command that shows you what system calls a program calls.

This doesn’t sound so useful until you find out that it is useful FOR EVERYTHING. Seriously. strace is like an immersion blender. I use strace more than my immersion blender.

Previously we have used strace to find out how killall works, spy on ssh, avoid reading Ruby code, and more.

So today I was trying to install the IRuby notebook. But my version of libzmq was wrong! So I upgraded it. But it was STILL WRONG. Why? WHY?

So I thought, I will get strace to tell me which shared libraries are being loaded! strace will never lie to me. Here’s how to do that:

strace -f -o /tmp/iruby_problems ~/clones/iruby/bin/iruby notebook
grep libzmq.so /tmp/iruby_problems | grep -v ENOENT

The grep -v ENOENT is because it looks everywhere in my LD_LIBRARY_PATH so it fails to find libzmq a bunch of times. This reveals the following two system calls:

28863 open("/opt/anaconda/lib/python2.7/site-packages/zmq/utils/../../../../libzmq.so.3", O_RDONLY|O_CLOEXEC) = 9
28910 open("/usr/lib/libzmq.so", O_RDONLY|O_CLOEXEC) = 9

AH HA. The first libzmq is the right version (libzmq.so.3), but the second one is all wrong! It is libzmq1 and it is a disaster and a disgrace. I did sudo apt-get remove libzmq1 and the offending libzmq was banished from my system.
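If you wanted to do more with the strace output than grep allows, the same filter is easy to sketch as a function. This is just an illustration, and `successful_opens` is a name I made up: it keeps the lines that mention a library name and drops the ones that failed with ENOENT, exactly like the two greps above.

```rust
/// Keeps the lines of saved strace output that mention `needle` and did not
/// fail with ENOENT. Equivalent to `grep needle | grep -v ENOENT`.
/// `successful_opens` is a hypothetical helper name.
pub fn successful_opens<'a>(trace: &'a str, needle: &str) -> Vec<&'a str> {
    trace
        .lines()
        .filter(|line| line.contains(needle) && !line.contains("ENOENT"))
        .collect()
}
```

You would read /tmp/iruby_problems into a string, call `successful_opens(&contents, "libzmq.so")`, and print whatever comes back.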

Thanks, strace :)

Hacker School’s Secret Strategy for Being Super Productive (or: Help.)

(this was originally called “Help”, but instead we’re being UpWorthy today)

At Hacker School, people who are new to programming learn incredibly fast. Hacker Schoolers learn Clojure and Scala and Erlang and Python and Ruby and Haskell and web programming and sockets. They write compilers and BitTorrent clients and generate music and create new programming languages and make games. At Hacker School, people get dramatically better at programming. It’s almost a magical environment, and there are many reasons that it’s like this.

But I think one of the most important things is this:

You can always get help. Everyone takes responsibility for helping everyone else.

So, a few ways to think about help in your community, workplace, or project:

Helping saves everyone’s time.

If you have a question which will take you 2 hours to answer on your own, and 20 minutes for someone to help you with, then getting help costs 40 minutes in total (20 of yours and 20 of theirs) instead of 120. That person helping you saves 80 minutes of someone being frustrated.

Math!

Helping isn’t handholding.

There’s this aphorism “Give someone a fish, and they’ll eat for a day. Teach them to fish and they’ll eat for the rest of their life.”

Note that this says “teach them to fish”, not “give them a disassembled fishing rod and a manual and a map and tell them that it’s all there”. I see the latter pretty often in the world of open source, and people defend it by saying that they can’t hold everyone’s hand. There’s something in between.

Asking questions is efficient and responsible.

Asking questions at work can be scary. However! If you ask a question that saves 6 hours of your time and takes someone 30 minutes to answer, that’s an amazing use of time. I think of it as my responsibility to ask questions like this.

It needs to be okay to ask questions.

This is so important. If somebody asks you a basic question and you make fun of them or act super surprised that they don’t already know, they’re going to ask fewer questions. And then they’re going to get less stuff done.

If someone asks you “hey, who’s Nelson Mandela?”, an inappropriate answer is “oh you don’t know?!! He’s so important!”. An appropriate answer would be “He was a South African anti-apartheid revolutionary…”.

At Hacker School there’s a huge emphasis on not acting surprised when people ask questions that might seem basic, and so people feel safer asking questions when they’re stuck.

Help turns self-directed and autonomous people into superheroes

There’s a notion sometimes that people who can learn on their own don’t need any help at all – they’ll figure it out!

And they kind of will! But it will be slow and painful and inefficient.

What I saw at Hacker School was that the amazing support that was available turned self-directed and autonomous people into superheroes. I got things done much more quickly, learned much faster, and did things I absolutely wouldn’t have been able to do otherwise.

Having 65 people in a room with you who are all willing to help you get unstuck is invaluable.

Helping people is doing work.

Sometimes I hear people in work environments say “I don’t have time, I have work to do!”

It’s important to think of supporting people and answering questions as a core part of your work, not something tangential. For everyone. It saves everyone’s time. It makes your team more efficient.

Answering questions is also an amazing way to learn. I often find that I don’t understand things as well as I thought I did.

You have time to help people.

I’d also like to address the “I don’t have time” point with a concrete suggestion.

Everyone has things that they do when they’re stuck on something or need a short break (check Facebook, go read Twitter, whatever). At Hacker School when I was stuck, I’d often go on our internal chat system and answer questions!

As far as I could tell everyone else also did this. The result was that you could get answers to your questions super quickly.

A thought experiment

What if everybody asked questions when they needed help?

What if helping people was everyone’s default procrastination method?