Investigating Erlang by reading its system calls
I was helping debug a performance problem (this networking puzzle) in an Erlang program yesterday. I learned that Erlang is complicated, and that we can learn maybe 2 things about it by just looking at what system calls it’s running.
Now – I have never written an Erlang program and don’t really know anything about Erlang, so “Erlang seems complicated” isn’t meant as a criticism so much as an observation and something I don’t really understand. When I’m debugging a program, whether I know the programming language it’s written in or not, I often use strace to see what system calls it runs. In my few experiments so far, the Erlang virtual machine runs a TON of system calls and I’m not sure exactly what it’s doing. Here are some experimental results.
I write 4 programs: hello.c, hello,java, hello.erl, and hello.py. Here they are.
#include <stdio.h>
int main() {
printf("hello!\n");
}
class Hello {
public static void main(String[] args) {
System.out.println("hello!");
}
}
-module(hello).
-export([hello_world/0]).
hello_world() ->
io:fwrite("Hello, world!\n").⏎
print "hello"
Here are the number of system calls each of these programs made: (you can see the full strace output here). You can generate this yourself with, for instance, strace -f -o python.strace python hello.py
wc -l *.strace
38 c.strace
1550 python.strace
2699 java.strace
15043 erlang.strace
Unsurprisingly, C comes in at the least. I was surprised that the Erlang VM runs 6 times as many system calls as Java – I think of Java as already being pretty heavyweight. Maybe this is because Erlang starts up processes on all my cores? The variety of system calls is also interesting to see: I put the system call frequencies in a gist too.
When you look at the system call frequencies, you can see that Erlang is running significantly different kinds of system calls than Java and Python and C. Those 3 languages are mostly doing open
, read
, lseek
, stat
, mmap
, mprotect
, fstat
– all activities around reading a bunch of files & allocating memory, which is what I think of as normal behavior when starting a program.
The top 2 syscalls for the Erlang process are futex
and sched_yield
. So there’s a lot of synchronization happening (the futex), and the operating system threads Erlang starts up keep scheduling themselves off the CPU “ok, I’m done, you go!”. There are also a lot of mysterious-to-me ppoll
system calls. So Erlang seems like a programming language with really significantly different primitives.
This highly concurrent behavior is consistent with what Wikipedia article says:
Erlang’s main strength is support for concurrency. It has a small but powerful set of primitives to create processes and communicate among them.
Let’s look a little more carefully at these ppoll
system calls for a second. The story starts with
8682 openat(AT_FDCWD, "/sys/devices/system/node/node0", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
8703 ppoll([{fd=4, events=POLLIN|POLLRDNORM}, {fd=0, events=POLLIN|POLLRDNORM}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
I have no idea what /sys/devices/system/node/node0
is, but it seems to be a directory and what ppoll
is looking for changes to? I don’t really get this at all.
One last thing – erlang runs bind
once when it starts. Why does it need to listen on a TCP socket to run hello world? I was very confused about this and unable to figure it out. Some people on twitter thought it might have something to do with epmd
, but epmd
seems to be a separate process. So I don’t know what’s going on.
<3 operating systems
I wanted to write this down because, as you all very well know, I think it’s interesting to take an operating systems-level approach to understanding what a program is doing and I thought this was a cool example of that.
I had this interesting experience yesterday where I was looking at this Erlang problem with Victor and David and they had OS X machines and I was like “dude I can’t debug anything on OS X”. So we got it working on my laptop and then I could make a lot more progress. Because now I’m pretty good at OS-level debugging tools, and I’ve spent a lot of time learning about Linux, and so I’m not super comfortable on non-Linux systems. (I know, I know, dtrace is amazing, I’m going to learn it one day soon, I promise :) )