What happens when you run 'Hello, world'
So today I experimented with a new way of learning – I wanted to understand what happens when I run a “hello world” program, but I wasn’t at Hacker School. So I wrote down my current understanding and a bunch of questions and asked Twitter!
if anyone has too much time and operating system knowledge, I'd love comments and "well, actually"s on https://t.co/YqSyV5ap4Q
— Julia Evans (@b0rk) November 28, 2013
People left me tons of helpful comments in the gist, which made me really happy.
I’m not going to reprise all of the discussion here, but here’s an incomplete summary of what needs to happen when a kernel runs an executable. If you’re interested, definitely check out the gist.
The question: If I were an OS, what would I need to do to run “Hello, world?”
The original program was
#include <stdio.h>
int main() {
    printf("Hello world!\n");
}
and I statically compiled it by running gcc -static -o hello hello.c. So we don’t have to worry about dynamic linking or anything. (I very much enjoyed this guide to linkers, tangentially.)
Step 0: Simplify the program a bit
The first suggestion I got was to make it a bit easier by using write() instead of printf().
Running strace ./hello tells me all the system calls that happen, including the write() system call:
write(1, "Hello world!\n", 13)
So we can simplify this program down to
int main() {
    write(1, "Hello world!\n", 13);
}
which removes the #include and some of the system calls. printf() is a pretty complicated function, so it’s better to not use it.
Now we can get down to the actual business of describing what happens when the program executes! These are not in any particular order.
Load the code (“text”) into memory
In the binary there are a bunch of machine instructions (the compiled form of the assembly). These need to be loaded into memory.
Load the data segment into memory
A program might also have initialized and uninitialized global variables. These need a place in memory.
I’d need to zero out the BSS (the region that holds the uninitialized globals) here for sure.
Set up the heap and stack
Programs need a heap and a stack.
Once these three things are done, we have the program’s “address space” in memory. This looks something like this (thanks to @danellis for the diagram!)
+---------------+
|     Stack     |
|       |       |
|       v       |
+---------------+
:               :
+---------------+
|       ^       |
|       |       |
|     Heap      |
+---------------+
|     Data      |
+---------------+
|     Code      |
+---------------+
I’m still really not sure about the details of what this setup looks like. People talk a lot about virtual memory, and I don’t know how I would implement that at all, or if I would have to implement it.
Handle system calls
User space programs interact with the kernel through “system calls”.
If I run strace -o hello.out ./hello, I get this list of all the system calls that happen when running ./hello:
execve("./hello2", ["./hello2"], [/* 59 vars */]) = 0
uname({sys="Linux", node="kiwi", ...}) = 0
brk(0) = 0xca9000
brk(0xcaa1c0) = 0xcaa1c0
arch_prctl(ARCH_SET_FS, 0xca9880) = 0
brk(0xccb1c0) = 0xccb1c0
brk(0xccc000) = 0xccc000
write(1, "Hello world!\n", 13) = 13
exit_group(13) = ?
I don’t think I have to worry about the first two system calls, since the first one is definitely called by my shell.
The brk system call is about moving the “program break” to allocate memory. I’m not totally sure why it needs to allocate memory, but it does.
The write system call I definitely feel like I could handle. I found an example on the OSDev wiki of how to write to a VGA buffer, so that could work.
I’m guessing exit_group is about quitting the program, so I’d have to do some cleanup or something. I have no idea what arch_prctl is.
I’m hoping to actually do some of this in the coming week at Hacker School. I’ve been pointed to the OSDev wiki which has all kinds of fantastic explanations and tutorials.