I used to think that executables were totally impenetrable. I’d compile a C program, and then that was it! I had a Magical Binary Executable that I could no longer read.
It is not so! Executable file formats are regular file formats that you can understand. I’ll explain some simple tools to start! We’ll be working on Linux, with ELF binaries. (Binaries are kind of the definition of platform-specific, so this is all platform-specific.) We’ll be using C, but you could just as easily look at output from any compiled language.
Let’s write a simple C program, `hello.c`.
Then we compile it (`gcc -o hello hello.c`), and we have a binary called `hello`. This originally seems impenetrable (how do we even binary?!), but let’s see how we can investigate it! We’re going to learn what symbols, sections, and segments are. At a high level:
- symbols are like function names, and are used to answer “If I call `printf` and it’s defined somewhere else, how do I find it?”
- symbols are organized into sections – code lives in one section (`.text`), and data in another
- sections are organized into segments