Yesterday I ran into a weird error where I ran a program and got the error “file not found” even though the program I was running existed. It’s something I’ve run into before, but every time I’m very surprised and confused by it (what do you MEAN file not found, the file is RIGHT THERE???!!??)
So let’s talk about what happened and why!
Let’s start by showing the error message I got. I had a Go program called serve.go, and I was trying to bundle it into a Docker container with this Dockerfile:
FROM golang:1.17 AS go ADD ./serve.go /app/serve.go WORKDIR /app RUN go build serve.go FROM alpine:3.14 COPY --from=go /app/serve /app/serve COPY ./static /app/static WORKDIR /app/static CMD ["/app/serve"]
- Builds the Go program
- Copies the binary into an Alpine container
Pretty simple. Seems like it should work, right?
But when I try to run
/app/serve, this happens:
$ docker build . $ docker run -it broken-container:latest /app/serve standard_init_linux.go:228: exec user process caused: no such file or directory
But the file definitely does exist:
$ docker run -it broken-container:latest ls -l /app/serve -rwxr-xr-x 1 root root 6220237 Nov 16 13:27 /app/serve
So what’s going on?
idea 1: permissions
At first I thought “hmm, maybe the permissions are wrong?”. But this can’t be the problem, because:
- permission problems don’t result in a “no such file or directory” error
- in any case when we ran
ls -l, we saw that the file was executable
(I’m including this even though it’s “obviously” wrong just because I have a lot of wrong thoughts when debugging, it’s part of the process :) )
idea 2: strace
Then I decided to use strace, as always. Let’s see what stracing
/app/serve/ looks like
$ docker run -it broken-container:latest /bin/sh $ /app/static # apk add strace (apk output omitted) $ /app/static # strace /app/serve execve("/app/serve", ["/app/serve"], 0x7ffdd08edd50 /* 6 vars */) = -1 ENOENT (No such file or directory) strace: exec: No such file or directory +++ exited with 1 +++
This is not that helpful, it just says “No such file or directory” again. But
at least we know that the error is being thrown right away when we run the
evecve system call, so that’s good.
Interestingly though, this is different from what happens when we try to strace a nonexistent binary:
$ strace /app/asdf strace: Can't stat '/app/asdf': No such file or directory
idea 3: google “enoent but file exists execve”
I vaguely remembered that there was some reason you could get an
when executing a program even if the file did exist, so I googled it. This led me
to this stack overflow answer
which said, very helpfully:
When execve() returns the error ENOENT, it can mean more than one thing:
1. the program doesn’t exist;
2. the program itself exists, but it requires an “interpreter” that doesn’t exist.
ELF executables can request to be loaded by another program, in a way very similar to
#!/bin/somethingin shell scripts.
That answer says that we can find the interpreter with
readelf -l $PROGRAM | grep interpreter. So let’s do that!
step 4: use
I didn’t have
readelf installed in the container and I wasn’t sure how to
install it, so I ran
mount to get the path to the container’s filesystem and
readelf from the host using that overlay directory.
(as an aside: this is kind of a weird way to do this, but as a result of writing a containers zine I’m used to doing weird things with containers and I think doing weird things is fun, so this way just seemed fastest to me at the time. That trick won’t work if you’re on a Mac though, it only works on Linux)
$ mount | grep docker overlay on /var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/merged type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay2/l/326ILTM2UXMVY64V7JFPCSDSKG:/var/lib/docker/overlay2/l/MGGPR357UOZZWXH3SH2AYHJL3E:/var/lib/docker/overlay2/l/EEEKSBSQ6VHGJ77YF224TBVMNV:/var/lib/docker/overlay2/l/RVKU36SQ3PXEQAGBRKSQRZFDGY,upperdir=/var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/diff,workdir=/var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/work,index=off) $ # (then I copy and paste the "merged" directory from the output) $ readelf -l /var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/merged/app/serve | grep interp [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2] 01 .interp 03 .text .plt .interp .note.go.buildid
Okay, so the interpreter is
And sure enough, that file doesn’t exist inside our Alpine container
$ docker run -it broken-container:latest ls /lib64/ld-linux-x86-64.so.2
step 5: victory!
Then I googled a little more and found out that there’s a
container that’s meant for doing Go builds targeted to be run in Alpine.
I switched to doing my build in the
golang:alpine container and that fixed
question: why is my Go binary dynamically linked?
The problem was with the program’s interpreter. But I remembered that only dynamically linked programs have interpreters, which is a bit weird – I expected my Go binary to be statically linked! What’s going on with that?
First, I double checked that the Go binary was actually dynamically linked using
ldd lists the dependencies of a dynamically linked executable! it’s very useful!)
(I’m using the docker overlay filesystem to get at the binary inside the container again)
$ file /var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/merged/app/serve /var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/merged/app/serve: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=vd_DJvcyItRi4Q2RD0WL/z8P4ulttr6F6njfqx8CI/_odQWaUTR2e38bdHlD0-/ikjsOjlMbEOhj2qXv5AE, not stripped $ ldd /var/lib/docker/overlay2/1ed587b302af7d3182135d02257f261fd491b7acf4648736d4c72f8382ecba0d/merged/app/serve linux-vdso.so.1 (0x00007ffe095a6000) libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f565a265000) libc.so.6 => /usr/lib/libc.so.6 (0x00007f565a099000) /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f565a2b4000)
Now that I know it’s dynamically linked, it’s not that surprising that it didn’t work on a different system than it was compiled on.
Some Googling tells me that I can get Go to produce a statically linked binary by setting
CGO_ENABLED=0. Let’s see if that works.
$ # first let's build it without that flag $ go build serve.go $ file ./serve ./serve: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=UGBmnMfFsuwMky4-k2Mt/RaNGsMI79eYC4-dcIiP4/J7v5rNGo3sNiJqdgNR12/eR_7mqqrsil_Lr6vt-rP, not stripped $ ldd ./serve linux-vdso.so.1 (0x00007fff679a6000) libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f659cb61000) libc.so.6 => /usr/lib/libc.so.6 (0x00007f659c995000) /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f659cbb0000) $ # and now with the CGO_ENABLED_0 flag $ env CGO_ENABLED=0 go build serve.go $ file ./serve ./serve: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=Kq392IB01ShfNVP5TugF/2q5hN74m5eLgfuzTZzR-/EatgRjlx5YYbpcroiE9q/0Fg3zUxJKY3lbsZ9Ufda, not stripped $ ldd ./serve not a dynamic executable
It works! I checked, and that’s an alternative way to fix this bug – if I just set the
CGO_ENABLED=0 environment variable in my build container, then I can build a
static binary and I don’t need to switch to the
golang:alpine container for my builds. I
kind of like that fix better.
And statically linking in this case doesn’t even produce a bigger binary (for some reason it seems to produce a slightly smaller binary?? I don’t know why that is)
I still don’t understand why it’s using cgo here, I ran
env | grep CGO and I
definitely don’t have
CGO_ENABLED=1 set in my environment, but I
don’t feel like solving that mystery right now.
that was a fun bug!
I thought this bug was a nice way to see how you can run into problems when compiling a dynamically linked executable on one platform and running it on another one! And to learn about the fact that ELF files have an interpreter!
I’ve run into this “file not found” error a couple of times, and it feels kind of mind bending because it initially seems impossible (BUT THE FILE IS THERE!!! I SEE IT!!!). I hope this helps someone be less confused if you run into it!