Hello! Here are some things you may or may not have noticed about DNS:
- when you resolve a DNS name in a Python program, it checks /etc/hosts, but when you use dig, it doesn’t.
- switching Linux distributions can sometimes change how your DNS works. For example, using Alpine Linux instead of Ubuntu can cause problems.
- Mac OS has DNS caching, but Linux doesn’t necessarily unless you use something like systemd-resolved.
To understand all of these, we need to learn about a function called
getaddrinfo which is responsible for doing DNS lookups.
There are a bunch of surprising-to-me things about
getaddrinfo, and once I
learned about them, it explained a bunch of the confusing DNS behaviour I’d
seen in the past.
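To make this a bit more concrete, here is a minimal sketch of what calling getaddrinfo looks like from Python, through socket.getaddrinfo. The resolve helper is a name made up for this illustration, not something from a library.

```python
import socket


def resolve(host, port=80):
    """Return the sorted, de-duplicated IP addresses getaddrinfo reports for host."""
    results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr), and the
    # IP address is always the first element of sockaddr.
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in results})
```

Calling resolve("localhost") is a reasonable first experiment: on many systems that name is answered from /etc/hosts rather than by a DNS server, which is one of the things dig would not do.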
where does getaddrinfo come from?
getaddrinfo is part of a library called
libc which is the standard C
library. There are at least 3 versions of libc:
- glibc (GNU libc)
- musl libc
- the Mac OS version of libc (I don’t know if this has a name)
There are definitely more (I assume FreeBSD and OpenBSD each have their own version for example), but those are the 3 I know about.
Each of those has its own version of getaddrinfo.
not all programs use getaddrinfo for DNS
The first thing I found surprising is that
getaddrinfo is very widely used
but not universally used.
Every program has basically 2 options:
- use getaddrinfo. I think that Python, Ruby, and Node use getaddrinfo, as well as Go sometimes. Probably many more languages too, but I did not have the time to go hunting through every language’s DNS library.
- use a custom DNS resolver function. Examples of this:
  - dig. I think this is because dig needs more control over the DNS query than getaddrinfo supports, so it implements its own DNS logic.
  - Go also has a pure-Go DNS resolver if you don’t want to use CGo
  - There’s a Ruby gem with a custom DNS resolver that you can use to replace getaddrinfo
  - getaddrinfo doesn’t support DNS over HTTPS, so I assume that browsers that use DoH are not using getaddrinfo for those DNS lookups
  - probably lots more that I’m not aware of
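To give a rough sense of what "implementing your own DNS logic" involves, here is a small sketch that builds a DNS query packet by hand and sends it over UDP, without going through getaddrinfo. This is not how dig or any of the tools above actually do it. The names build_query and send_query are made up for illustration, the server address is something you have to supply yourself, and the response is returned as raw bytes without being parsed.

```python
import socket
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A record) for name."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, ending with a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    qname += b"\x00"
    # Question footer: query type, then class 1 (IN, "internet").
    return header + qname + struct.pack("!HH", qtype, 1)


def send_query(name, server, timeout=2.0):
    """Send the query to server on UDP port 53 and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
        return response
```

Notice that nothing in this code looks at /etc/hosts or any other system configuration, which is the kind of difference you get when a program skips getaddrinfo.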
you’ll sometimes see getaddrinfo in your DNS error messages
Because getaddrinfo is so widely used, you’ll often see it in error messages related to DNS.
For example, if I run this Python program, which looks up a nonexistent domain name:
    import requests
    requests.get("http://xyxqqx.com")
I get this error message:
    Traceback (most recent call last):
      File "/usr/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
        conn = connection.create_connection(
      File "/usr/lib/python3.10/site-packages/urllib3/util/connection.py", line 72, in create_connection
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
      File "/usr/lib/python3.10/socket.py", line 955, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -2] Name or service not known
socket.getaddrinfo is calling libc
getaddrinfo somewhere under the
hood, though I did not read all of the source code to check.
Before you learn what
getaddrinfo is, it’s not at all obvious that
socket.gaierror: [Errno -2] Name or service not known means “that domain
doesn’t exist”. It doesn’t even say the words “DNS” or “domain” in it.
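If you want friendlier errors in your own code, one option is to catch socket.gaierror and translate it. Here is a minimal sketch of that idea. The function names and the wording of the messages are made up for illustration, and only two error codes are covered.

```python
import socket

# Hypothetical, hand-written explanations for a couple of getaddrinfo error codes.
FRIENDLY_MESSAGES = {
    socket.EAI_NONAME: "that domain name could not be found",
    socket.EAI_AGAIN: "the DNS lookup failed temporarily, try again later",
}


def explain_dns_error(host, error):
    """Turn a socket.gaierror into a message that actually mentions DNS."""
    # Fall back to the original libc message for codes we have no text for.
    reason = FRIENDLY_MESSAGES.get(error.errno, error.strerror)
    return f"DNS lookup for {host!r} failed: {reason}"


def lookup(host, port=80):
    """Resolve host, raising a more readable error if getaddrinfo fails."""
    try:
        return socket.getaddrinfo(host, port)
    except socket.gaierror as error:
        raise RuntimeError(explain_dns_error(host, error)) from error
```

The numeric error codes come from libc (on glibc, -2 is EAI_NONAME), so comparing against the socket.EAI_* constants is safer than hardcoding numbers.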
getaddrinfo on Mac doesn’t use /etc/resolv.conf
I used to use a Mac for work, and I always felt vaguely unsettled by DNS on Mac. I could tell that something was different from how it worked on my Linux machine, but I couldn’t figure out what it was.
I still don’t totally understand this, and it’s hard for me to investigate because I don’t currently have access to a Mac, but here’s what I’ve gathered so far.
On Linux systems, getaddrinfo decides which DNS resolver to talk to using a file called /etc/resolv.conf. (There’s apparently some additional complexity involving /etc/nsswitch.conf, but I have never looked at /etc/nsswitch.conf so I’m going to ignore it.)
For example, this is the contents of my
/etc/resolv.conf right now:
    # Generated by NetworkManager
    nameserver 192.168.1.1
    nameserver fd13:d987:748a::1
This means that to make DNS queries,
getaddrinfo makes a request to
192.168.1.1 on port 53. That’s my router’s DNS resolver.
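To show how little there is to the nameserver part of this file, here is a small sketch that pulls the nameserver addresses out of resolv.conf-style text. parse_nameservers is a made-up helper, and it ignores everything else the file can contain (like search domains and options), so it is a simplification rather than what libc really does.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text, in order."""
    nameservers = []
    for line in resolv_conf_text.splitlines():
        # Lines starting with '#' or ';' are comments.
        if line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            nameservers.append(fields[1])
    return nameservers
```

To try it on a real machine, you could pass it the contents of the file, for example parse_nameservers(open("/etc/resolv.conf").read()).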
I assumed that getaddrinfo on Mac also just used /etc/resolv.conf, but I was wrong.
Instead, getaddrinfo on Mac makes a request to a program called mDNSResponder, which is a Mac thing.
I don’t know much about
mDNSResponder except that it does DNS caching and
that apparently you can clear the cache with
dscacheutil. This explains one
of the mysteries at the beginning of the post – why Macs have DNS caching and
Linux machines don’t always.
musl libc’s getaddrinfo is different from glibc’s version
You might think ok, Mac OS
getaddrinfo is different, but the two versions of
getaddrinfo in glibc and musl libc must be mostly the same, right?
But they have some pretty significant differences. The main difference I know about is that musl libc did not support TCP DNS for a long time (TCP fallback was only added in musl 1.2.4). I couldn’t find anything in the documentation about it, but it’s mentioned in this tweet.
I talked a bit more about this TCP DNS thing in ways DNS can break.
Some more differences:
- the way search domains (in
/etc/resolv.conf) are handled is slightly different (discussed here)
- this post mentions that musl doesn’t support nsswitch.conf. I have never used nsswitch.conf and I’m not sure why it’s useful but I think there are reasons I don’t know about.
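Since the libcs behave differently, it can help to know which one a program is actually running against. Here is a minimal sketch using Python’s platform.libc_ver(). The describe_libc name is made up, and the detection is best-effort: an empty result only means Python could not identify the libc, and my assumption (not something I have verified everywhere) is that this is what you tend to get on musl and on Mac OS.

```python
import platform


def describe_libc():
    """Report the libc that the running Python interpreter appears to use."""
    # platform.libc_ver() returns a (name, version) pair; both are empty
    # strings when it cannot identify the libc.
    name, version = platform.libc_ver()
    if name:
        return f"{name} {version}".strip()
    return "unknown"
```

On a glibc-based distribution like Ubuntu this should report glibc along with a version number.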
more weird things: nscd?
When looking up getaddrinfo I also found this interesting post about getaddrinfo from James Fisher that digs into getaddrinfo and discovers that it apparently calls some program called nscd, which is supposed to do DNS caching. That blog post
describes nscd as “unstable” and “badly designed” and it’s not clear to me how
widely used it is.
I don’t know anything about nscd but I checked and apparently it’s on my computer. I tried it out and this is what happened:
    $ nscd
    child exited with status 4
My impression is that people who want to do DNS caching on Linux are more
likely to use a DNS forwarder like
systemd-resolved instead of
nscd – that’s what I’ve seen in the past.
When I first learned about all of this I found it really surprising that such a widely used library function has such different behaviour on different platforms.
I mean, it makes sense that the people who built Mac OS would want to handle
DNS caching in a different way than it’s handled on Linux, so it’s reasonable
that they implemented
getaddrinfo differently. And it makes sense that some
programs choose not to use
getaddrinfo to make DNS queries.
But it definitely makes DNS a bit more difficult to reason about.