DNS woes in Node.js Alpine Docker images
In gitlab-runner with node:23-alpine:
npm install throws a DNS error, unable to resolve a dependency that references github.com,
but dig and node -e dns.resolve, run just before it, do resolve the address.
wget, however, fails with a DNS error.
If we run the same Dockerfile outside of gitlab-runner, wget succeeds.
Why is this?
> [ 9/13] RUN wget -v https://github.com/lovell/sharp-libvips/releases/download/v8.14.5/libvips-8.14.5-linuxmusl-x64.tar.br:
0.571 --2025-05-24 19:06:34-- https://github.com/lovell/sharp-libvips/releases/download/v8.14.5/libvips-8.14.5-linuxmusl-x64.tar.br
0.588 Resolving github.com (github.com)... failed: Name does not resolve.
0.589 wget: unable to resolve host address 'github.com'
If we check the requests reaching our DNS server on the failing gitlab-runner, we see two:
one for A and one for AAAA. A responds with an IPv4 address; AAAA responds with no data (no IPv6 address resolved).
The same requests happen when we build our Alpine Dockerfile outside of gitlab-runner.
So why does one wget ignore the IPv4 answer while the other does not, given the same Dockerfile?
Perhaps the answer lies in our DNS responses.
On the gitlab-runner, our AAAA query gets an NXDOMAIN status and no Authority section. This tells wget
that the domain github.com does not exist at all (which is false).
#10 [ 6/12] RUN dig github.com AAAA github.com A
#10 0.232
#10 0.232 ; <<>> DiG 9.18.37 <<>> github.com AAAA github.com A
#10 0.232 ;; global options: +cmd
#10 0.244 ;; Got answer:
#10 0.244 ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 50409
#10 0.244 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
#10 0.244
#10 0.244 ;; QUESTION SECTION:
#10 0.244 ;github.com. IN AAAA
#10 0.244
#10 0.244 ;; Query time: 0 msec
#10 0.244 ;; SERVER: 192.168.65.1#53(192.168.65.1) (UDP)
#10 0.244 ;; WHEN: Sat May 24 20:02:34 UTC 2025
#10 0.244 ;; MSG SIZE rcvd: 28
#10 0.244
#10 0.244 ;; Got answer:
#10 0.244 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46500
#10 0.244 ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
#10 0.244
#10 0.244 ;; QUESTION SECTION:
#10 0.244 ;github.com. IN A
#10 0.244
#10 0.244 ;; ANSWER SECTION:
#10 0.244 github.com. 0 IN A 140.82.116.4
#10 0.244 github.com. 0 IN A 140.82.116.4
#10 0.244
#10 0.244 ;; Query time: 0 msec
#10 0.244 ;; SERVER: 192.168.65.1#53(192.168.65.1) (UDP)
#10 0.244 ;; WHEN: Sat May 24 20:02:34 UTC 2025
#10 0.244 ;; MSG SIZE rcvd: 60
#10 0.244
#10 DONE 0.3s
On our local Dockerfile build, the AAAA query returns a NOERROR status and an Authority section containing the SOA record. This tells wget that the domain exists but has no AAAA record.
#10 [6/9] RUN dig github.com AAAA github.com A
#10 0.255
#10 0.255 ; <<>> DiG 9.18.37 <<>> github.com AAAA github.com A
#10 0.255 ;; global options: +cmd
#10 0.282 ;; Got answer:
#10 0.282 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56113
#10 0.282 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
#10 0.282
#10 0.282 ;; OPT PSEUDOSECTION:
#10 0.282 ; EDNS: version: 0, flags:; udp: 1232
#10 0.282 ;; QUESTION SECTION:
#10 0.282 ;github.com. IN AAAA
#10 0.282
#10 0.282 ;; AUTHORITY SECTION:
#10 0.282 github.com. 853 IN SOA dns1.p08.nsone.net. hostmaster.nsone.net. 1656468023 43200 7200 1209600 3600
#10 0.282
#10 0.282 ;; Query time: 0 msec
#10 0.282 ;; SERVER: 192.168.0.100#53(192.168.0.100) (UDP)
#10 0.282 ;; WHEN: Sat May 24 19:57:54 UTC 2025
#10 0.282 ;; MSG SIZE rcvd: 123
#10 0.282
#10 0.282 ;; Got answer:
#10 0.282 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21159
#10 0.282 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
#10 0.282
#10 0.282 ;; OPT PSEUDOSECTION:
#10 0.282 ; EDNS: version: 0, flags:; udp: 512
#10 0.282 ;; QUESTION SECTION:
#10 0.282 ;github.com. IN A
#10 0.282
#10 0.282 ;; ANSWER SECTION:
#10 0.282 github.com. 60 IN A 140.82.116.4
#10 0.282
#10 0.282 ;; Query time: 16 msec
#10 0.282 ;; SERVER: 192.168.0.100#53(192.168.0.100) (UDP)
#10 0.282 ;; WHEN: Sat May 24 19:57:54 UTC 2025
#10 0.282 ;; MSG SIZE rcvd: 55
#10 0.282
#10 DONE 0.3s
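To make the difference between the two logs easier to spot, here is a small Python sketch that reads dig output (BuildKit prefixes such as "#10 0.244" included) and labels each response. It is our own helper, not part of dig or Docker; the function name classify_dig_responses and the label names are assumptions chosen for illustration.

```python
import re

STATUS_RE = re.compile(r"status:\s*(\w+)")
ANSWER_RE = re.compile(r"ANSWER:\s*(\d+)")


def classify_dig_responses(text):
    """Return one label per response found in dig output.

    "ANSWER"   : NOERROR with at least one record
    "NODATA"   : NOERROR with zero records (name exists, no such record type)
    "NXDOMAIN" : the server claims the name does not exist at all
    Any other status is returned unchanged.
    """
    labels = []
    status = None
    for line in text.splitlines():
        if "->>HEADER<<-" in line:
            match = STATUS_RE.search(line)
            status = match.group(1) if match else None
            continue
        match = ANSWER_RE.search(line)
        if match and status is not None:
            if status == "NOERROR":
                labels.append("ANSWER" if int(match.group(1)) > 0 else "NODATA")
            else:
                labels.append(status)
            status = None
    return labels


# Header and flags lines copied from the gitlab-runner log above.
runner_log = """\
#10 0.244 ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 50409
#10 0.244 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
#10 0.244 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46500
#10 0.244 ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
"""

# Header and flags lines copied from the local build log above.
local_log = """\
#10 0.282 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56113
#10 0.282 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
#10 0.282 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21159
#10 0.282 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
"""

print(classify_dig_responses(runner_log))  # ['NXDOMAIN', 'ANSWER']
print(classify_dig_responses(local_log))   # ['NODATA', 'ANSWER']
```

The only difference between the two environments is the label for the AAAA query: NXDOMAIN on the runner, NODATA locally.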
We also notice a different DNS server address on the gitlab-runner: 192.168.65.1.
What if we switch the gitlab-runner build to a non-Alpine base, FROM node:23 AS base? This may help us
isolate the issue to Alpine or to our DNS server.
The DNS results are the same as above: a DNS server at 192.168.65.1 responding with NXDOMAIN. Yet wget resolves the name and
proceeds normally.
We can now deduce that our issue is specifically how Alpine images handle NXDOMAIN responses.
Further research reveals that Alpine uses musl instead of glibc, which changes the default behavior of getaddrinfo(), the function
behind the lookups made by wget and npm (dig and Node's dns.resolve query the DNS server directly and bypass it, which is why they succeeded).
Specifically, when getaddrinfo() is called without hints, glibc behaves as if ai_flags = AI_ADDRCONFIG|AI_V4MAPPED,
while musl behaves as if ai_flags = 0.
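The behavior we deduced above (Alpine giving up on the name when one of the two answers is NXDOMAIN, while the glibc image carries on with the A record) can be modelled in a few lines of Python. This is a simplified sketch of what the logs suggest, not the actual musl or glibc code; merge_strict and merge_lenient are hypothetical names.

```python
NOERROR, NXDOMAIN = 0, 3  # DNS response codes (RFC 1035)


def merge_lenient(responses):
    """glibc-like model: keep every address that any query returned."""
    return [
        addr
        for rcode, addrs in responses
        if rcode == NOERROR
        for addr in addrs
    ]


def merge_strict(responses):
    """musl-like model: a single NXDOMAIN among the parallel A/AAAA
    queries means "this name does not exist", so nothing is returned."""
    if any(rcode == NXDOMAIN for rcode, _ in responses):
        return []
    return merge_lenient(responses)


# (rcode, addresses) pairs for the A and AAAA queries, taken from the logs.
runner = [(NOERROR, ["140.82.116.4"]), (NXDOMAIN, [])]
local = [(NOERROR, ["140.82.116.4"]), (NOERROR, [])]

print(merge_strict(runner))   # []
print(merge_lenient(runner))  # ['140.82.116.4']
print(merge_strict(local))    # ['140.82.116.4']
```

Under this model the strict resolver is arguably the more correct one: NXDOMAIN is a statement about the whole name, not about one record type, so the bad answer comes from the DNS server.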
Well, this seems like a potential rabbit hole. Why don't we change our line of thinking and ask why our DNS server is returning an NXDOMAIN response
without an Authority SOA? If it returned a correct NOERROR with an SOA, we wouldn't have to worry about different getaddrinfo() implementations.
Let's focus on that strange 192.168.65.1 address. Shouldn't this be Docker's locally routed 127.0.0.11 address? It turns out 192.168.65.1 is the address of the Docker engine's DNS server, and the experimental dind-rootless service we are using in GitLab Runner is the culprit!
If we change our GitLab Runner service to the less experimental dind, we find that our DNS is now hardcoded to Google's 8.8.8.8 DNS servers.
What? We've jumped from one hacky hole into another.
Let's skip hardcoded DNS nameservers altogether and add --network=host to our docker buildx build command. Now we get
docker-in-docker without wild west DNS resolution:
#10 [ 6/12] RUN dig github.com AAAA github.com A
#10 0.128
#10 0.128 ; <<>> DiG 9.18.37 <<>> github.com AAAA github.com A
#10 0.128 ;; global options: +cmd
#10 0.140 ;; Got answer:
#10 0.140 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5812
#10 0.140 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
#10 0.140
#10 0.140 ;; OPT PSEUDOSECTION:
#10 0.140 ; EDNS: version: 0, flags:; udp: 1232
#10 0.140 ;; QUESTION SECTION:
#10 0.140 ;github.com. IN AAAA
#10 0.140
#10 0.140 ;; AUTHORITY SECTION:
#10 0.140 github.com. 449 IN SOA dns1.p08.nsone.net. hostmaster.nsone.net. 1656468023 43200 7200 1209600 3600
#10 0.140
#10 0.140 ;; Query time: 0 msec
#10 0.140 ;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
#10 0.140 ;; WHEN: Sat May 24 21:49:38 UTC 2025
#10 0.140 ;; MSG SIZE rcvd: 104
#10 0.140
#10 0.140 ;; Got answer:
#10 0.140 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13518
#10 0.140 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
#10 0.140
#10 0.140 ;; OPT PSEUDOSECTION:
#10 0.140 ; EDNS: version: 0, flags:; udp: 1232
#10 0.140 ;; QUESTION SECTION:
#10 0.140 ;github.com. IN A
#10 0.140
#10 0.140 ;; ANSWER SECTION:
#10 0.140 github.com. 50 IN A 140.82.116.4
#10 0.140
#10 0.140 ;; Query time: 0 msec
#10 0.140 ;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
#10 0.140 ;; WHEN: Sat May 24 21:49:38 UTC 2025
#10 0.140 ;; MSG SIZE rcvd: 55
#10 0.140
#10 DONE 0.2s
A correct AAAA response, with a NOERROR status and an SOA in the Authority section.
And also the correct 127.0.0.11 Docker-routed DNS server.
wget now resolves github.com on our Node Alpine image.
And with that, our original issue (npm install failing with getaddrinfo ENOTFOUND github.com) is resolved too.
Not only have we fixed our issue, but we now understand it a bit better. After all, we could have just switched to a non-alpine image and called it a day. But are we really computer scientists then?
Similar issues:
https://github.com/moby/moby/issues/47628
We can reproduce this issue with a few simple commands:
docker run --rm -d --name nstest-dindrootless --privileged docker:dind-rootless
sleep 5
docker exec -it nstest-dindrootless docker -H unix:///run/user/1000/docker.sock run --rm alpine /bin/sh -c "apk add bind-tools && dig github.com A github.com AAAA && wget github.com"
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 23191
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;github.com. IN AAAA
;; Query time: 14 msec
;; SERVER: 192.168.65.1#53(192.168.65.1) (UDP)
docker run --rm -d --name nstest-dind --privileged docker:dind
sleep 5
docker exec -it nstest-dind docker run --rm alpine /bin/sh -c "apk add bind-tools && dig github.com A github.com AAAA && wget github.com"
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2465
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;github.com. IN AAAA
;; AUTHORITY SECTION:
github.com. 1512 IN SOA dns1.p08.nsone.net. hostmaster.nsone.net. 1656468023 43200 7200 1209600 3600
;; Query time: 0 msec
;; SERVER: 192.168.0.100#53(192.168.0.100) (UDP)
The dind-rootless image actually builds on the dind Dockerfile. The culprit here is rootlesskit, which provides
its own NAT magic with VPNKit. VPNKit is responsible for creating the DNS server at 192.168.65.1.
Aha, I learned something here. On Linux distributions, Docker does not create its own private DNS server by default. But Docker for Mac and Docker for Windows use VPNKit, which does. What else uses VPNKit? dind-rootless!
We can confirm this by altering the commands above:
docker run --rm -d --name nstest-dindrootless --privileged docker:dind-rootless
docker exec -it nstest-dindrootless docker -H unix:///run/user/1000/docker.sock run --rm --dns 8.8.8.8 alpine /bin/sh -c "apk add bind-tools && dig github.com A github.com AAAA"
Now our alpine container will use 8.8.8.8 instead of dind-rootless's integrated DNS proxy.
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18993
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;github.com. IN AAAA
;; AUTHORITY SECTION:
github.com. 1160 IN SOA dns1.p08.nsone.net. hostmaster.nsone.net. 1656468023 43200 7200 1209600 3600
;; Query time: 12 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
This is all due to a bug in VPNKit, which returns a hardcoded NXDOMAIN answer for any error.
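To check a resolver for this behavior without installing bind-tools, a minimal Python probe could send a single AAAA query over UDP and report the status code of the reply. This is a sketch under our own assumptions: build_query, parse_rcode and probe are hypothetical helper names, only the response code is read, and retries, TCP fallback and reply validation are left out.

```python
import socket
import struct

AAAA = 28  # record type number for AAAA (RFC 3596)
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def build_query(name, qtype, query_id=0x1234):
    """Build a minimal DNS query packet with the "recursion desired" flag."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    return header + question


def parse_rcode(packet):
    """Return the response code (low 4 bits of the 4th header byte) as text."""
    rcode = packet[3] & 0x0F
    return RCODE_NAMES.get(rcode, str(rcode))


def probe(server, name="github.com", qtype=AAAA, timeout=3.0):
    """Send one UDP query to `server` and return the status it reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        packet, _ = sock.recvfrom(512)
    return parse_rcode(packet)
```

Going by the dig output above, we would expect probe("192.168.65.1") to report NXDOMAIN from inside a dind-rootless container and probe("8.8.8.8") to report NOERROR, though we have only verified this with dig.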
