SERVFAIL for IBM Software Downloads

We are seeing issues with the Cloudflare public resolver when trying to download files from the IBM software portal. Cloudflare returns SERVFAIL, but the domain resolves correctly against other public resolvers (Google, Quad9, Umbrella, etc.). The user can also download the file as expected when using another DNS resolver.

I’m not sure whether other subdomains are affected, but the one failing for us during an IBM software download is delivery03-bld.dhe.ibm.com.

dig returns the following. The one interesting part is the extended error option (OPT=15 in the output), whose text hints at some sort of issue “…at delegation dhe.ibm.com”. The fact that the name resolves properly on the other public resolvers I’ve checked makes me think this is a Cloudflare resolver issue, but I’m not sure.

dig @1.1.1.1 delivery03-bld.dhe.ibm.com.

; <<>> DiG 9.10.6 <<>> @1.1.1.1 delivery03-bld.dhe.ibm.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 49168
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; OPT=15: 00 16 61 74 20 64 65 6c 65 67 61 74 69 6f 6e 20 64 68 65 2e 69 62 6d 2e 63 6f 6d 2e ("..at delegation dhe.ibm.com.")
;; QUESTION SECTION:
;delivery03-bld.dhe.ibm.com.	IN	A

;; Query time: 3007 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Mar 31 16:51:19 EDT 2022
;; MSG SIZE  rcvd: 87
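
The raw `OPT=15` bytes in the output above can be decoded by hand. Option code 15 is the Extended DNS Error (EDE) option from RFC 8914: the first two bytes are an info code and the rest is free-form text. The following is a minimal sketch, not part of the original post; the helper name `decode_ede` and the partial code table are illustrative assumptions.

```python
from typing import Tuple

# Partial table of RFC 8914 info codes (only a few entries, for illustration).
EDE_CODES = {
    0: "Other",
    3: "Stale Answer",
    6: "DNSSEC Bogus",
    22: "No Reachable Authority",
    23: "Network Error",
}


def decode_ede(hex_dump: str) -> Tuple[int, str, str]:
    """Decode the payload of an EDE option given as space-separated hex bytes.

    Returns (info_code, code_name, extra_text).
    """
    data = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    if len(data) < 2:
        raise ValueError("EDE payload needs at least a 2-byte info code")
    info_code = int.from_bytes(data[:2], "big")
    extra_text = data[2:].decode("utf-8", errors="replace")
    return info_code, EDE_CODES.get(info_code, "unknown"), extra_text


# Bytes copied from the OPT=15 line in the dig output above.
payload = ("00 16 61 74 20 64 65 6c 65 67 61 74 69 6f 6e 20 "
           "64 68 65 2e 69 62 6d 2e 63 6f 6d 2e")
print(decode_ede(payload))
```

Info code 22 is "No Reachable Authority", which suggests the resolver could not get an answer from any authoritative server for the dhe.ibm.com delegation, rather than receiving a bad answer.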

Strange, works fine from here (Amsterdam), no matter how many times I try. Where are you querying from?

$ dig @1.1.1.1 delivery03-bld.dhe.ibm.com.

; <<>> DiG 9.16.1-Ubuntu <<>> @1.1.1.1 delivery03-bld.dhe.ibm.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28820
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;delivery03-bld.dhe.ibm.com.	IN	A

;; ANSWER SECTION:
delivery03-bld.dhe.ibm.com. 1793 IN	CNAME	dispby-103.boulder.ibm.com.
dispby-103.boulder.ibm.com. 893	IN	A	170.225.15.103

;; Query time: 12 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Mar 31 23:26:40 CEST 2022
;; MSG SIZE  rcvd: 104

US East Coast. Currently hitting the Cloudflare ATL PoP according to https://1.1.1.1/help

Also tested from a few different ISPs/public IPs with the same result.

I just tested it from New York’s EWR PoP and it still resolves fine. Perhaps something is wrong at ATL PoP?

I was able to test from ATL, and it resolves fine there too:

% dig @1.1.1.1 delivery03-bld.dhe.ibm.com. +nsid          

; <<>> DiG 9.18.1 <<>> @1.1.1.1 delivery03-bld.dhe.ibm.com. +nsid
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8139
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; NSID: 32 37 6d 33 33 38 ("27m338")
;; QUESTION SECTION:
;delivery03-bld.dhe.ibm.com.	IN	A

;; ANSWER SECTION:
delivery03-bld.dhe.ibm.com. 1756 IN	CNAME	dispby-103.boulder.ibm.com.
dispby-103.boulder.ibm.com. 856	IN	A	170.225.15.103

;; Query time: 198 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Fri Apr 01 00:28:33 CEST 2022
;; MSG SIZE  rcvd: 114
% curl -s https://cloudflare.com/cdn-cgi/trace | grep colo
colo=ATL
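
When collecting reports from several vantage points, the `grep colo` step can also be scripted. This is a rough sketch that assumes the trace endpoint returns plain `key=value` lines, as the `colo=ATL` output above suggests; the function name `parse_trace` and the sample text (other than `colo=ATL`) are illustrative.

```python
def parse_trace(text: str) -> dict:
    """Parse key=value lines, as returned by /cdn-cgi/trace, into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' at all
            fields[key.strip()] = value.strip()
    return fields


# Illustrative sample; only the colo line is taken from the thread.
sample = "h=cloudflare.com\ncolo=ATL\n"
print(parse_trace(sample).get("colo"))
```

The body passed to `parse_trace` would be whatever `curl -s https://cloudflare.com/cdn-cgi/trace` prints.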

Yep, it is resolving for me now as well. No idea what was going on but seems to be fine at this point.

dig @1.1.1.1 delivery03-bld.dhe.ibm.com.

; <<>> DiG 9.10.6 <<>> @1.1.1.1 delivery03-bld.dhe.ibm.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15869
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;delivery03-bld.dhe.ibm.com.	IN	A

;; ANSWER SECTION:
delivery03-bld.dhe.ibm.com. 1345 IN	CNAME	dispby-103.boulder.ibm.com.
dispby-103.boulder.ibm.com. 439	IN	A	170.225.15.103

;; Query time: 15 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Mar 31 18:35:31 EDT 2022
;; MSG SIZE  rcvd: 104
curl -s https://cloudflare.com/cdn-cgi/trace | grep colo
colo=ATL

It seems like the upstream nameservers for dhe.ibm.com (e.g. esd03ns03p.mul.ie.ibm.com) are not responding to some of our colos. I’m not sure why, but I’ll try to reach whoever is responsible for them. As a temporary workaround, I have forwarded the queries to a location where they do get answered.
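
To check from your own vantage point whether an authoritative nameserver answers at all, you can send it a single non-recursive query and see whether anything comes back. The sketch below uses only the standard library; the function names `build_query` and `probe`, the fixed query ID, and the 3-second timeout are assumptions for illustration, and a timeout from one network says nothing certain about what a Cloudflare colo sees.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for `name` (qtype 1 = A) with the RD flag clear."""
    # Header: id, flags (all zero = standard non-recursive query), 1 question.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def probe(server: str, name: str, timeout: float = 3.0) -> str:
    """Send one UDP query to `server` and report the rcode, a timeout, or an error."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(reply) < 4:
        return "short reply"
    return f"rcode={reply[3] & 0x0F}"  # low 4 bits of the second flags byte


# Example call (needs network access), using names from the thread:
# print(probe("esd03ns03p.mul.ie.ibm.com", "delivery03-bld.dhe.ibm.com"))
```

A result of `rcode=0` means the server replied without error, while `timeout` matches the "not responding" symptom described above.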
