Bug? SERVFAILs while requesting SOA records from acme-dns hosted domain


#1

Hello! I’ve found that a Let’s Encrypt dns-01 challenge client
misbehaves when it uses 1.1.1.1 as its DNS server. The cause appears to
be that resolver returning SERVFAIL for SOA queries on the challenge
domain.

To reproduce:

$ dig @8.8.8.8 SOA 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca | grep "status:"

$ dig @1.1.1.1 SOA 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca | grep "status:"

The challenge process succeeds with 8.8.8.8, which is correctly (?)
returning NXDOMAIN. The logs of the authoritative nameserver at
acme.lfcode.ca show that it is returning NXDOMAIN itself:

Nov 08 05:37:13 abyss acme-dns[19514]: time="2018-11-08T05:37:13Z"
level=debug msg="Answering question for domain"
domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA
rcode=NXDOMAIN
Nov 08 05:37:13 abyss acme-dns[19514]: time="2018-11-08T05:37:13Z"
level=debug msg="Answering question for domain"
domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA
rcode=NXDOMAIN

The application is walking up a label at a time looking for SOA
records, but it is dazed and confused by the SERVFAIL and tries again
until it fails a minute later.
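
For illustration, the label walk could look roughly like the sketch below. This is not the client's actual code: `candidate_zones`, `find_zone` and the `lookup_soa` callback are hypothetical names, the callback is assumed to return an rcode string (with "NOERROR" standing for "an SOA record came back"), and the sketch raises on SERVFAIL instead of retrying for a minute.

```python
from typing import Callable, Iterator, Optional


def candidate_zones(fqdn: str) -> Iterator[str]:
    """Yield fqdn and each parent name, dropping one leading label at a time."""
    labels = fqdn.rstrip(".").split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:]) + "."


def find_zone(fqdn: str, lookup_soa: Callable[[str], str]) -> Optional[str]:
    """Return the first name whose SOA lookup succeeds.

    lookup_soa is a hypothetical callback that returns an rcode string such
    as "NOERROR", "NXDOMAIN" or "SERVFAIL" for the given name.
    """
    for name in candidate_zones(fqdn):
        rcode = lookup_soa(name)
        if rcode == "NOERROR":
            return name
        if rcode == "NXDOMAIN":
            continue  # nothing here: strip a label and ask about the parent
        # SERVFAIL and friends say nothing about where the zone cut is,
        # so the walk cannot safely continue.
        raise RuntimeError(f"SOA lookup for {name} failed with {rcode}")
    return None


# Simulated answers: NXDOMAIN for the challenge name, an SOA at the zone.
answers = {
    "4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.": "NXDOMAIN",
    "acme.lfcode.ca.": "NOERROR",
}
print(find_zone("4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca",
                lambda name: answers[name]))
```

With NXDOMAIN the walk moves on to the parent and finishes quickly; with SERVFAIL there is nothing useful to act on, which matches the behaviour I am seeing.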

Further logs and information about the issue are available at the
GitHub issue filed about this:

Thanks!

Side-note: this forum software is horribly broken in that it aggressively looks for link-looking things (including domain names!) and mangles them into “links” when pasted in.


#2

acme.lfcode.ca has DNS issues.

https://ednscomp.isc.org/ednscomp/cf38574b18

http://dnsviz.net/d/acme.lfcode.ca/W-QcZg/dnssec/

lfcode.ca's nameservers say acme.lfcode.ca is served by abyss.lfcode.ca.

;; AUTHORITY SECTION:
acme.lfcode.ca.         300     IN      NS      abyss.lfcode.ca.

;; ADDITIONAL SECTION:
abyss.lfcode.ca.        300     IN      A       50.116.18.138
abyss.lfcode.ca.        300     IN      AAAA    2600:3c00::f03c:91ff:fec8:969e

However, the nameserver itself says:

acme.lfcode.ca.         3600    IN      NS      acme.lfcode.ca.

On its own, that inconsistency causes warnings in DNS checking tools but is harmless.

However, a query for acme.lfcode.ca with most query types says:

acme.lfcode.ca.         3600    IN      CNAME   lfcode.ca.

That’s illegal, for two reasons: the zone apex can’t be a CNAME, and the name an NS record points to can’t itself be a CNAME.
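
The first rule comes down to "a name that owns a CNAME may not own any other record", and the apex always owns at least NS and SOA. A rough way to check a set of records for that conflict is sketched below. The `Record` tuple layout and `cname_conflicts` are hypothetical names made up for this example, and the check ignores the DNSSEC record types that are allowed to sit next to a CNAME.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record shape for this sketch: (owner name, type, rdata).
Record = Tuple[str, str, str]


def cname_conflicts(records: Iterable[Record]) -> List[str]:
    """Return owner names that have a CNAME alongside any other record type."""
    types_by_owner: Dict[str, Set[str]] = {}
    for owner, rtype, _rdata in records:
        types_by_owner.setdefault(owner.lower(), set()).add(rtype.upper())
    return sorted(
        owner
        for owner, types in types_by_owner.items()
        if "CNAME" in types and len(types) > 1
    )


# The two answers quoted above for acme.lfcode.ca.
records = [
    ("acme.lfcode.ca.", "NS", "acme.lfcode.ca."),
    ("acme.lfcode.ca.", "CNAME", "lfcode.ca."),
]
print(cname_conflicts(records))  # ['acme.lfcode.ca.']
```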

Additionally, a TXT query for the same name gets an NXDOMAIN response. (Under the circumstances, the resolver probably never happened to make a TXT query and doesn’t know that, but it’s not good.)

Also, none of the responses set the Authoritative Answer bit.
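
For anyone who wants to check this on raw responses: the AA bit is mask 0x0400 in the 16-bit flags word of the DNS header, and the rcode is the low four bits. A minimal decoding sketch follows. `decode_flags` is a hypothetical helper, and the flag values in the example are made up for illustration, not captured from this server.

```python
# Only the rcodes that come up in this thread; others are shown as numbers.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def decode_flags(flags: int) -> dict:
    """Decode a few fields from the 16-bit flags word of a DNS header."""
    rcode = flags & 0x000F
    return {
        "qr": bool(flags & 0x8000),  # 1 = this message is a response
        "aa": bool(flags & 0x0400),  # Authoritative Answer
        "rcode": RCODES.get(rcode, str(rcode)),
    }


# Illustrative values: a response with NXDOMAIN, with and without AA set.
print(decode_flags(0x8403))  # {'qr': True, 'aa': True, 'rcode': 'NXDOMAIN'}
print(decode_flags(0x8003))  # {'qr': True, 'aa': False, 'rcode': 'NXDOMAIN'}
```

An authoritative server answering for its own zone would normally look like the first line. The responses here look like the second.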

Additionally, TCP doesn’t work. (But the resolver probably had no reason to try it.)

I’m only speculating, but I’d bet that either the CNAME or the AA issue is making 1.1.1.1 unhappy. It could be something else, though.


#5

I fixed the NS and the query now works correctly. The remaining problems are bugs in acme-dns, which I have just filed.