False alarm: CNAME + TXT for DKIM = Microsoft Office 365 fail

Edit: After I modified the CNAME record in question, Cloudflare DNS resumed serving it normally. Apart from the link to Microsoft’s explanation of how they rotate their DKIM keys, and possibly a hint that forcing a change to (or deleting and recreating) a troublesome DNS entry might resolve an issue, this post isn’t worth reading.

Cloudflare DNS does not respond to requests for a TXT record when the requested DNS entry is a CNAME record that points to a different server which holds the TXT record (at least, that’s what seemed to be the case). The server responds and seems to say “there’s no information for you”. Other DNS providers will reach out to the CNAME’s target, look up the requested TXT record, and return it. I discovered this because we switched to Cloudflare DNS a little over 28 days ago (28 days is the TTL set by Microsoft), and at 28 days in, our Microsoft 365 outbound emails started being discarded by destination servers that perform DKIM analysis.

My first brush with support suggested that I not use the CNAME record, but instead create my own TXT entry for the DKIM record. The trouble with that suggestion is that even if I figure out what the proper public key is TODAY and create a TXT DKIM record with it, Microsoft will rotate the key at some point and our email will fail again. Well… my choices were to switch to a different DNS provider or to make a band-aid by renaming my active TXT record and ‘hard coding’ a DKIM record that copies the one Microsoft made for us. The band-aid was quicker, but it is not the long-term fix.

I’m posting this out here for two reasons: 1) as a resource for others who encounter this problem and 2) pleading for Cloudflare to allow this horribly inefficient third party lookup to occur. To prevent abuse, only let it work for the *.onmicrosoft.com domain.

Thanks for reading!

The site below explains how Microsoft does the key rotations and shows that the CNAME-to-TXT record lookup is essential to the process.

These are sample CNAME records Microsoft asks us to create when enabling DKIM for an Office 365 tenant, along with the TXT records they point to:

selector1._domainkey.microsoft.com. 3600 IN CNAME selector1-microsoft-com._domainkey.microsoft.onmicrosoft.com. (Microsoft publishes this in DNS)

selector1-microsoft-com._domainkey.microsoft.onmicrosoft.com. 3600 IN TXT "v=DKIM1; k=rsa; p=<public key #1> n=1024,1435867504,1" (Office 365 publishes this in its DNS)

selector2._domainkey.microsoft.com. 3600 IN CNAME selector2-microsoft-com._domainkey.microsoft.onmicrosoft.com. (Microsoft publishes this in DNS)

selector2-microsoft-com._domainkey.microsoft.onmicrosoft.com. 3600 IN TXT "v=DKIM1; k=rsa; p=<public key #2> n=1024,1435867505,1" (Office 365 publishes this in its DNS)
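
A minimal sketch of why a hand-copied TXT record is only a band-aid: the DKIM value is a list of tag=value pairs, and the p tag carries the public key that changes on rotation. The function names parse_dkim_txt and same_public_key and the placeholder keys below are made up for illustration; they are not part of any Microsoft or Cloudflare tooling.

```python
def parse_dkim_txt(txt):
    """Split a DKIM TXT value such as 'v=DKIM1; k=rsa; p=...' into a dict of tags."""
    tags = {}
    for part in txt.split(";"):
        # partition() splits on the first '=' only, so base64 padding in p= survives.
        name, sep, value = part.strip().partition("=")
        if sep:  # skip empty pieces, e.g. the one after a trailing ';'
            tags[name.strip()] = value.strip()
    return tags


def same_public_key(hard_coded_txt, published_txt):
    """True while the hand-copied record still carries the provider's current key."""
    return parse_dkim_txt(hard_coded_txt).get("p") == parse_dkim_txt(published_txt).get("p")


# Placeholder values, not real DKIM keys.
hard_coded = "v=DKIM1; k=rsa; p=OLDKEYPLACEHOLDER;"
after_rotation = "v=DKIM1; k=rsa; p=NEWKEYPLACEHOLDER;"

print(parse_dkim_txt(hard_coded))
print(same_public_key(hard_coded, hard_coded))      # copy is still current
print(same_public_key(hard_coded, after_rotation))  # copy went stale after rotation
```

With a CNAME in place, the comparison never becomes necessary, because verifiers always end up reading whatever key the provider currently publishes.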

Cloudflare’s authoritative nameservers operate as expected here. When you ask for a TXT record (or any other resource record) for a label that contains a CNAME, Cloudflare returns the CNAME, and it is up to your resolver to follow that.

1.1.1.1 is a resolver rather than an authoritative server and it will follow the chain, giving you the expected result. You can see an example of each here.

Make sure your record isn’t proxied (this type of record should show a grey cloud, not an orange one).

This is actually a very efficient setup, as it allows the mail provider to rotate records without you doing anything. Imagine if they had to send an email to every customer every month asking them to manually update a DNS record. Much better to let DNS servers do what they do and figure out the answer. My example is with FastMail, which operates the same way, but I have customers on O365 as well.

It could be useful to know the actual domain and record so I can see how things are resolving.


Hmmm. I’ve been chasing this for hours. In case you don’t read to the end - thanks for the reply.

Thanks to an extensive scroll back buffer, I have confirmed that TXT lookups with nslookup WERE failing using 8.8.8.8 as a resolver, and I’m not (completely) insane. See below - domain changed to protect the innocent. When I decided to manually create the DKIM entry, I first needed to dispose of the old entry, but instead of deleting the selector2._domainkey CNAME record, I renamed it to selector22._domainkey (I’m lazy - but in this case it preserved evidence). After reading your post, I tried querying selector22._domainkey.MYDOM.com and it worked fine. At this point I’ve done some more renaming and the original entry is working as it should.

From the scroll back machine:

set type=txt
selector2._domainkey.MYDOMAIN.com
Server: dns.google
Address: 8.8.8.8


primary name server = marvin.ns.cloudflare.com
responsible mail addr = dns.cloudflare.com
serial = 2035550095
refresh = 10000 (2 hours 46 mins 40 secs)
retry = 2400 (40 mins)
expire = 604800 (7 days)
default TTL = 3600 (1 hour)

set type=cname
selector2._domainkey.MYDOMAIN.com
Server: dns.google
Address: 8.8.8.8


primary name server = marvin.ns.cloudflare.com
responsible mail addr = dns.cloudflare.com
serial = 2035550095
refresh = 10000 (2 hours 46 mins 40 secs)
retry = 2400 (40 mins)
expire = 604800 (7 days)
default TTL = 3600 (1 hour)

selector2._domainkey.MYDOMAIN.com
Server: dns.google
Address: 8.8.8.8


primary name server = marvin.ns.cloudflare.com
responsible mail addr = dns.cloudflare.com
serial = 2035550095
refresh = 10000 (2 hours 46 mins 40 secs)
retry = 2400 (40 mins)
expire = 604800 (7 days)
default TTL = 3600 (1 hour)

First lookup of the recently re-named “selector22._domainkey.MYDOM.com” worked.

selector22._domainkey.MYDOM.com.
Server: dns.google
Address: 8.8.8.8

Non-authoritative answer:
selector22._domainkey.MYDOM.com canonical name = selector2-MYDOM-com._domainkey.MYDOM.onmicrosoft.com
selector2-MYDOM-com._domainkey.MYDOM.onmicrosoft.com text =

    "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDN3gElHz0cXCIsszQG2KXGCMxb3mpLrhvEJhCCg7/P4cze7vvo+hIZFZ2AqPdQS6H9LzW0UN39e6X19M+MBk/FFB+uHF9IYzjRJWLv2Fd+dY2FND9yTJC/zuVPZP/4baEH8vZs7gBYzj+R5GDDvv8U3naqkSV1x8U9B+gdXscpYQIDAQAB;"

How the heck is it working? Probably won’t ever know.

There are a couple of things going on. For one, when you’re using a third-party resolver (Google’s, here) you need to take caching effects into account, so what you are seeing may not quite reflect reality.

Also, nslookup has some really bad behaviours in certain cases, especially if your query results in a truncated response with a referral.

Without the real domain it’s impossible to know what is actually happening, but the way DNS works is that an authoritative server can only give you results within its own authority. This typically means within the same zone. If you have a CNAME whose target is within your zone, then the answer will include the CNAME response and the final response too, but once a CNAME crosses zone boundaries you’ll just get the CNAME, and you’re left to discover the real answer yourself. In fact, while it is technically possible for Cloudflare to look up the record from onmicrosoft and include it in an additional section, your resolver would completely ignore it because it is out of authority.
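
To make the zone-boundary rule concrete, here is a toy model in Python. Nothing in it talks to real DNS: the ZONES table, the names under .example, and the functions zone_for, authoritative_answer and recursive_resolve are all invented for illustration, and real servers handle many more cases (wildcards, delegations, DNSSEC and so on).

```python
# Toy data: two zones, each imagined to live on a different authoritative server.
ZONES = {
    "mydomain.example.": {
        "selector2._domainkey.mydomain.example.": ("CNAME", "selector2._domainkey.tenant.example."),
        "alias.mydomain.example.": ("CNAME", "note.mydomain.example."),
        "note.mydomain.example.": ("TXT", "in-zone text"),
    },
    "tenant.example.": {
        "selector2._domainkey.tenant.example.": ("TXT", "v=DKIM1; k=rsa; p=PLACEHOLDER"),
    },
}


def zone_for(name):
    """Return the zone whose origin is the longest suffix of name, or None."""
    matches = [z for z in ZONES if name == z or name.endswith("." + z)]
    return max(matches, key=len) if matches else None


def authoritative_answer(zone, name, rtype):
    """Answer from one zone only: chase CNAMEs while the target stays in this zone."""
    answer = []
    while zone_for(name) == zone:
        record = ZONES[zone].get(name)
        if record is None:
            break
        found_type, value = record
        if found_type == rtype:
            answer.append((name, found_type, value))
            break
        if found_type != "CNAME":
            break  # the name exists, but not with the requested type
        answer.append((name, found_type, value))
        name = value  # follow the alias; the loop stops if it left the zone
    return answer


def recursive_resolve(name, rtype, max_hops=8):
    """What a recursive resolver does: keep asking whichever zone owns the next name."""
    answer = []
    for _ in range(max_hops):
        zone = zone_for(name)
        if zone is None:
            break
        part = authoritative_answer(zone, name, rtype)
        answer.extend(part)
        if not part or part[-1][1] != "CNAME" or rtype == "CNAME":
            break
        name = part[-1][2]
    return answer


dkim_name = "selector2._domainkey.mydomain.example."
print(authoritative_answer("mydomain.example.", dkim_name, "TXT"))                  # CNAME only
print(authoritative_answer("mydomain.example.", "alias.mydomain.example.", "TXT"))  # CNAME + TXT
print(recursive_resolve(dkim_name, "TXT"))                                          # full chain
```

The first call mirrors what the authoritative server hands back for a DKIM CNAME that points outside the zone; the last call mirrors what a resolver such as 1.1.1.1 or 8.8.8.8 assembles for the client.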

Luckily recursive resolvers take care of this for you, in theory, but in practice you need to take into account that recursive resolvers also cache results, so you may need to wait minutes to hours before getting the correct answer.
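
The caching effect can be sketched with a tiny TTL cache. This is a simplified illustration, not how any particular resolver is implemented: TtlCache, resolve, and the NO_DATA marker are hypothetical names, and the 3600-second lifetime is an assumption borrowed from the "default TTL" value in the SOA output earlier in the thread. Real resolvers may cap or shorten how long they hold a negative answer.

```python
class TtlCache:
    """Tiny resolver-style cache: an entry is served until its TTL runs out."""

    def __init__(self):
        self._entries = {}

    def put(self, key, answer, ttl, now):
        self._entries[key] = (answer, now + ttl)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        answer, expires_at = entry
        if now >= expires_at:
            del self._entries[key]  # expired: forget it and ask again
            return None
        return answer


def resolve(cache, key, now, ask_authoritative, ttl=3600):
    """Serve from cache when possible, otherwise ask upstream and remember the reply."""
    cached = cache.get(key, now)
    if cached is not None:
        return cached
    answer = ask_authoritative(key)
    cache.put(key, answer, ttl, now)
    return answer


NO_DATA = "NODATA"  # stand-in for a negative ("nothing here") answer
zone = {}           # what the authoritative side would say right now
key = ("selector2._domainkey.mydomain.example.", "TXT")
cache = TtlCache()


def ask(k):
    return zone.get(k, NO_DATA)


first = resolve(cache, key, 0, ask)          # record missing: negative answer is cached
zone[key] = "v=DKIM1; k=rsa; p=PLACEHOLDER"  # the record gets fixed at the source
stale = resolve(cache, key, 1800, ask)       # half an hour later: still the old answer
fresh = resolve(cache, key, 3600, ask)       # TTL elapsed: the fix becomes visible
print(first, stale, fresh)
```

Times are passed in explicitly (in seconds) so the example is deterministic; a real cache would read the clock itself.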

Using a tool like dig gives you a lot more control over what is happening, and it will do a lot less interpretation on the results than nslookup. It also has flags like +trace which will show you the steps for a particular record.

You can use the link I supplied above with your own domain. It will show you what Cloudflare returns authoritatively (this should be just your CNAME record) and also what Cloudflare’s resolver (1.1.1.1, similar to 8.8.8.8) returns for the final answer. You can switch resolvers to see what other places show too.

The online dig tool is very handy, I’ll put that in my toolbox next to MXToolbox.

If you’re willing, compare the CNAME records Dave.MYDOMAIN.com and selector1._domainkey.MYDOMAIN.com; the latter is a CNAME for selector1-MYDOMAIN-com._domainkey.MYDOMAIN.onmicrosoft.com. The destination at Microsoft for selector1 does not seem to be valid; however, I believe that they will light it up when they start the key rotation process.

Pictures below of our CNAME records - note there are no clouds for the two DKIM entries, only the text “DNS only”. When setting up the account, I exported DNS records to a file from DNSMadeEasy and imported that into Cloudflare - not sure if that caused trouble.

Also screenshots done through MXToolbox looking up our DKIM record yesterday and today.



…and now all the CNAME lookups are working, so no broken things to look at. :slight_smile: :frowning: :slight_smile:

Good point, I recall Microsoft doesn’t generate records until they see the CNAME, although that could have changed over the years. Some providers include a “this space left blank” record so that things are obvious.
