Hi, we started using Cloudflare recently because our API service was under attack, so we switched our domains to use Cloudflare DNS.
Not long after, we started getting tons of connection timeout errors when connecting to other servers, such as our database servers and some external 3rd party APIs. The errors would start in the late morning (EST) and continue until evening. We were getting complaints from customers about our API and website being slow. The servers we’re using are a mix of Digital Ocean and Vultr virtual servers, and the problem was happening on both.
We eventually realized that the problem happened only when we referred to the servers by their domain name (like db1.myserver.com), so we switched our API servers to refer to other servers only by their IP address. This dramatically increased the speed of our API. Our unit testing app, which does a quick test of all of our API functions, started taking 61 seconds to complete where it was previously taking 170 seconds. Because of this, we assumed it was a DNS lookup issue. Our website, which was not changed, was still getting tons of connection timeout errors.
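For anyone who wants to check the same thing on their own servers, here is a rough Python sketch of how the lookup time for a hostname could be measured. It is only an illustration under some assumptions: `time_lookups` and `summarize` are made-up helper names, db1.myserver.com is the placeholder hostname from above, and the numbers will look better than reality if the OS or a local resolver caches the answer between attempts.

```python
import socket
import statistics
import time


def time_lookups(hostname, attempts=5, resolver=socket.getaddrinfo):
    """Return per-attempt DNS lookup durations in seconds (None on failure)."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            resolver(hostname, None)
        except OSError:
            # socket.gaierror is a subclass of OSError: the lookup failed.
            durations.append(None)
            continue
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Summarize lookup timings: attempt count, failures, median and max."""
    ok = [d for d in durations if d is not None]
    return {
        "attempts": len(durations),
        "failures": len(durations) - len(ok),
        "median_s": statistics.median(ok) if ok else None,
        "max_s": max(ok) if ok else None,
    }


if __name__ == "__main__":
    # Placeholder hostname; replace with a name your servers actually resolve.
    print(summarize(time_lookups("db1.myserver.com")))
```

Running this during the slow hours and again at night would show whether the lookups themselves get slower or start failing at certain times of day.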
Then, yesterday morning, I enabled DNSSEC for our domain, and almost instantly the connection timeout errors on our website stopped occurring. Since then, things have been running beautifully. The only exceptions are a couple of 3rd party APIs we call which cannot be reached by IP address; they are still slow, and one sometimes takes as much as 10 seconds to return.
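To see whether those slow 3rd party calls are spending their time in the DNS lookup or in the connection itself, something like the following Python sketch could split the two phases. It is a sketch, not a diagnosis: `split_connect_time` is a hypothetical helper, api.example.com is a stand-in hostname, port 443 is an assumption, and it only times the TCP connect (not TLS or the API's own response time).

```python
import socket
import time


def split_connect_time(hostname, port=443, timeout=10.0,
                       resolve=socket.getaddrinfo,
                       connect=socket.create_connection):
    """Return (dns_seconds, connect_seconds) for one connection attempt."""
    t0 = time.perf_counter()
    infos = resolve(hostname, port, type=socket.SOCK_STREAM)
    t1 = time.perf_counter()
    # Connect to the first resolved IP so the connect step does no DNS query.
    address = infos[0][4][:2]
    conn = connect(address, timeout=timeout)
    t2 = time.perf_counter()
    conn.close()
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    # Stand-in hostname; replace with the 3rd party API host being called.
    dns_s, conn_s = split_connect_time("api.example.com")
    print(f"dns: {dns_s:.3f}s  tcp connect: {conn_s:.3f}s")
```

If the first number is large and the second is small, the delay is in name resolution; if both are small, the 10 seconds is likely being spent on the API provider's side.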
I still don’t understand what was happening. Cloudflare claims to have the fastest DNS lookup times, and I doubt the problem is with them. Is it possible we’re under some DNS-related attack that was remedied by setting up DNSSEC for our domain? What else can we do to resolve these issues, especially with the 3rd party APIs we have to use?
Any insight or advice would be greatly appreciated!!!