Hello - we are pulling our hair out trying to understand why our site experiences delays of 12s to 30s several times a day. At first customers were notifying us of this issue, but then we noticed that New Relic browser analytics confirmed the slowdown (so it's not limited to a single user's ISP), while our backend server statistics show no issues with the backend server.
Initially we approached our hosting service (Rackspace). They could find no issues on their end and confirmed there were no high CPU or other resource constraints on the backend - but they pointed out that our DNS/SSL/CDN is Cloudflare (a setup we inherited).
In the meantime we have enabled some ping tracing in New Relic, and a few days' worth of data has confirmed that something (Cloudflare?) frequently has extreme response times (from 2-30s) every day for our UK-based domain. Pings from the UK, Ireland and Germany (as well as SA and USA) are all affected, whereas the same application in Australia, fronted by the same Cloudflare account (but a different domain and server provider), seems unaffected.
I am still not ruling out Rackspace, but am after some community help to understand how best to diagnose this, as lower-level networking is at the limits of our knowledge.
We see things like this most days: _/|/_ (sorry, I can only post one image as a new user, grrrr)
which last about 20 mins, normally around 11-3:30pm (but there is no real pattern, and site usage earlier in the day is often greater than at the times when it seems to kick off).
Drilling into one of the statistics for London seems to show SSL issues (according to New Relic):
Whereas a normal request looks fine (interestingly, the DNS lookup on failed requests is also surprisingly low compared to normal requests):
[image would look the same, but show several space colour bars in responses - but interestingly the DNS response is 35ms]
Interestingly you can see the buildup of these slow requests as they get progressively worse:
[image would be here showing ping results building up in time]
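One thing that might help narrow this down is recording which Cloudflare data centre served each slow probe. This is only a rough sketch, assuming the responses carry Cloudflare's `CF-Ray` header, whose suffix after the last hyphen is the data centre code; `probe` and `colo_from_cf_ray` are names made up for illustration, not part of any New Relic or Cloudflare tooling.

```python
import time
import urllib.request


def colo_from_cf_ray(cf_ray):
    """Return the data centre code after the last hyphen of a CF-Ray value, or None."""
    if not cf_ray or "-" not in cf_ray:
        return None
    return cf_ray.rsplit("-", 1)[1].strip().upper() or None


def probe(url, timeout=40.0):
    """Fetch url once; return (seconds until the first body byte, data centre code)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read(1)  # wait for the first byte of the body only
        elapsed = time.perf_counter() - start
        return elapsed, colo_from_cf_ray(resp.headers.get("CF-Ray"))
```

Running `probe("https://www.example.co.uk/")` (a placeholder URL) every minute or so and logging the result would show whether the slow periods line up with one particular data centre code.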
So my question is: does this look like something Cloudflare could be doing if something is configured incorrectly? And how can I diagnose this from the Cloudflare dashboard? From what I can see it shows nothing unusual, but real-world feedback disagrees, and tools like New Relic all point to somewhere between the browser request and the backend server (but not the backend server itself, which responded in < 300ms to the eventual request that gets through).
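To cross-check the New Relic breakdown independently, something like the following could time each phase of a single HTTPS request (DNS, TCP connect, TLS handshake, time to first byte). It is a minimal sketch using only the Python standard library; the function names are invented for illustration, and `connect_ip` is an assumed way of pointing the same request straight at the origin server so the Cloudflare path and the direct Rackspace path can be compared.

```python
import socket
import ssl
import time


def time_phases(host, port=443, path="/", connect_ip=None, verify=True, timeout=40.0):
    """Return seconds spent in each phase of one HTTPS request to host."""
    timings = {}

    t0 = time.perf_counter()
    if connect_ip is None:
        # Resolve the public name; behind Cloudflare this yields an edge IP.
        connect_ip = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)[0][4][0]
    timings["dns"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    raw = socket.create_connection((connect_ip, port), timeout=timeout)
    timings["tcp"] = time.perf_counter() - t0

    ctx = ssl.create_default_context()
    if not verify:
        # Assumption: the origin may present a certificate that is not publicly trusted.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE

    with raw:
        t0 = time.perf_counter()
        # server_hostname sets SNI; the handshake happens inside wrap_socket.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            timings["tls"] = time.perf_counter() - t0

            request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            t0 = time.perf_counter()
            tls.sendall(request.encode("ascii"))
            tls.recv(1)  # block until the first response byte arrives
            timings["ttfb"] = time.perf_counter() - t0
    return timings


def slowest_phase(timings):
    """Return the name of the phase that took longest."""
    return max(timings, key=timings.get)
```

Calling `time_phases("www.example.co.uk")` (a placeholder hostname) goes through Cloudflare, while passing `connect_ip` with the origin server's address skips it. If the direct path stays fast while the Cloudflare path reports `tls` or `ttfb` as the slowest phase, that would suggest the delay sits in front of the backend.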
Sorry for such a long post, but hopefully it might ring some bells for someone, and perhaps be instructive to a wider audience when someone hopefully says - oh silly - it's X.