524 Origin Timeouts with Bypass and orange cloud


#1

I am getting 524 errors when a website is using “orange cloud” and has a page rule of “bypass”, which we want because our API must never be cached. I’ve been speaking to CF support, who are helpful but haven’t really resolved my problem.

A 524 implies that CF connected to our website but did not get a response. However, tracing at the socket accept() level shows no inbound connections and no open requests being processed.
Closing our webserver down while we are waiting for a response does not cause CF to return an error any faster, which suggests it is not connected to us.
Our test page is a simple HTML with 15 or so PNG images.
The server is under light load. We moved from a local ISP to Azure, but that made no difference.
If I switch to “grey cloud”, then everything works perfectly - but this leaks our IP address, so it doesn’t offer the DDoS protection that we want from CF.
We tried switching the website to HTTP/1.0 and closing after each request, but that had no effect.
Our webserver is very fast, especially for these small static requests. In fact, using CF slows the overall experience down by 20-40 ms per request, but that is acceptable; we want the DDoS protection more.
If we browse direct to the origin at the same time, it never has any issues.
When reloading the page in the browser, the resource that gets the 524 is random, not fixed to certain resources.
There is no firewall in Azure.
There is no rate limiting either (and if there were, then browsing direct to the origin would have issues too).
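
One way to cross-check the accept()-level observation independently of the production webserver is to run a tiny logging listener on a port that Cloudflare will proxy and point a test hostname at it. The sketch below is only an illustration under that assumption, not the server described above: `start_logger` and `serve_and_log` are hypothetical names, and every request gets the same fixed reply.

```python
import socket
import threading
import time


def serve_and_log(listener, log, stop):
    """Accept connections until `stop` is set, recording each peer address."""
    listener.settimeout(0.2)  # wake up regularly to check the stop flag
    try:
        while not stop.is_set():
            try:
                conn, peer = listener.accept()
            except socket.timeout:
                continue
            # Record the accept before doing anything else with the socket.
            log.append((time.time(), peer[0], peer[1]))
            with conn:
                conn.settimeout(2.0)
                try:
                    conn.recv(4096)  # read and discard the request
                    conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
                except OSError:
                    pass  # peer went away; the accept is already logged
    finally:
        listener.close()


def start_logger(host="127.0.0.1", port=0):
    """Start the logging listener; returns (port, log, stop_event, thread)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(128)
    log, stop = [], threading.Event()
    thread = threading.Thread(
        target=serve_and_log, args=(listener, log, stop), daemon=True
    )
    thread.start()
    return listener.getsockname()[1], log, stop, thread
```

If a 524 shows up in the browser while `log` stays empty for that time window, that would support the idea that the connection never reached the origin process at all.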

When using orange cloud and normal caching we do not really see any errors. Essentially the CF caching masks any issues, which you would expect.

Setup to reproduce:
Add DNS for orange.example.com “orange cloud”
Add page rule: Cache Level = Bypass
Add DNS for grey.example.com “grey cloud”
Test web page with simple HTML and 15 or so IMG tags
Browsing to orange.example.com frequently gives 524 errors, but using grey.example.com has no problems.
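
To put numbers on the reproduction rather than relying on browser reloads, the same set of resources can be fetched repeatedly through each hostname and the status codes tallied. This is a minimal sketch, not a finished tool: `fetch_status` and `count_statuses` are hypothetical helper names, the resource paths are whatever the test page uses, and the 120 second timeout is an assumption chosen to sit above the roughly 100 seconds Cloudflare waits by default before returning a 524.

```python
from collections import Counter
import urllib.error
import urllib.request


def fetch_status(url, timeout=120):
    """Return the HTTP status code for a GET of `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; a 524 ends up here.
        return err.code


def count_statuses(base_url, paths, rounds=10, fetch=fetch_status):
    """Fetch every path `rounds` times and count the status codes seen."""
    counts = Counter()
    for _ in range(rounds):
        for path in paths:
            counts[fetch(base_url + path)] += 1
    return counts
```

Calling `count_statuses("https://orange.example.com", paths)` and then the same for `https://grey.example.com` gives two tallies to compare; the `fetch` parameter is there only so the counting logic can be exercised without a network.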

Questions.
Does anyone else see Origin Timeouts constantly, especially for a site hosted on Azure?

How do other sites handle REST API URLs that shouldn’t be cached by CF - is there a better pattern than using orange cloud and bypass page rules?

Any other troubleshooting tips? I accept we might have something set up wrong, but we are struggling to see what.


#2

Looks like no one had any suggestions. Hopefully we can bump this up and get some eyes. Are you still having this issue?


#3

Yes. I have a subdomain, Rxx.xxxxxxxxx.com, defined with orange cloud and a page rule of Rxx.xxxxxxx.com/* cache-level-bypass.

Our www.xxxxxxxx.com is using grey cloud (and working with no issues).

The subdomain Rxx… is pointing to the same content as www… on the server, so the only difference between the two is orange/grey cloud and cache=bypass on CF.

The fact that only we appear to have this problem is interesting, but we are really drawing a blank as we do not seem to even have the accept() for these requests. Although we are probably a minority wanting to stop CF from caching.

As our next step, we are going to try having the server send no-cache headers and see if that helps (i.e. to find out whether the page rule is the issue).
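
For reference, sending no-cache headers from the origin could look roughly like the sketch below. It uses Python's built-in `http.server` purely for illustration (the real server in this thread is a custom one), and `NoCacheHandler` and `start_server` are hypothetical names. The assumption is that Cloudflare honours origin `Cache-Control` values such as `no-store`, which is its usual behaviour, so the bypass page rule could be removed for the test.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import threading


class NoCacheHandler(BaseHTTPRequestHandler):
    """Answer every GET with a small JSON body marked as uncacheable."""

    def do_GET(self):
        body = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        # Ask intermediaries and browsers not to store the response.
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(port=0):
    """Start the demo server on localhost in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", port), NoCacheHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

If the 524s persist with the page rule removed and only the header in place, the page rule itself is unlikely to be the cause.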


#4

FYI: after a lot more testing, it seems that this problem is somehow related to using WSADuplicateSocket so that we can listen in multiple processes. We use multiple listening processes because our corporate customers can easily have thousands of active sockets at any moment. Even if we restrict which process calls select/accept we still see issues, but as soon as we stop calling WSADuplicateSocket everything works. Not sure why we only see this when behind orange cloud on CF, though. The code is old and battle-hardened. Still investigating.