Traffic not traversing Cloudflare to origin server

I have a number of devices that connect to my origin server using WebSockets. I’ve had 5 devices stop connecting since late yesterday.

The devices have been physically restarted, and the ISP they go through has confirmed that it can see the devices connected to Cloudflare. However, I’m not seeing these connections ever reach the origin server.

Other devices are connected quite happily…

Other than creating a firewall rule to allow traffic from the source server (other devices are connecting from the same source network successfully) is there any other way to see if CF is blocking or dropping these connections?

Just to check, could you confirm that your setup matches the following?

  1. Make sure there is a valid SSL certificate installed at your origin host/server (or use Cloudflare Origin CA Certificate)
  2. Make sure the SSL/TLS option is set to “Full (strict)”
  3. Make sure the WebSockets feature is enabled in the Cloudflare dashboard
  4. Use the WSS scheme
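One way to check items 1, 3 and 4 from a machine you control is to attempt the WebSocket upgrade by hand over TLS and look at the status line. The following is a minimal sketch using only the Python standard library; the function names, the default path `/` and the timeout are assumptions for illustration, not anything from your setup. A `101` status line suggests the upgrade passed through the proxy, while a `403` or `5xx` would point at a block or an origin problem.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server should send for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",  # the two empty entries produce the blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def check_wss(host: str, path: str = "/", port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection, send the upgrade request, return the status line."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    ctx = ssl.create_default_context()  # verifies the certificate and hostname
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path, key))
            response = tls.recv(4096).decode("latin-1")
    return response.split("\r\n", 1)[0]
```

Calling `check_wss("your-hostname.example", "/your/ws/path")` (both placeholders) from outside the affected network at least confirms whether the proxied hostname accepts upgrades at all.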

And the IP hasn’t changed? :thinking:

Do you get some error like 403 for those requests?

Is Cloudflare allowed to connect to your origin host/server (or rather, the devices)? Is there a firewall between them?

Cloudflare IP ranges:

Interesting :thinking:

Have you tried traceroute or MTR and/or ping from those devices to the outside world and vice versa after they’d been restarted?
Were you using DNS too, or WARP maybe, or some kind of tunnel?

Thanks for the questions.

Yes, all of the above is done and checked. The only caveat is that these are IoT-style hardware devices, so I am not able to run traceroutes etc. These devices are also more than 1,000 km away from me, so I am unable to do any other physical checks.

These devices run via a private APN on a GSM network. They all go via the same internet breakout from the ISP. Other devices on the same network are working perfectly.

I have added an “allow” firewall rule on CF just to make sure the traffic from the ISP passes unhindered.

The ISP also confirmed that they can see the devices are connected to CF, but I do not see these connections popping out the other side (origin side). They had been working fine and simply stopped a day ago. The devices in question all stopped at roughly the same time as well. They are not all in the same location, but are in the same province (driving distance between sites).

Other than CF, there is Azure in between CF and the actual server. Azure has been set up to allow only CF IPs, both IPv4 and IPv6, in the firewall rules. I used the IPs from the public CF IP list (sorry, can’t post the actual link, but it is the same as your one above).
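To double-check an allow-list like this, the source addresses seen in the origin-side logs can be tested against the ranges configured in the firewall. A minimal sketch with Python’s `ipaddress` module follows. The `SAMPLE_CF_RANGES` list is an assumption: a small subset I believe appears in Cloudflare’s published list, used here for illustration only, so load the current full list before relying on the result.

```python
import ipaddress

# Illustrative subset only (assumption): replace with the current
# full list published by Cloudflare before drawing conclusions.
SAMPLE_CF_RANGES = [
    "104.16.0.0/13",
    "172.64.0.0/13",
    "2606:4700::/32",
]


def in_ranges(addr: str, ranges: list[str]) -> bool:
    """Return True if `addr` falls inside any of the CIDR blocks in `ranges`."""
    ip = ipaddress.ip_address(addr)
    for cidr in ranges:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if ip.version == net.version and ip in net:
            return True
    return False


if __name__ == "__main__":
    for candidate in ["104.16.1.1", "8.8.8.8", "2606:4700::1111"]:
        print(candidate, in_ranges(candidate, SAMPLE_CF_RANGES))
```

Any origin-side source address that returns `False` against the full list would be worth checking against the Azure rules.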

So it is confirmed that CF is the reason for the loss in connectivity. I’ve bypassed CF and the devices re-connected almost instantly…

I’ve not found a way to contact CF to get insights into what the reason for this might be so that it can be resolved properly. Bypassing CF is not ideal… is there some way to get them to investigate the issue?

I’d suggest opening a ticket with Cloudflare Support about your account and/or domain issue, and sharing the ticket number here so we can escalate it:

  • Log in to Cloudflare and then contact Cloudflare Support by clicking the Get More Help button. If you get an automatic reply, respond to it, say that you need more help, and reference this topic
  • Or send an e-mail to support[at]cloudflare[dot]com from the e-mail address associated with your Cloudflare account

Hi @qbarnard, we also have an issue ongoing since August 16th 18:00 UTC where devices are failing to connect to Cloudflare IPs, did you get some more information about the issue?

I had a strange error where my proxied domain wasn’t working; the traffic wasn’t ever getting to my server. In the end (this was yesterday), I did the equivalent of turning it off and on again… I deleted and recreated the A records in Cloudflare and it started working again, despite everything looking the same. You should try that if you haven’t already.

Thanks for that info @TomSSL. It might be worth a shot, but I think it is rather extreme for my prod env. I’ll perhaps give it a try on the specific subdomain to limit the impact on customers.

@zach12 the issue still persists on my side for SSL connections, as they are still proxied, but non-SSL has been stable since the bypass. I did try to mail them, but the ticket got closed because the domain is on a free account. I’ll upgrade it and re-submit the ticket. This revelation does, however, make me question the number of issues I’ve had with WebSockets over the last few months, and I am contemplating not having CF in front of these.

As you have probably already considered, if you’re doing wildcard proxying, you could always create new proxied DNS records for each subdomain and then delete and recreate the wildcard record, then delete the subdomain records, as a safe/redundant way of doing what I did.

Unless you’ve got millions of subdomains, of course.

In any case, seeing as it’s live/prod, good luck.

I’ve logged a support request now and await further feedback. I’ve added Logflare as an app to my account and it has helped with creating visibility on traffic through CF. I can see WebSocket requests, but not the ones that are an issue. I suspect that the Logflare worker sits further down the stack from where the issue might be and hence does not log those requests.

@TomSSL / @zach12 / fritex (sorry, can only mention 2 due to new user),

Just an update on the topic. I have been in contact with Cloudflare support and engineering, and they have reverted a recent change that impacted the TCP Window Scale Factor for connections. I have re-enabled proxying via CF and am currently monitoring the connections.

Early tests show successful connections through Cloudflare.
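For anyone landing here later: the window scale factor is the shift count each side advertises in its SYN (RFC 7323), and the receive window a peer may use is the 16-bit window field shifted left by that count. Support did not tell me which values were involved or why only some devices were affected, so the numbers below are just a worked example of the arithmetic, and the function name is my own.

```python
MAX_SHIFT = 14  # RFC 7323 caps the window scale shift count at 14


def effective_window(window_field: int, shift: int) -> int:
    """Return the receive window in bytes for a 16-bit field and a scale shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


if __name__ == "__main__":
    # Same header value, very different windows depending on the scale factor.
    for shift in (0, 7, 14):
        print(shift, effective_window(65535, shift))
```

A peer that ignores or mishandles the negotiated shift will misjudge the window by a factor of `2 ** shift`, which is one way a change in this value could stall connections on some stacks while leaving others untouched.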
