Our ticket number is 2371956 and it is not getting much love or feedback from the support team. Seeing as this is a paid service, I am a little frustrated: it has been over 8 days now with no appropriate response to the support ticket other than that it has been escalated.
We have dropped our pools down to 1 (from 2), but we have between 2 and 5 origins within each pool and the problem just ‘comes and goes’. It can be fine for three-quarters of a day, then it causes an issue for 2-3 hours, and then it starts working properly again.
Are people getting around the issue by reducing their setup to 1 origin pool with 1 origin server (i.e. not even load balancing)?
All the tickets mentioned in this thread have been added to the escalation here. The last I heard is that it was escalated to the engineering team. I have not seen an update since then.
Just to provide an update here.
There is a confirmed issue here.
Cloudflare is also able to reproduce the trouble, and thank you to @Nosco for the setup that allowed us to identify it.
While the trouble is easy to reproduce, Engineering is still isolating the ‘why’ behind the behaviour. I do understand the community would like some updates on what is being done. All of your experiences should be similar: the session suddenly switches to Origin2 and then recovers back to Origin1.
There are multiple possible reasons, and Engineering needs some time to figure out ‘why’ it happens, ‘when’ it happens, and ‘why’ it recovers.
The Engineering Team is treating this as high priority. Please kindly hold for Engineering to respond. Do check back on your tickets and this community thread for further updates. I will ensure this thread remains open.
We are experiencing the same problem. Despite having Cookie-based Session Affinity enabled in our Load Balancer (with IP fallback), many of our clients are being sent to the wrong origin. The problem appears to have started about two weeks ago. We have increased the lifetime of the session affinity cookie in the Load Balancer control panel, but this does not appear to have had an effect. We have created a support ticket (2379346) for the issue and would be grateful if it could be escalated along with the others mentioned in this thread.
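For anyone less familiar with the setup, here is a rough sketch of what cookie-based session affinity with IP fallback is generally expected to do. It is an illustration only, not Cloudflare's implementation: the cookie name `lb_affinity`, the function `pick_origin`, and the hash-based fallback are all assumptions made for the example.

```python
import hashlib
from typing import Mapping, Sequence

# Hypothetical cookie name, used only for this illustration.
AFFINITY_COOKIE = "lb_affinity"


def pick_origin(cookies: Mapping[str, str], client_ip: str,
                origins: Sequence[str]) -> str:
    """Pick an origin: honour the affinity cookie, else fall back to the IP."""
    pinned = cookies.get(AFFINITY_COOKIE)
    if pinned in origins:
        # The cookie names a known origin, so the session stays pinned to it.
        return pinned
    # Fallback (assumed scheme): hash the client IP so that the same IP
    # keeps landing on the same origin while the origin list is unchanged.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(origins)
    return origins[index]
```

If the cookie is never read, every request takes the fallback path, so a client whose IP changes or who sits behind a shared address can end up on a different origin than the one the cookie names.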
Thank you for the report and ticket number.
I have linked the ticket to our problem Engineering ticket.
The problem ticket will be updated once Engineering responds.
Any updates at all from Engineering?
It feels very unusual that an issue of this importance, which is clearly affecting quite a number of users, has been allowed to continue without any updates or a resolution for such a long time.
It’s been going on for almost 2 months. Our client even stopped using the LB because of this, and it has affected our image because we were the ones who actually recommended Cloudflare for their load balancer. They wanted to use Google Cloud’s balancer, and are now migrating to it.
I too am thinking about migrating. It’s been going on for too long, and that is really not professional. I really have the feeling that we are being treated as low priority, or even less…
We are experiencing the same issue as described above. I opened a support ticket as well, on a Pro account: 2375431. Looking forward to a fix, as more of our customers are having these troubles with their load balancing. We have to turn off the LB until this works properly again.
I’m so sick of it! Does anyone know of a Cloudflare competitor that does load balancing?
Digital Ocean, AWS, etc.
We will likely move to Digital Ocean if this isn’t resolved fairly soon.
We received a notification from Cloudflare that the LB affinity issue has had a ‘fix’ deployed overnight… not sure if others received the same confirmation.
We are about to test.
Got the same notice, will be checking it out as well.
A fix was released by Engineering. Can you test and advise if the trouble is resolved?
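For those testing, one simple way to check for the symptom described earlier in the thread (a session flipping to another origin and back) is to record which origin answered each request in a single session and look for changes. The sketch below assumes you can already collect that list yourself, for example from a header your own origins add to responses; the function name `find_origin_switches` and the sample origin labels are made up for the example.

```python
from typing import Iterable, List, Optional, Tuple


def find_origin_switches(observed: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (request index, previous origin, new origin) for every change."""
    switches: List[Tuple[int, str, str]] = []
    previous: Optional[str] = None
    for index, origin in enumerate(observed):
        if previous is not None and origin != previous:
            # The session was served by a different origin than last time.
            switches.append((index, previous, origin))
        previous = origin
    return switches


if __name__ == "__main__":
    # Origins seen by one session, in request order (sample data).
    session = ["origin1", "origin1", "origin2", "origin1"]
    for index, old, new in find_origin_switches(session):
        print(f"request {index}: {old} -> {new}")
```

With affinity working, a single session should produce an empty list for as long as its origin stays healthy.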
Superb. We have updated our users and will report if there are any problems.
We activated the balancer a few hours ago and it seems to be working fine. Thanks!
Thank you for helping to alert us to this issue. As a bit of context regarding what you’ve noticed:
Cloudflare was testing a new server module. Currently we use a C-based server module to connect to your origin. This has been in place for many years.
Cloudflare has decided to update this module by rewriting it in Rust.
Following these changes we expect performance improvements and the ability to support additional asynchronous-communication features.
In certain edge cases, session affinity was not working as expected. The newly-written module was, in certain circumstances, not fetching the cookie that is set to manage this.
We have now applied a fix that will resolve the cookie not being fetched.
If you notice more issues, please do not hesitate to contact us or file a ticket so we can look into it. I will be mass-closing the affected users’ tickets using this response after the weekend. Please open new tickets if you notice further trouble.
This was truly a team effort, between Cloudflare and yourselves, to narrow down this pesky random behaviour which affected a large number of you.
This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.