Sorry for not providing more info in the original post, but I kept getting this error when submitting my issue: “Sorry, new users can only put 4 links in a post.” So I ended up making dozens of attempts at posting and this is the one that did it.
Here is a more detailed explanation:
We have a web app hosted on Heroku, served through proxied CNAME records in our Cloudflare DNS zones. For the past 6 days, HTTP calls to it have randomly failed with either “SSL Certificate Errors” or 502 Bad Gateway errors.
We contacted Heroku support. Here is their response:
“I have thoroughly reviewed this and found no issues on the platform. The certificate appears to be in order. Could you please check with the domain provider to see if there are any ongoing issues?”
Given this answer, and the fact that a “502 Bad Gateway” page specifically signed “cloudflare” is documented online as an internal error from Cloudflare’s proxies, we contacted Cloudflare and filed our support ticket as a “registrar issue”. Here is their response:
“It appears that you may have accidentally picked the wrong category for this issue since you chose a Billing/Registrar case, but your account is a Free account. While we only offer direct support to customers who have a Pro, Business, or Enterprise account, we do offer resources for everyone.”
Since it is pretty clear that this issue is no longer actionable on our side, we are at a loss for solutions.
One of our websites (WordPress) has started getting the same issue (it used to work fine). We tried temporarily pausing Cloudflare for that domain and everything worked as expected, but the issue returned when we unpaused it. All of the domain’s CNAMEs are proxied and SSL/TLS is set to Full (strict). In addition, in Firefox we get the message shown in the screenshot, while Edge and Chrome load very slowly and then hit the same issue.
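Since Full (strict) means Cloudflare validates the certificate presented by the origin, one thing that can be checked independently of Cloudflare is what the origin serves when contacted directly. Below is a minimal sketch, not a definitive diagnostic. The host names in the usage comment are hypothetical placeholders, and the function names are my own. Note that the local trust store does not include the Cloudflare Origin CA, so an origin using one of those certificates would fail this check even though Cloudflare accepts it.

```python
import socket
import ssl
import time


def days_remaining(not_after, now=None):
    """Days until a certificate's notAfter timestamp (negative if expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def fetch_origin_cert(origin_host, server_name, port=443, timeout=10.0):
    """Handshake with the origin directly, sending server_name as SNI.

    Raises ssl.SSLCertVerificationError if the chain or hostname does not
    validate against the local trust store.
    """
    context = ssl.create_default_context()
    with socket.create_connection((origin_host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=server_name) as tls:
            return tls.getpeercert()


# Usage (hypothetical names, replace with the real CNAME target and domain):
#   cert = fetch_origin_cert("your-app.example.herokudns.com", "www.example.com")
#   print("expires in %.1f days" % days_remaining(cert["notAfter"]))
```

If the handshake succeeds here every time while the errors keep appearing through the proxy, that would at least suggest the origin certificate itself is not the intermittent part.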
Thank you for your help! Here is all the info I could gather.
I tried pausing Cloudflare and could not reproduce any of the errors (“SSL Certificate error” or “Bad Gateway”) when Cloudflare was paused. Doesn’t mean it’s fixed, as the error appears very randomly.
I cannot give you a definite answer on whether our app is configured to listen on both 80 and 443 on the origin host. What I can tell you, as user @muonnoi seems to report as well, is that we haven’t changed anything related to listening ports, or anything major related to HTTP access, in the past 5 years, and the error only started to appear in the last 2 weeks.
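For the question about ports 80 and 443, a quick way to get a definite answer is to attempt a plain TCP connection to each port on the origin host. This is only a sketch: the function names are my own and the host in the usage comment is a hypothetical placeholder for the real CNAME target. It tells you whether something accepts connections on the port, not whether the app behind it responds correctly.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def check_ports(host, ports=(80, 443), timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Usage (hypothetical host name):
#   print(check_ports("your-app.example.herokudns.com"))
```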
Yes, the CNAME records are proxied: they have the little orange cloud. This is what I meant by “proxied CNAMEs”.
This is a continuation of my previous issue, which was closed but not resolved. Our users are reporting very frequent errors while using our app. It can occur more than 10 times in a minute for a given user, on any type of HTTP call to our domain. It started occurring about 2 months ago and has not stopped since.
What steps have you taken to resolve the issue?
Contacted Heroku support, they say everything is fine on their end and directed me to my DNS supplier, so here I am!
I tried opening a support ticket but got rejected, so I created a topic here but it got locked without any solution.
What feature, service or problem is this related to?
I don’t know
What are the steps to reproduce the issue?
Browse the app at backend.fyctia.com (not very doable if not authenticated), interacting with its various routes until the SSL error appears.
I have further information on the issue and how to reproduce it.
It also occurs on random calls to our API from our frontend app at https://www.fyctia.com. So if you browse the app with the network tab of your devtools open (filtering on XHR only, and “api” in the route) you will probably end up triggering the error, which displays as follows in Firefox:
Sorry, I just saw your message. My previous message probably answers your first question!
As for the second one, yes I have access to the webserver’s logs, but I (as well as Heroku support) can testify that they are of little use, since it seems pretty clear that these errors occur before reaching our server.
At least I can vouch that I managed to trigger the error before and could attest that nothing related to that call was output in the logs, not even the HTTP request.
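Because nothing shows up in the origin logs, the failures have to be recorded on the client side. Here is a minimal sketch of a probe that requests a URL once and labels the outcome as a TLS error, an HTTP status, or another network error. It also keeps the CF-RAY response header when there is one, since that ID identifies the request on Cloudflare’s side. The function names and labels are my own, and the URL in the usage comment is whichever API route you want to test.

```python
import http.client
import ssl
import urllib.error
import urllib.request


def classify_error(exc):
    """Turn an exception raised by urlopen into a short label."""
    if isinstance(exc, urllib.error.HTTPError):
        return "http_%d" % exc.code
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, ssl.SSLError):
        return "tls_error"
    if isinstance(exc, ssl.SSLError):
        return "tls_error"
    return "network_error"


def probe(url, timeout=10.0):
    """Request url once; return (label, CF-RAY header value or None)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "http_%d" % resp.status, resp.headers.get("CF-RAY")
    except urllib.error.HTTPError as exc:
        ray = exc.headers.get("CF-RAY") if exc.headers is not None else None
        return classify_error(exc), ray
    except (OSError, http.client.HTTPException) as exc:
        # URLError, ssl.SSLError and timeouts are all OSError subclasses.
        return classify_error(exc), None


# Usage: call probe() in a loop against an API route and log each result
# with a timestamp, e.g. print(datetime.now().isoformat(), *probe(url)).
```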
The example I described occurred this morning, within the first minute of browsing the frontend, and I stopped trying after getting that one.
But indeed, to indulge you I just went ballistic on the website again for the past 5 minutes, and could not get a single failing request over about 1000 calls…
I am wondering if there might be an element of time period involved, like it depends on the global throughput or some usage on cloudflare’s side…
I have asked my users whether the error seems to occur more around certain times than others; I will report on this tomorrow.
Indeed, this morning it apparently did not occur a single time (and I cannot trigger it either), and we have changed strictly nothing anywhere on our app since then.
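To test the time-of-day hypothesis with something firmer than user impressions, timestamped probe results can be grouped by hour. This is a minimal sketch assuming each record is a (datetime, ok) pair collected by whatever probing you run; the function name and record shape are my own choices, and the timestamps in the usage comment are made up.

```python
from collections import Counter
from datetime import datetime


def failure_rate_by_hour(records):
    """records: iterable of (datetime, ok) pairs; returns {hour: failure rate}."""
    totals = Counter()
    failures = Counter()
    for when, ok in records:
        totals[when.hour] += 1
        if not ok:
            failures[when.hour] += 1
    return {hour: failures[hour] / totals[hour] for hour in sorted(totals)}


# Usage with made-up records:
#   failure_rate_by_hour([
#       (datetime(2024, 1, 1, 9, 0), False),
#       (datetime(2024, 1, 1, 9, 5), True),
#       (datetime(2024, 1, 1, 14, 0), True),
#   ])
```

A clear spike in some hours would support the idea that the errors depend on load or timing rather than on our configuration.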
So it does appear to vary on… something, which makes fixing this issue by trial and error fairly complicated…
I have no idea how to identify the source of the issue from our side.
Haha, that would be quite the coincidence, as I believe there have already been periods where the bug did not occur as much in the past 2 months.
Still, we are trying out a few things and our team will monitor for any resurgence over the following week, which may help us isolate possible factors in this behavior.