Intermittent 526 errors with custom hostnames

What is the name of the domain?

ensinoagil.com.br

What is the error number?

526

What is the error message?

Invalid SSL certificate

What is the issue you’re encountering?

CF will sometimes reject the certificate (SSL alert number 42)

What steps have you taken to resolve the issue?

I tried using a new origin certificate

What are the steps to reproduce the issue?

Please see the screenshot for the details, I’m sorry about it, but the forum form keeps saying I’m trying to spam links.

Screenshot of the error

Hello,

What is the error number?
526

Error 526 occurs when these two conditions are true:

  1. Cloudflare cannot validate the SSL certificate at your origin web server, and
  2. Full (strict) SSL is set in the Overview tab of your Cloudflare SSL/TLS app.

For a potential quick fix, set SSL to Full instead of Full (strict) in the Overview tab of your Cloudflare SSL/TLS app for the domain.

Kindly review this document for further information: Troubleshooting Cloudflare 5XX errors · Cloudflare Support docs

Thank you

This will disable certificate validation and is highly discouraged.

The encryption mode should always be Full (strict), otherwise the site is at risk. And the OP already pointed that out correctly.

Exactly. Also note that the error is intermittent. That’s the whole point. Disabling the strict mode is, at best, a hack.

I’m an experienced CF user. I’m quite sure this is a bug with Cloudflare and not a mistake on my end.

Is there any chance we could get an engineer to take a look at this?

I understand your sentiment and I did read that you are not brand-new to the platform :slight_smile: - still, the certificate validation is one of the things that actually usually works at Cloudflare.

Are you sure that it’s not your server that might be intermittently presenting an invalid certificate? The custom hostname itself should not really be at play here; the issue should rather be a mismatch between the certificate and the configured maindomain.com hostnames.

Also, the 521 would point towards a completely different issue of the server not being reachable at all.

Do you have the possibility to extend your logging to also log which certificate was presented when your server logs the SSL issue?

Hi, sorry for the late reply.

I’m quite sure Nginx is not intermittently presenting an invalid certificate. There’s only one vhost and it is configured to always use that certificate. I mean, it would be unheard of that Nginx would just bug out like that.

Please disregard the error 521 thing, that actually turned out to be something else.

I’m not sure what kind of extended logging could be done, if you have suggestions please let me know.

I shall insist on my theory that CF is slowly rolling out a bogus update to their servers. The error rate is already at 5% and keeps going up every day.

Naturally, I cannot rule it out, but I really doubt that this is an issue on Cloudflare’s side. There are intermittent hiccups on Cloudflare’s side, but certificate validation has never been one of them so far.

If Nginx is the front-server, I would try to enable some sort of SSL debug logging to see which certificate is served for each request and then use that log whenever you run into a 526. Please refer to the Nginx community for details on what is available here and how to enable it.
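As a complement to server-side logging, here is a minimal sketch that probes the origin directly and records a fingerprint of whichever certificate it presents for a given SNI. It only shows what the origin sends to this particular client, not what Cloudflare receives. The function names, and the `origin_ip` and `sni` parameters, are hypothetical and not something from this thread.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def presented_fingerprint(origin_ip, sni, port=443, timeout=5.0):
    """Connect to the origin with the given SNI and fingerprint its leaf certificate.

    Validation is deliberately switched off: the goal is to record what the
    server sends, not to decide whether it is acceptable.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((origin_ip, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            return fingerprint(tls.getpeercert(binary_form=True))
```

Calling `presented_fingerprint` in a loop for a custom hostname and for a subdomain of the main domain, and comparing the results, would show whether the origin ever answers with a different certificate.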

Another thing you can try is to simplify your CNAME chain. Considering you use CNAME entries, maindomain.com should actually work just fine for your certificates and the custom hostname should only be relevant for the proxy certificate. But maybe there is a glitch somewhere and you can simplify the setup.

As mentioned, I certainly cannot rule out an issue with the certificate validation, but I would be surprised if this really were an issue on Cloudflare’s side, because that usually works. The issue here should not be the custom hostname, but rather a mismatch between the server certificate and the hostname. With the CNAMEs in place, however, said domain should actually be accepted just fine.

Hi, I have new evidence that this is not a problem on my end. I have decided to investigate the problem with Wireshark. Here are the results.

First of all, I forced Nginx to use only TLS 1.2 and the AES256-GCM-SHA384 cipher. This is because TLS 1.3 encrypts the handshake, making the problem harder to diagnose. This simple cipher also excludes the possibility of a ServerKeyExchange message, since we are using RSA key exchange.
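To double-check that the restriction is really in effect, a client pinned to the same parameters can connect and report what was negotiated. This is only a sketch: the function names and the `origin_ip` and `sni` parameters are hypothetical, and it assumes the local OpenSSL build still offers the AES256-GCM-SHA384 suite.

```python
import socket
import ssl


def tls12_rsa_context():
    """Client context limited to TLS 1.2 and the AES256-GCM-SHA384 suite."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE     # we only inspect the negotiation
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    ctx.set_ciphers("AES256-GCM-SHA384")
    return ctx


def negotiated(origin_ip, sni, port=443, timeout=5.0):
    """Return (protocol version, cipher name) agreed with the origin."""
    ctx = tls12_rsa_context()
    with socket.create_connection((origin_ip, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            return tls.version(), tls.cipher()[0]
```

If the server is configured as described, the handshake should succeed with exactly this version and cipher; a handshake failure would suggest the server does not accept that combination.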

The exchange is as follows:

  • CF sends a ClientHello with SNI indicating the hostname.
  • The server responds with a ServerHello, Certificate, ServerHelloDone.
  • Whether the connection is successful or not, the certificate is always the same. I triple checked this.
  • Most of the time, the connection is successful.
  • When the connection fails, the server sends a TLSv1.2 Record Layer: Alert (Level: Fatal, Description: Bad Certificate) and drops the connection (FIN, ACK).

Up next, I ran the capture for ~15 minutes. Then I checked, for each instance of the problem, the hostname indicated in the SNI field of the ClientHello. It’s always the case that it’s a custom hostname. Requests to a subdomain of the main domain (which are explicitly covered by the certificate) never fail. Requests to custom hostnames intermittently fail with the bad certificate error.

Stats of the capture:
ClientHello: 2585 packets
Alert (Bad Certificate): 30 packets
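For anyone who wants to repeat the per-hostname breakdown, here is a rough sketch. It assumes the capture has been exported to three columns per packet (TCP stream number, SNI, alert description number), which is an assumed layout and not the format used in this thread; the function name is hypothetical. It also computes the overall rate from the two numbers above, assuming one ClientHello per connection.

```python
from collections import Counter

BAD_CERTIFICATE = 42  # TLS alert description number for bad_certificate


def failures_by_sni(rows):
    """Return {sni: (failed_connections, total_connections)}.

    rows: iterable of (tcp_stream, sni, alert_desc) string tuples; sni is
    set on ClientHello rows, alert_desc on alert rows, otherwise empty.
    """
    sni_of_stream = {}
    failed_streams = set()
    for stream, sni, alert in rows:
        if sni:
            sni_of_stream[stream] = sni
        if alert and int(alert) == BAD_CERTIFICATE:
            failed_streams.add(stream)
    total = Counter(sni_of_stream.values())
    failed = Counter(
        sni_of_stream[s] for s in failed_streams if s in sni_of_stream
    )
    return {host: (failed[host], total[host]) for host in total}


# Overall rate from the capture above: 30 alerts for 2585 ClientHellos.
overall_rate = 30 / 2585
print(f"{overall_rate:.2%}")  # prints 1.16%
```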

But wouldn’t the following suggest that it actually is?

The server seems to drop the connection here and refers to a bad certificate, which is not coming from Cloudflare, but from your server.

That is to be expected. The whole request is for that particular hostname, the main domain is only used for the CNAME entries (and certificates for this domain will also be accepted).

IMHO, the error would indicate that Nginx has some issues handling the certificate. Just a stab in the dark, but could it be that you need to configure the whole certificate chain and Nginx currently cannot verify that and hence aborts the connection?

Alternatively, you don’t have somewhere in your configuration client certificate authentication configured, right?

My bad. I meant the client. It’s CF that sends the alert and drops the connection, not Nginx.

I do not have client certificate authentication configured.

I have also discovered something new. If you proxy the custom hostname through CF (thus flattening the CNAME chain - such that a DNS lookup returns a simple A record to a CF node) the error goes away. If you do not proxy it (such that the DNS resolves a CNAME record to my main domain), the error happens.

In other words, O2O (orange-to-orange) works perfectly.

This does not lead to a fix, though, because most of our SaaS clients (who bring their own hostname) do not use Cloudflare and have to rely on the CNAME record (in fact this is standard procedure and recommended by the CF SaaS docs).
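To tell which of the two cases a given custom hostname falls into, something like the following could be used. It is a sketch that relies on the system resolver reporting the canonical name, which may vary by platform; a dedicated DNS library would be more reliable. The function names are hypothetical.

```python
import socket


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop a trailing dot."""
    return name.rstrip(".").lower()


def goes_through_cname(queried: str, canonical: str) -> bool:
    """True when the canonical name differs from the name that was queried."""
    return normalize(queried) != normalize(canonical)


def describe(host: str) -> str:
    """Resolve host and say whether the answer came via a CNAME."""
    canonical, _aliases, addresses = socket.gethostbyname_ex(host)
    kind = "CNAME" if goes_through_cname(host, canonical) else "direct A record"
    return f"{host}: {kind} -> {canonical} {addresses}"
```

A proxied (flattened) hostname would be expected to show up as a direct A record, while a DNS-only one would show the CNAME target.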

and certificates for this domain will also be accepted

Yes, that should be the case 100% of the time, but it’s only 99%.

Do you use CNAMEs for subdomains or for the apex?

Can you give any actual examples?

Yes, please see the attached image (it says I can’t post links)

You can use the preformatted text option (ctrl+e) to prevent text from being converted to links.

Could you share the IP address of your origin (via private message if you don’t want it public)?

Do all of your custom hostnames use the same fallback origin, or do you have multiple origins for custom hostnames (Enterprise feature)?

Also, can you set the test domain you shared to DNS-Only or do you need it to work reliably right now?

Sure, I’ll PM you the IP. All custom hostnames use the same fallback origin.

I have also set that subdomain to DNS only for testing.

Fair enough, but that seems to be suggested by the 526 anyhow. Did you verify the server actually sent an acceptable certificate? Did you log which certificate was used for that request? If not, we still can only assume it was the right one.

IMHO, this is where the issue will be. I may be wrong of course :slight_smile:

And you managed to verify this consistently? I am asking, as this might be a hint for Cloudflare as to what is happening. However, I’d still somewhat doubt that to be the issue, as the ultimate IP address will in either case be a Cloudflare proxy, so orange-to-orange should not matter much here.

Did you verify the server actually sent an acceptable certificate?

I did. It’s always the same certificate, which is accepted 99% of the time.

And you managed to verify this consistently?

Yes, it’s consistent behavior.

Fair enough then. Even though I still somewhat doubt that it’s a verification issue (this really does usually work, unlike other things), I understand you have performed the necessary steps to verify that it is not a server configuration issue. At this point, only someone with access to Cloudflare’s infrastructure could provide more insight, I am afraid.

Just to clarify, you checked this server-side, not just client-side, correct? I am asking, as there was recently a similar validation issue, where the server actually kept sending valid certificates to every client but Cloudflare, which also threw a 526 of course. That was impossible to debug from a non-Cloudflare perspective.

Would you have any idea how to draw their attention to this issue?

Yes, the Wireshark capture I mentioned earlier refers to CF IPs, e.g. 162.158.159.184. The certificate was correctly sent and 162.158.159.184 dropped the connection with alert number 42.

That’s the tricky part :smile:

Sorry for asking again, but you verified this for this particular connection, that it really is the right certificate, right? Not just assuming by the apparent configuration for this virtual hostname?