Random 520 errors

My problem is similar to the long thread ending with Sometimes a CF 520 error - #170 by domjh. I have asked visitors to post Ray IDs and CDN information. Here’s a post reporting three of them from the same user:

https://www.home-barista.com/news/getting-http-520-error-code-t75325-40.html#p816285

Ray ID: 6897d75fbc325425
URL: Degassing whole bean coffees
CDN: fl=29f134
h=www.home-barista.com
ip=…
ts=1630765110.176
visit_scheme=https
uag=Mozilla/5.0 (Android 11; Mobile; rv:92.0) Gecko/92.0 Firefox/92.0
colo=YYZ
http=http/3
loc=CA
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off

Ray ID: 6897ea689de853ef and 6897ed535ab5548b
URL: Home-Barista.com - Login
CDN: Same as above.

It’s worth noting that in the ~week since these reports started, I personally have only seen it two times (ATL). I’ve checked the Apache logs and PHP logs; there are no errors that I can find. This is one that just occurred:

Ray ID: 689829211f94f305
URL: Getting HTTP 520 Error Code - News and Suggestion Box - Page 6

fl=27f316
h=www.home-barista.com
ip=…
ts=1630768794.62
visit_scheme=https
uag=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15
colo=ATL
http=http/2
loc=US
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off

Suggestions? I’ve added the Ray ID to the Apache logs and reviewed Community Tip - Fixing Error 520: Web server is returning an unknown error

Users have reported it happened more frequently with mobile phones; not sure if that’s relevant, but including these 520s from an iPhone. The two ray IDs below are not in the HTTP logs (I added the ray ID as suggested).

Another user mentioned that the 520 is nearly instantaneous, suggesting a blocked connection. I do have iptables set up to only allow connections from Cloudflare IPs from https://www.cloudflare.com/ips-v4:

ACCEPT     tcp  --  131.0.72.0/22        anywhere             multiport dports http,https
ACCEPT     tcp  --  172.64.0.0/13        anywhere             multiport dports http,https
ACCEPT     tcp  --  104.24.0.0/14        anywhere             multiport dports http,https
ACCEPT     tcp  --  104.16.0.0/13        anywhere             multiport dports http,https
ACCEPT     tcp  --  162.158.0.0/15       anywhere             multiport dports http,https
ACCEPT     tcp  --  198.41.128.0/17      anywhere             multiport dports http,https
ACCEPT     tcp  --  197.234.240.0/22     anywhere             multiport dports http,https
ACCEPT     tcp  --  188.114.96.0/20      anywhere             multiport dports http,https
ACCEPT     tcp  --  190.93.240.0/20      anywhere             multiport dports http,https
ACCEPT     tcp  --  108.162.192.0/18     anywhere             multiport dports http,https
ACCEPT     tcp  --  141.101.64.0/18      anywhere             multiport dports http,https
ACCEPT     tcp  --  103.31.4.0/22        anywhere             multiport dports http,https
ACCEPT     tcp  --  103.22.200.0/22      anywhere             multiport dports http,https
ACCEPT     tcp  --  103.21.244.0/22      anywhere             multiport dports http,https
ACCEPT     tcp  --  173.245.48.0/20      anywhere             multiport dports http,https

I will try removing the http/https blocking; I believe it’s correct above, but can’t hurt to verify.
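One way to verify the allow list is to compare it against the published ranges. This is only a sketch: it assumes the published list was saved to one file (for example from https://www.cloudflare.com/ips-v4) and the rule listing shown above (`iptables -L` or `iptables -L -n`) to another. The function name and both file arguments are hypothetical.

```shell
# missing_cf_ranges PUBLISHED RULES
# Print every CIDR in PUBLISHED (one per line) that does not appear as a
# source address in RULES. Hypothetical helper; assumes the default
# `iptables -L` column layout, where the source is the 4th column.
missing_cf_ranges() {
    published=$1
    rules=$2
    # The `|| [ -n "$cidr" ]` keeps a final line that lacks a trailing newline.
    while IFS= read -r cidr || [ -n "$cidr" ]; do
        [ -n "$cidr" ] || continue
        # Exact match on the source column; awk exits 0 only if found.
        awk -v want="$cidr" '$4 == want { found = 1 } END { exit !found }' "$rules" \
            || printf '%s\n' "$cidr"
    done < "$published"
}

# Example (file names are placeholders):
#   missing_cf_ranges ips-v4.txt iptables-rules.txt
```

No output means every published range has a matching rule. Note that `iptables -L -v` adds columns, so the field number would need adjusting for that format.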

Ray ID: 68983bf0ae4cd314 and 689840e639b263f6
URL: Getting HTTP 520 Error Code - News and Suggestion Box - Page 6

fl=27f148
h=www.home-barista.com
ip=…
ts=1630769306.485
visit_scheme=https
uag=Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.7 Mobile/15E148 DuckDuckGo/7 Safari/605.1.15
colo=ATL
http=http/2
loc=US
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off
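The /cdn-cgi/trace dumps above are plain key=value lines, so individual fields such as colo or http are easy to pull out when comparing reports from different visitors. A minimal sketch, assuming the trace text is saved or piped in; the name trace_field is made up for illustration.

```shell
# trace_field KEY
# Read /cdn-cgi/trace output on stdin and print the value of KEY.
# Hypothetical helper, not a Cloudflare tool.
trace_field() {
    awk -v key="$1" '
        {
            sub(/\r$/, "")          # tolerate CRLF line endings
            i = index($0, "=")      # split on the first "=" only
            if (i > 0 && substr($0, 1, i - 1) == key) {
                print substr($0, i + 1)
                exit
            }
        }'
}

# Example (needs network access, so it is left commented out):
#   curl -s https://www.example.com/cdn-cgi/trace | trace_field colo
```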

Do you by any chance have HTTP/2 enabled but maybe not properly configured? Could you try disabling HTTP/2 completely and seeing if the issue persists?

Thanks for the reply @Walshy – I don’t (knowingly) have HTTP/2 enabled. To try to eliminate suspects, I started with a freshly provisioned server.

# httpd -v
Server version: Apache/2.4.6 (CentOS)
Server built:   Nov 16 2020 16:18:20

Apart from the stock httpd.conf configuration, I added <VirtualHost> and enabled SSL. Since I’m using the Cloudflare-supplied certificate, the SSL/TLS encryption mode is “Full”.

As part of my diagnostic efforts, I tried setting development mode ON. In the past, that has helped with odd random Cloudflare errors. It didn’t make any difference this time.

More information: The Cloudflare Diagnostic Center reports OK except "slow_ttfb_on_cache - The response was slower than 800ms and the requested resource is cacheable on Cloudflare." I’m not sure what to make of that as the overall site performance is very good.

The certificate is all right, but you should change to “Full Strict”.

Thanks @sandro – as I understand it, “Full Strict” is for when you’re using a non-Cloudflare certificate. I could do that, but I’m not aware of any functional difference, apart from the user seeing a different CA.

At one time I did have it set up that way so I could connect to the origin server without any browser warning, but since there are preferences that override that and it’s not something I do much, I haven’t bothered getting a “real” certificate. I assume it wouldn’t impact the problem I’m seeing (?).

FWIW, in the past, I’ve also used the “Flexible” option when I didn’t want to bother with SSL setup.

I’m afraid that’s not accurate. Full Strict requires a valid certificate, and in a Cloudflare context both public certificates and Origin certificates are valid. So, yes, change it to Full Strict.

It’s good you changed that because that was a bad choice which kept your site without encryption.

@sandro Ah, you’re right, I should have read more closely!

Full Strict: Encrypts end-to-end, but requires a trusted CA or Cloudflare Origin CA certificate on the server

Back to my problem, I added the ray ID to the page footer. That way, I can sanity check that my server-side logging is working. I confirmed that the ray ID shown on a given page appears in the HTTP logs. When I search using one of the reported 520 error ray IDs, it’s not in the log (e.g., as a status code 500 or 503).

As I understand it, that means the request is never reaching the web server; if it were a code exception or a database connection failure, I’d expect to see a 500 error logged. That makes it very hard to diagnose from my end. :thinking:
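To check a batch of reported Ray IDs against a log in one pass, something like the following works. It is a sketch only: the function name is illustrative, and the log path in the usage comment is an assumption about a stock CentOS layout.

```shell
# check_rays LOGFILE RAY_ID...
# Report, for each Ray ID, whether it appears anywhere in LOGFILE.
# Hypothetical helper; it relies on the Ray ID having been added to the
# log format, as described in the post above.
check_rays() {
    log=$1
    shift
    for ray in "$@"; do
        if grep -qF -- "$ray" "$log"; then
            printf '%s found\n' "$ray"
        else
            printf '%s missing\n' "$ray"
        fi
    done
}

# Example (log path is an assumption):
#   check_rays /var/log/httpd/access_log 6897d75fbc325425 689829211f94f305
```

A Ray ID reported as missing from both the access and error logs is consistent with the request never having reached Apache, though it does not prove it.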

Error 520 server load looks much the same. Since posters on that thread asked about HTTP/2, I wanted to double check by running this on the origin server:

curl -I https://www.example.com -s | grep HTTP

HTTP/1.1 200 OK

As expected, it returns HTTP/2 200 when going through Cloudflare.

When I checked my HTTP, it’s showing 301.

Sorry @muneef, I should have specified an example like curl -I https://www.example.com/ -s | grep HTTP. I’ve updated my post above accordingly. The command would be run on the origin server. If you run it elsewhere, it will be Cloudflare’s result (HTTP/2).

FWIW, you saw a 301 because the trailing slash wasn’t specified. I also have a .htaccess rule to block curl request agents, so technically you have to specify curl -A "some friendly agent" -I ....
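Since the status line is the only part of interest here, a small filter can reduce `curl -I` output to the protocol and status code, which makes the origin versus Cloudflare comparison easy to script. A sketch; first_status is a made-up name.

```shell
# first_status
# Read HTTP response headers on stdin and print "<protocol> <status code>"
# taken from the first status line. Hypothetical helper.
first_status() {
    awk '/^HTTP\// {
             sub(/\r$/, "")     # headers arrive with CRLF line endings
             print $1, $2
             exit
         }'
}

# Example (the user agent string is a placeholder, as in the post above):
#   curl -A "some friendly agent" -sI https://www.example.com/ | first_status
```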

After “pushing hard” on clicks with my mouse, in one of the requests (https://www.home-barista.com/espresso-machines/la-marzocco-linea-mini-mysterious-instructions-t75404.html) I got 520 returned.

May I ask whether your web server and PHP can handle all that?

fl=100f2
h=www.home-barista.com
ip=
ts=1630784139.531
visit_scheme=https
uag=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0
colo=MUC
http=http/2
loc=HR
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off

Sure, it’s three VPS servers (one for the database, two for web servers). I’ve had this configuration for several years; the load average is usually < 0.50 for the web servers and even less for the database server.

As a quick sanity check, I ran this curl script:

curl -A "Perftest" -svo /dev/null https://www.example.com/ -w "\nContent Type: %{content_type} \
\nHTTP Code: %{http_code} \
\nHTTP Connect:%{http_connect} \
\nNumber Connects: %{num_connects} \
\nNumber Redirects: %{num_redirects} \
\nRedirect URL: %{redirect_url} \
\nSize Download: %{size_download} \
\nSize Upload: %{size_upload} \
\nSSL Verify: %{ssl_verify_result} \
\nTime Handshake: %{time_appconnect} \
\nTime Connect: %{time_connect} \
\nName Lookup Time: %{time_namelookup} \
\nTime Pretransfer: %{time_pretransfer} \
\nTime Redirect: %{time_redirect} \
\nTime Start Transfer: %{time_starttransfer} \
\nTime Total: %{time_total} \
\nEffective URL: %{url_effective}\n" 2>&1

On the origin server, Time Total was around 0.3 sec; it’s about double that from the outside.
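To see how often a 520 turns up under repeated requests, the "HTTP Code" lines from several runs of the script above can be tallied. A sketch, assuming the combined output of those runs was saved to a file or piped in; tally_codes is a hypothetical name.

```shell
# tally_codes
# Read the output of one or more runs of the timing script on stdin and
# print "<status code> <count>" per distinct code, sorted by code.
# Hypothetical helper.
tally_codes() {
    awk -F': *' '
        /^HTTP Code:/ {
            sub(/[ \t\r]+$/, "", $2)    # drop trailing whitespace
            count[$2]++
        }
        END { for (c in count) print c, count[c] }' | sort
}

# Example (runs.txt is a placeholder for the saved output):
#   tally_codes < runs.txt
```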

@erictung is right, the HTTP/2 question was about the origin server, not Cloudflare. You can’t disable the HTTP/2 feature on Cloudflare. https://blog.cloudflare.com/delivering-http-2-upload-speed-improvements/ provides the backstory, if you’re interested.

That said, I had the same thought and tried disabling HTTP/3 as you describe. I wouldn’t expect it to make a difference, but figured it couldn’t hurt to be more “vanilla”. It made no difference for me.

By the way, one of my site users reported that disabling the HTTPS Everywhere browser plug-in stopped the 520 errors. It’s a guess, but maybe its rewriting of URLs on the client leads to a monstrously long or malformed URL (?) that Cloudflare rejects. FWIW, I don’t use that plug-in and I’ve seen the 520 errors. :slightly_frowning_face:

All that does is disable Cloudflare. It does not fix the 520 error on Cloudflare. At least it stops that user from using insecure Flexible mode.

It’s a desperate step – “gray clouding” your origin server. You’d be giving up Cloudflare’s (a) edge caching, (b) easy SSL setup, and (c) DDoS protection since your origin servers would be directly exposed. You’d still benefit from Cloudflare’s DNS management; if you have multiple origin servers, you’d still have crude load balancing via Cloudflare’s round-robin DNS.

For what it’s worth, what brought me to using Cloudflare in the first place was the performance hit from bots. Cloudflare’s edge caching takes load off the server and the DDoS service pushes back on malicious bots. It’s a non-trivial performance/admin advantage to give up.

Tip: You can challenge requests for “spam friendly” countries with an IP access rule (add them under Firewall > Tools).

Hi everyone,

Due to the number of 520s being reported in similar circumstances, we are escalating the issue to Cloudflare Support. The original incident should have been resolved and your issue may well be unrelated.

We recommend checking these troubleshooting tips, if you haven’t already.

If you have been through these thoroughly and are not seeing corresponding issues on your network/server and you have a ticket number with Cloudflare, please reply and post that #.

To enable efficient troubleshooting by support, please ensure you include the following on the ticket:

  • example URL(s) where you are seeing the error
  • Ray IDs from the 520 pages
  • output from a traceroute from any impacted user
  • output of example.com/cdn-cgi/trace - replace example.com with the affected domain.
  • Also include two HAR files: one detailing your request with Cloudflare enabled on your website and the other with Cloudflare temporarily disabled - see How do I temporarily deactivate Cloudflare?

@homebarista are you checking your Apache access logs here or your error logs?

If the connection is reset before the request is completed, it may well not appear in your access log. You can use the ErrorLogFormat directive to log the cf-ray HTTP request header from Cloudflare so you can then check to see if those HTTP 520 errors you see correlate with errors in your Apache error log.

Would you do this and let us know the result?

I check both and see nothing out of the ordinary. The error logs are filled with lots of bot accesses like script not found or unable to stat: /var/www/cgi-bin/hgTrackUi, but nothing suggesting a server-side problem.

Thanks for the tip! I added this to the httpd.conf file:

  ErrorLogFormat "[%t] [%l] [pid %P] %F: %E: [client %a] %M %{CF-Ray}i"

I confirmed the ray ID is shown now at the end of the error log message. That said, during this whole 520 investigation, I’ve seen nothing in the access or error logs indicating a server-side problem. It’s my assumption Cloudflare isn’t connecting with the web server, based on the lack of logs and the near-instantaneous 520 error display.

Will do! I’ve seen three so far today and there’s nothing in the logs.

Edited to add the ticket number, for reference: 2247947

Thanks for this - during the day I’ve managed to replicate something that I think may be related - would you try disabling keepalive as an experiment for a while?

This is not a good practise long term for performance, but we think some origins may be handling KeepAlive in a way that means we encounter an error when the keepalive has expired. We already have a possible fix being worked on, but would like to know that your issue is related to this.

In Apache I believe this is done via the KeepAlive Off directive.
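Before flipping the setting, it can help to see which keep-alive directives are currently set. The sketch below lists them from a config file; the function name is made up, and the path in the usage comment assumes a stock CentOS layout.

```shell
# keepalive_settings CONFIG
# Print the active (uncommented) keep-alive directives in an Apache
# config file. Hypothetical helper; it does not follow Include lines.
keepalive_settings() {
    grep -Ei '^[[:space:]]*(KeepAlive|KeepAliveTimeout|MaxKeepAliveRequests)[[:space:]]' "$1"
}

# Example (path is an assumption):
#   keepalive_settings /etc/httpd/conf/httpd.conf
```

If nothing is printed, no keep-alive directive is set in that file and Apache's built-in defaults apply.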
