403 Caching


#1

Hi, I’ve successfully used the cache everything page rule to cache static html using cache control headers set by my origin server.

But I’m having trouble with 403 responses. Is there a way to avoid caching these? I spoke to my hosting provider and they seem unable to add a no-cache header to my 403 pages, so I’m not sure what I can do.

At present, the issue I have is that if a 403 is generated on a post URL, subsequent visits to that URL get the cached 403 error page instead of the article itself. So what I’m looking for is a way to exclude all error pages from caching.

Any help would be greatly appreciated! Thanks!


#2

Hey @marty,

The only way to do this would be with Workers (https://www.cloudflare.com/products/cloudflare-workers/) or with the custom EDGE code we provide in the Enterprise plan.

Indeed, as long as you’re caching everything, we’ll follow the cache-control header we receive from your Origin, and if one is set for the 403, we’ll cache it accordingly.
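
To illustrate the kind of per-status decision such custom code would have to make, here is a minimal sketch in Python. It only models the policy; it is not Workers code and uses no Cloudflare API. The function name `edge_ttl_for` and the "never store anything with status 400 or above" rule are assumptions made for the example.

```python
from typing import Optional


def edge_ttl_for(status: int, origin_max_age: Optional[int]) -> int:
    """Seconds a response should be kept at the edge under this sketch's policy.

    Hypothetical policy (an assumption, not Cloudflare's behavior):
    never store error responses; otherwise use the max-age the origin
    asked for, or 0 if the origin gave none.
    """
    if status >= 400:
        return 0  # 403, 404, 5xx, ...: always go back to the origin
    return origin_max_age or 0
```

With this policy a 403 is never stored, even if the origin happens to send a long max-age with it, while a 200 keeps whatever lifetime the origin requested.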


#3

Hi @stephane

The cache everything page rule is a really useful feature to have. It’s novice friendly and can easily be activated. I noticed it reduced my page load time by about half when caching the html page.

With this in mind, is there any reason why Cloudflare doesn’t offer a simple way to exclude error pages from being cached if the user is unable to set a cache header? I know you said there’s an option in the Enterprise plan, but not everyone can afford that.

Is it possible to suggest this as a new feature? The ability to exclude error pages from caching would make a tremendous difference to people who use the cache everything page rule to speed up their website.


#4

What’s the cache-control configuration for the error pages sent by your hosting provider?


#5

Unfortunately, there isn’t a cache header for 403 pages. It isn’t specified. So when enabling “cache everything”, my 403 pages get cached.

My host is Siteground. I contacted them about this but they were unable to help.


#6

A cache everything rule with no cache-control header will not cache your asset, as long as you don’t have an EDGE TTL specifically set in a page rule that matches your traffic. So the error pages won’t be cached.

The plan is to set cache-control at your application level, leave no cache-control at the host level (which sends the error pages), and apply just a cache everything rule. We should then only cache what is set to be cached through your cache-control policy.
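
The rule described in this post can be written down as a small Python sketch. It is a simplified model of the rule as stated here, not Cloudflare's actual implementation: the function name `would_cache` is hypothetical, `no-cache` is treated as "do not cache" the way this thread uses it, and any per-status default TTLs Cloudflare may apply are ignored.

```python
import re
from typing import Dict, Optional


def would_cache(cache_control: Optional[str], edge_ttl: Optional[int]) -> bool:
    """Simplified model of 'Cache Everything' as described in this thread.

    cache_control: Cache-Control value received from the origin, or None.
    edge_ttl: Edge Cache TTL from a matching page rule, in seconds, or None.
    """
    if edge_ttl is not None:
        # An Edge Cache TTL page rule decides the lifetime on its own.
        return edge_ttl > 0
    if not cache_control:
        # No header and no Edge TTL: nothing asks the edge to store the response.
        return False
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return False
    ages: Dict[str, int] = {}
    for d in directives:
        m = re.fullmatch(r"(s-maxage|max-age)=(\d+)", d)
        if m:
            ages[m.group(1)] = int(m.group(2))
    # s-maxage targets shared caches, so it takes priority over max-age.
    return ages.get("s-maxage", ages.get("max-age", 0)) > 0
```

Under this model, an error page sent with no cache-control and no Edge TTL rule gives `would_cache(None, None) == False`, while an application page sent with `max-age=600` is cached.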


#7

@stephane

I currently have the “cache everything” page rule disabled, but when I had it enabled this was my configuration:

Page rule - Cache Everything (www.mydomain.com/*)

That was the only page rule I configured. The EDGE TTL page rule was not used.

I had also configured Caching > Browser Cache Expiration > “Respect Existing Headers”

With this configuration, if a page returned a 403 response and I then visited that page again, the 403 page was served from cache, as shown by CF-Cache-Status=HIT

Response Headers

This is the response header from the 403 page given out by Siteground. Cache everything page rule is not enabled. I have no control over these response headers. Siteground support were unable to add any cache headers.

[Screenshot: 403 response header]

Cache Everything Enabled

If I enable the “cache everything” page rule, this is the response header I get for a repeated view of a 403 page. The EDGE TTL page rule is not used.

[Screenshot: 403 cache hit]

Good News! :)

stephane, as I was putting this all together to explain the issue clearly to you, I made a discovery that solved the problem.

I added the page rule “Origin Cache Control” and set this to on.
The only other page rule I had enabled was “cache everything”.

When I visited the 403 page again, the cache was being bypassed! Success!

Strangely, there is no no-cache header present. So I’m not sure what’s causing the bypass, but the page is no longer being cached.

[Screenshot: 403 cache bypass]

410/404

My 410/404 pages output a no-cache header, so Cloudflare does not cache them, which is what I would expect. But again, I don’t know why the cache is being bypassed on the 403 page when it doesn’t have a no-cache header and the “Origin Cache Control” page rule is enabled.

[Screenshot: 410 no-cache]

Could you please confirm for me that with my current settings Cloudflare will not cache error pages? I am unsure why the 403 is being bypassed without having a no-cache header. Do you know why?


#8

You shouldn’t rely only on your Firefox inspector: the cache-control you see on your side may differ from the one we receive from your Origin. (We respect two cache levels, EDGE and Browser.)

To be sure you don’t have any cache-control at your origin for error pages, you should send your request directly to your Origin, bypassing Cloudflare.
A cache everything rule without cache-control and with no EDGE cache TTL in a page rule = no cache.
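
One way to run that check from a script is sketched below in Python, using only the standard library. The helper names are hypothetical, and the sketch assumes the origin answers plain HTTP on port 80; an HTTPS-only origin would need `HTTPSConnection` and some care with certificate validation against an IP address.

```python
import http.client
from typing import Dict

# Headers that influence caching decisions.
CACHE_HEADERS = ("cache-control", "expires", "pragma")


def cache_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Pick out the caching-related headers, with lowercased names."""
    return {k.lower(): v for k, v in headers.items() if k.lower() in CACHE_HEADERS}


def fetch_origin_headers(origin_ip: str, host: str, path: str = "/") -> Dict[str, str]:
    """Send a HEAD request to the origin by IP, passing the site's Host header.

    Connecting by IP skips the DNS records that point at Cloudflare, so the
    headers returned are the ones the origin itself produces.
    """
    conn = http.client.HTTPConnection(origin_ip, 80, timeout=10)
    try:
        conn.request("HEAD", path, headers={"Host": host})
        resp = conn.getresponse()
        return dict(resp.getheaders())
    finally:
        conn.close()
```

For example, `cache_headers(fetch_origin_headers("203.0.113.10", "www.example.com", "/wp-content/"))`, where the IP and hostname are placeholders for your own origin. An empty result means the origin sends no caching headers for that URL.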

With “Origin cache-control”, we extend the cache to strictly respect your cache-control configuration, no matter what the file extension is: https://support.cloudflare.com/hc/en-us/articles/200172516-Which-file-extensions-does-Cloudflare-cache-for-static-content-

Without the name of your domain and a specific URL, I won’t be able to confirm whether or not we’ll cache your pages.


#9

A 403 will be cached for up to 5 minutes. If your host is routinely generating 403 forbidden errors, uh… I can’t think of why that would happen intermittently.


#10

@stephane

Please view this url for a 403 page. “Cache Everything” page rule is currently disabled.

https://www.celebrityhealthcritic.com/wp-content/

@cscharff

A 403 is generated for security reasons, e.g., when restricting access for bad bots in .htaccess.


#11

After pausing Cloudflare I get the following response header. I don’t know what else I can do except check in Firefox. It still doesn’t show a cache header, yet enabling “origin cache control” causes the cache to be bypassed for some reason. Maybe there is a hidden header here that I’m not seeing?

403 With Cloudflare paused

[Screenshot: 403 response header with Cloudflare paused]


#12

Now that you have the domain and the picture of the response header from a direct connection to the origin, bypassing Cloudflare, are you able to confirm this? Thanks


#13

Hey,

So for this URL (https://www.celebrityhealthcritic.com/wp-content/), no cache-control header is sent to us, so we don’t cache it by default. That’s why you don’t see cache-status: HIT.

This will be the same with the “Origin Cache-control” page rule activated, since we’re explicitly looking at this header to decide whether or not to cache the asset.

To have this page cached, you’ll need either:

  • Cache Everything + specific EDGE TTL page rule
  • Origin Cache control page rule, plus a cache-control header we can respect, which you would need to arrange with your provider

#14

@stephane

Thank you for your help and clarification, much appreciated!

