When I request an object that doesn’t exist via the bucket’s domain name, it returns a 404 as expected. However, it also returns a cache-control: max-age=86400 header (cache for 1 day).
Interestingly, I just set up a new bucket on another domain to check whether it behaves the same.
For this second bucket, if I try to access an object path that doesn’t exist but has an extension that would usually be cached (e.g. .jpg, .png), it returns a header of cache-control: max-age=14400, which is 4 hours, so that’s shorter. I’m not sure at this point what is causing the difference in max-age between the two buckets/domains.
So it looks like there is a difference in behaviour from one of my domains to the other, but it still isn’t a particularly short cache for a 404 response.
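For anyone wanting to compare their own domains, here is a minimal sketch of how the headers could be inspected from Node.js 18+ (which has a global fetch). The function names and any URL passed to inspect() are placeholders, not anything from my setup.

```javascript
// Extract the max-age value (in seconds) from a Cache-Control header, or null if absent.
function parseMaxAge(cacheControl) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl || "");
  return match ? Number(match[1]) : null;
}

// Request a URL and report the status plus the caching-related headers.
// cf-cache-status is the header Cloudflare uses to report HIT/MISS at the edge.
async function inspect(url) {
  const res = await fetch(url, { method: "HEAD" });
  const cacheControl = res.headers.get("cache-control");
  const maxAge = parseMaxAge(cacheControl);
  return {
    status: res.status,
    cacheControl,
    maxAgeHours: maxAge === null ? null : maxAge / 3600,
    cfCacheStatus: res.headers.get("cf-cache-status"),
  };
}
```

Calling inspect() on a path that doesn’t exist in each bucket would show the 404 status next to whatever max-age each domain hands back.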
You are mixing two caches here. The five minutes apply to the proxy cache and that’s how long it will be cached on the proxies. What you are referring to is the browser cache and that’s unrelated.
Yes, you would need to configure this on the origin - or you could override it with a page or cache rule.
This page, https://developers.cloudflare.com/cache/about/cache-control/, says that the cache-control header is used to tell Cloudflare how to handle caching:
“Set Cache-Control headers to tell Cloudflare how to handle content from the origin.”
I understand that the browser will use the cache-control header to decide whether and for how long to cache an item, but I am under the impression that the proxy cache also adheres to the value that comes back from the origin when deciding how long to cache a response. Is that incorrect? Does the proxy cache only cache 404s for 5 minutes no matter what?
With that in mind, I think there is a problem here: I don’t see a way to control the cache-control header for an object that doesn’t exist in the R2 bucket without applying a blanket rule that overrides the cache-control header for all requests to my connected bucket’s domain. For a bucket of highly static, cacheable objects that would be undesirable, to say the least.
I wouldn’t want to set a cache-control response header of 5 minutes just to satisfy 404 requests…
Sorry if I’m misunderstanding, thanks for your help.
Just in case it’s interesting/useful to any other travellers around these parts - I have at least been able to confirm that if I put an object into R2 with a CacheControl property, that value is returned in the cache-control header for objects that exist.
But as far as I can see it’s not possible to control this header for objects that don’t exist. Unless it’s possible to put metadata onto the “directory” level in R2…?
So essentially your question is how to control the caching for non-existing items in R2, right? @sdayman may have an answer to that, he is our resident caching guy.
Just one more comment from my non-R2 side: if nothing else works, you could use a Worker in which you specifically set the caching time for 404 responses. Of course that would be paid beyond 100,000 requests, so a native way would be more elegant.
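A rough sketch of what such a Worker could look like, assuming it runs on a route in front of the bucket’s domain. The 300-second value and the function names are illustrative assumptions, not anything confirmed in this thread.

```javascript
// Hypothetical TTL for 404 responses, in seconds (an assumption, pick your own).
const NOT_FOUND_MAX_AGE = 300;

// Decide which Cache-Control value to send: a short one for 404s,
// otherwise whatever the origin supplied (possibly null).
function cacheControlFor(status, originValue, notFoundMaxAge) {
  return status === 404 ? `public, max-age=${notFoundMaxAge}` : originValue;
}

// Fetch from the origin and rewrite the header only when it needs to change.
async function handleRequest(request) {
  const originResponse = await fetch(request);
  const original = originResponse.headers.get("Cache-Control");
  const wanted = cacheControlFor(originResponse.status, original, NOT_FOUND_MAX_AGE);
  if (wanted === original) return originResponse;

  // Responses returned by fetch have immutable headers, so make a mutable copy.
  const response = new Response(originResponse.body, originResponse);
  response.headers.set("Cache-Control", wanted);
  return response;
}

// Register the handler when running inside a Worker (service-worker syntax).
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => event.respondWith(handleRequest(event.request)));
}
```

Keep in mind that every request passing through the route invokes the Worker, not just the 404s, which is where the request quota comes in.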
Yep. I already tried the progressive transfer from S3 to R2 last month and I rinsed through over 8 million Worker requests after deploying it on and off for brief periods over the course of 4 days. That’s completely unsustainable for me just to serve static assets. Cloudflare really shines when it comes to caching static assets, but it seems like R2 may be set up incorrectly in returning such a long cache-control header for 404s - I’m aware that R2 is a beta product, so it comes with the territory I suppose.
According to Cloudflare support, it looks like items that are not found in R2 return the cache-control TTL that is configurable under Caching > Configuration from within my account.
As mentioned by @sandro, I was mixing up two types of caching here. Even though Cloudflare returns a HIT on a 404 for my not-found resources, along with a cache-control header whose TTL is longer than the few minutes I expected, it looks like the Cloudflare edge cache will only keep hold of that 404 for a limited time before it is dropped. You’d need an Enterprise plan to set a different cache-control header based on the HTTP status code.
For objects that exist, the cache-control header comes from the CacheControl setting for the object within R2, which I’m assuming Cloudflare will honor to some extent when caching resources. This can be set for an existing folder and existing objects within R2 using the following AWS SDK command:
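The exact command isn’t shown here, so as a stand-in, below is a hedged sketch of a server-side copy with the AWS SDK for JavaScript v3 that copies an object onto itself with a new CacheControl. The bucket name, key, and TTL are placeholders, and I’m assuming R2’s S3-compatible API accepts MetadataDirective the way S3 does.

```javascript
// Build CopyObject parameters that copy an object onto itself with new metadata.
// Note: MetadataDirective "REPLACE" discards existing metadata (e.g. Content-Type)
// unless it is supplied again, so contentType should be passed where it matters.
function buildCopyParams(bucket, key, cacheControl, contentType) {
  // CopySource must be URL-encoded; encode each path segment but keep the slashes.
  const encodedKey = key.split("/").map(encodeURIComponent).join("/");
  const params = {
    Bucket: bucket,
    Key: key,
    CopySource: `${bucket}/${encodedKey}`,
    MetadataDirective: "REPLACE",
    CacheControl: cacheControl,
  };
  if (contentType) params.ContentType = contentType;
  return params;
}

// Send the copy using an S3Client that is already configured for the R2 endpoint.
// The SDK is required lazily so the helper above can be used without it installed.
async function updateCacheControl(client, params) {
  const { CopyObjectCommand } = require("@aws-sdk/client-s3");
  return client.send(new CopyObjectCommand(params));
}

// Example (placeholder values):
// const params = buildCopyParams("my-bucket", "images/logo.png", "public, max-age=86400", "image/png");
```

To cover a whole folder, the same call would be repeated for each key returned by listing that prefix.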
That’ll do a server-side copy of your resources in R2 and update the CacheControl property. Or you can set it when you put the object into the bucket in the first place - see the Node.js code in my previous post.
Thanks everyone, including Cloudflare support, for the help.