How long does Cloudflare keep cached files?

I am trying to understand how often Cloudflare gets rid of the cached HTML files on their nodes.
I was reading some information about it here and here, but it is still not very clear to me.

They say that Cloudflare will respect your origin expires / cache control headers to calculate the Edge Cache TTL.

But how do I specify the expires / cache control headers on my server?
Is it done like this in Apache’s .htaccess file?

<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/jpg "access 1 hour"
ExpiresByType image/jpeg "access 1 hour"
ExpiresByType image/gif "access 1 hour"
ExpiresByType image/png "access 1 hour"
ExpiresByType text/css "access 1 hour"
ExpiresByType text/html "access 1 hour"
ExpiresByType application/pdf "access 1 week"
ExpiresByType text/x-javascript "access 1 hour"
ExpiresByType application/x-shockwave-flash "access 1 week"
ExpiresByType image/x-icon "access 1 hour"
ExpiresDefault "access 1 week"
</IfModule>

Should I use a bigger value if I do not expect those files to change in the short term?
Something like 8 days? So my clients won’t have to request data from my server for 8 days and will get it from Cloudflare’s cache instead?

But… will Cloudflare respect those 8 days of cache and keep a copy of those pages on their nodes for 8 days? (Sounds like a lot of space for them, given their millions of sites.)

Bump? Any answer?

Let me see what I can find about this.

Could anyone provide an update on this please?
Thank you.

The general rule on this is Cloudflare will keep files in cache for up to the length of time you specify. But the caveat is if it’s a seldom-used resource, it will be purged sooner than that. For return visitors, their browser should honor the cache setting and hold onto those resources longer than Cloudflare’s edge cache will.

Let’s see if @ryan heard back on this.


Thank you for your answer. That’s indeed how a caching proxy should behave, but since Cloudflare has a free service, I wonder if they really keep assets with cache-control max-age headers set to one year for that whole period.

Edge Cache maxes out at 1 month. Again, same rules apply…resources are purged if not used frequently.
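To make the "respects your header, up to a cap" idea concrete, here is a rough sketch in Python. It is not Cloudflare's actual logic: the function names are made up for illustration, the "1 month" cap from the post above is assumed to mean 30 days, and the parsing only covers the common Cache-Control directives.

```python
import re

# Assumption: "1 month" from the post above is modelled as 30 days.
ASSUMED_EDGE_CAP_SECONDS = 30 * 24 * 3600


def requested_ttl(cache_control: str):
    """TTL in seconds that a shared cache would read from a Cache-Control
    value, or None if the response should not be stored by a shared cache."""
    value = cache_control.lower()
    directives = {part.strip().split("=")[0] for part in value.split(",")}
    if directives & {"no-store", "private"}:
        return None
    if "no-cache" in directives:
        return 0  # may be stored, but must be revalidated before every reuse
    # s-maxage targets shared caches, so it takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        match = re.search(rf'(?:^|[,\s]){name}\s*=\s*"?(\d+)', value)
        if match:
            return int(match.group(1))
    return None


def effective_edge_ttl(cache_control: str, cap: int = ASSUMED_EDGE_CAP_SECONDS):
    """Requested TTL, limited by the assumed edge cap (hypothetical helper)."""
    ttl = requested_ttl(cache_control)
    return None if ttl is None else min(ttl, cap)
```

With these assumptions, a one-year `max-age=31536000` would be held for at most 30 days at the edge, while `max-age=3600` stays at one hour. Either way the result is only an upper bound, since rarely requested files can be dropped earlier.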

Seems reasonable. Thank you @sdayman, I appreciate it.

Can you be more specific about the seldom used rules? Do you purge caches if they’re not used for a period? How long is that period?

Sorry, no. There’s no set rule, as conditions are constantly changing. If you’re looking for cache permanency, I suggest you check out dedicated CDN companies.

As others have pointed out, Cloudflare will attempt to respect the time for which you want to store files. However, as far as I know, in some scenarios Cloudflare might stop caching files that are barely requested, because they occupy valuable storage that more frequently requested files could use.

I doubt you will receive a concrete answer to that. The CDN decides which files to cache based on many factors that are unlikely to be public; the rule of thumb is that if your file is requested frequently, it will be cached.
Taking a shot in the dark, I'd guess that different plans have different priorities: if a node is running short of storage, it makes sense to remove files from the lower-tier plans first.
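The behaviour described in the posts above (a TTL as an upper bound, with seldom-used files dropped earlier) can be pictured with a toy model. This is only an illustration using plain least-recently-used eviction; Cloudflare's real eviction policy is not public, and the class name, capacity and URLs below are invented for the example.

```python
from collections import OrderedDict


class ToyEdgeCache:
    """Toy model: fixed capacity, per-entry TTL, least-recently-used eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()  # url -> expiry timestamp (seconds)

    def get(self, url: str, now: float) -> str:
        expires_at = self._entries.get(url)
        if expires_at is not None and now < expires_at:
            self._entries.move_to_end(url)  # mark as recently used
            return "HIT"
        self._entries.pop(url, None)  # drop an expired entry, if present
        return "MISS"

    def put(self, url: str, ttl: float, now: float) -> None:
        self._entries[url] = now + ttl
        self._entries.move_to_end(url)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


cache = ToyEdgeCache(capacity=2)
cache.put("/rare.pdf", ttl=8 * 86400, now=0)   # asked to live for 8 days
cache.put("/home.html", ttl=3600, now=1)
home_status = cache.get("/home.html", now=2)   # popular page gets requested
cache.put("/style.css", ttl=3600, now=3)       # cache is full: something must go
rare_status = cache.get("/rare.pdf", now=4)    # evicted long before its TTL ran out
```

In this model `/rare.pdf` is gone after a few seconds despite its 8-day TTL, simply because other files were used more recently and space was limited.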

Does anyone have any idea roughly what “frequently” might mean? Cos the definition could vary hugely.

Are we talking every second, minute, day or week?

For reference, nginx proxy-cache will delete caches after 10 minutes of inactivity by default. I’m hoping it’s longer than that.

If you do the maths, it doesn’t really matter. The key is that frequently visited files are cached. If a file that is accessed once every two weeks, once a month or once a year is not in cache, what’s the cost to the web site?

In theory you could test this via CF cache analytics: set up separate private files of different content types that are only accessed on infrequent schedules (once a week, a month, 2 months, 3 months, 6 months, 1 year), then query the HTTP response headers at the first and each subsequent access and check the values of the age, date and cf-cache-status headers. The age header tells you how long the file has been in CF cache.
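A minimal sketch of such a header check in Python, using only the standard library. The URL in the comment is a placeholder and `fetch_cache_headers` / `describe` are names made up for this example; the wording in `describe` is my own reading of the headers, not official Cloudflare semantics.

```python
import urllib.request

HEADERS_OF_INTEREST = ("Age", "Date", "CF-Cache-Status")


def fetch_cache_headers(url: str) -> dict:
    """Send a HEAD request and return the three headers mentioned above."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return {name: response.headers.get(name) for name in HEADERS_OF_INTEREST}


def describe(headers: dict) -> str:
    """Turn the header values into a one-line summary for a log file."""
    status = (headers.get("CF-Cache-Status") or "absent").upper()
    age = headers.get("Age")
    if status == "HIT" and age is not None and age.isdigit():
        return f"served from edge cache, stored about {int(age)} s ago"
    if status in ("MISS", "EXPIRED"):
        return f"{status}: fetched from origin, any earlier copy was gone or stale"
    return f"cf-cache-status is {status}"


# Hypothetical usage, run from a scheduler at the chosen interval:
# print(describe(fetch_cache_headers("https://example.com/test-weekly.txt")))
```

Note that each check is itself a request and so counts as an access; the file should not be fetched by anything else between scheduled runs, otherwise the test measures the wrong interval.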

I believe the discussion needs to be properly framed, as Cloudflare’s view tends to be opposite of what most people believe. Most people want a file to never be requested from their server, hence, setting long cache TTL. But in reality, that implies a mostly idle server which should serve files quickly.

A more realistic view of Cloudflare’s cache is to offload a hard-hit server. The TTL really isn’t about how long to not have that file pulled from your server, but more about how long is it safe to serve that version of the file?

You may notice that TTL limits on higher plans aren’t longer. They’re shorter. Because an enterprise site may have frequently changing content, but they don’t want their servers constantly hammered by requests for the same file until that file changes.

Here’s a more technical article that better describes their philosophy:

I agree. The biggest architectural complaint about CF from potential customers is that CF “is not a CDN”, since “all other CDNs” on earth work like a “push”-style FTP server: upload the file and delete it from your metal/your disk, and it will stay in “the cloud” on the CDN forever until you explicitly delete it. Other potential customers complain: “Cache-Control” on my origin server headers is 1 hour, and my dynamic asset is SO expensive ($/cpu) to compute that if it’s fetched from origin more than once an hour (5-20 CF POPs fetch it every hour), I am bankrupt. Cloudflare is not for them.
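The arithmetic behind that complaint, as a back-of-the-envelope sketch in Python. It assumes every POP caches independently, receives at least one request per TTL window, and that nothing is evicted early; the function name is made up for the example.

```python
SECONDS_PER_DAY = 86400


def worst_case_origin_fetches_per_day(pops: int, ttl_seconds: int) -> float:
    """Upper bound on origin fetches for one asset: each POP refetches once per TTL."""
    return pops * SECONDS_PER_DAY / ttl_seconds


low = worst_case_origin_fetches_per_day(pops=5, ttl_seconds=3600)    # 5 POPs, 1 h TTL
high = worst_case_origin_fetches_per_day(pops=20, ttl_seconds=3600)  # 20 POPs, 1 h TTL
```

Under these assumptions, a 1 hour TTL means between 120 and 480 origin fetches per day for that asset, instead of the 24 the customer expected from a single shared cache.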

Perhaps there are some CDNs like this, but I’m not familiar with any.

No, most traditional CDNs require a unique URI for static images, Cloudflare is an inline / transparent CDN. I’m not aware of a CDN that will store data until explicitly deleted, but that sounds like a fun idea.

Quoting cscharff: “I’m not aware of a CDN that will store data until explicitly deleted, but that sounds like a fun idea.”
The feature exists.

Amazon Cloudfront might be cheat mode as Amazon came from a CPU/SQL/OS heavy VPS/Kuber/Docker/dropbox platform first, then added “want your disk files stored outside of spinning rust in Ashburn? pay us more” as a value added feature.

Cloudflare Pages or any kind of “git push” CDN would be a synonym for “upload and delete from origin”, since FTP is unpopular as a protocol nowadays regardless of encryption.

Correct, as far as I can see. Reverse-proxy CDNs are “new”, and there are very few reverse-proxy CDNs (pull CDNs) among the most professional CDNs.

This, my friend, has honestly been my dream for some years. But Cloudflare will not be able to provide such a function, and no pull CDN really should, since which files are cached at which POPs is decided by visitors’ requests and not by the application.

It actually sounds perfect to me, as automatic cache flushing makes no sense if you don’t know whether anything at that URL has actually changed.
But Cloudflare must do this because it offers the service for free to most users. Keeping all cached content guaranteed online would cost them resources that no one pays for, as most people use the free service.

Well, CDN services that work over FTP(S) have their own problems and are not very controllable when it comes to application-based control. But yes, all of them work like this.

For some years I have been thinking about a real solution that would entirely solve the problem many (if not all) people here in the community have with badly and inconsistently performing sites. Mostly it’s not because of Cloudflare, but the more I think about it, the more confident I am that these things could still be solved on Cloudflare’s side.

And maybe one day I will try to present this idea to Cloudflare, if there is even a chance of getting an audience (in real life) with people who have the power to decide whether it would be a good addition to what Cloudflare already offers.

So KeyCDN lets you upload static assets to storage, from where they can be pulled by a POP when a user requests them? And they charge you to store them? Plus they charge you for bandwidth? Typically, if my customers are looking to store content, I point them to Backblaze, Tencent or one of the other Bandwidth Alliance partners where they can store their content. Between the egress cost savings, the freedom to use that content with any provider and their focus on delivering low-cost, reliable storage, it seems like a great option.

Those don’t really seem to be appropriate for dynamic assets where CPU is required to compute them, though. Especially since the doc you linked to says it takes 15 minutes to update an asset with KeyCDN. A 1 hour TTL sounds like content that changes often; otherwise, wouldn’t you just set a longer TTL at the origin?