Can a Worker prevent full cache purges?

I have moderate experience with Workers and am struggling to find a high-level plan for the following.

My website is fully static and updates 3-4 times a week. With each update I purge the entire Cloudflare cache.

During each update I publish one post, which changes ~0.5% of the website. Yet I still have to empty the cache for the (in theory) 99.5% of other assets.

Is there a way around this with a Cloudflare Worker?

Other details:

  • My origin server returns the Last-Modified header.
  • The website updates are scheduled (so happen at known times and weekdays).

My goal is to achieve more website speed through a higher cache HIT %.

That goal rules out:

  • Having the worker check the Last-Modified header on the origin server, since this will take about as much time as a proper fetch (my website assets are small).
  • Using Workers KV (which adds geographical latency due to the centralised storage of less frequently requested assets).
  • Having the Worker empty the cache itself at those times and days when a website update happens (since this still indiscriminately purges the 99.5% of cached assets that did not change).

I don’t see a way around this without being precise with your purge. You would have to do a Purge By URL for only the pages affected, and that has to be determined at your end.

If you’re moderately skilled with Workers, then I bet you can handle working with the API. If it’s a fully static site, then I expect it’s file-based. You could write a script that looks for recently modified files and then feeds those as URLs to the API.
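
A minimal sketch of the API side of that idea, in Python. It assumes Cloudflare's purge-by-URL endpoint (`POST /zones/{zone_id}/purge_cache` with a `files` array) and an API token that is allowed to purge cache. `build_purge_request` and `purge_urls` are hypothetical helper names, and the zone ID and token are placeholders you would supply yourself.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id: str, api_token: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a purge-by-URL request for the given URLs."""
    body = json.dumps({"files": urls}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def purge_urls(zone_id: str, api_token: str, urls: list[str]) -> bool:
    """Send the purge request and report the API's `success` flag."""
    request = build_purge_request(zone_id, api_token, urls)
    with urllib.request.urlopen(request, timeout=30) as response:
        return bool(json.load(response).get("success"))
```

Keeping the request builder separate from the network call makes it easy to inspect what would be sent before purging anything for real.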

Yup, CF Workers wouldn’t be the way to do this; use the CF cache purge API to purge by URL instead. That’s how I do it, with a custom script that can purge by URL or purge ALL cache, lets me define the cache age threshold above which to purge, and lets me specify which URLs I want to purge.

./cf-purge.sh

Usage:
  where XX = purge age

  ./cf-purge.sh purge-url XX
  ./cf-purge.sh purge-url XX "https://yourdomain.com https://yourdomain.com/news"
  ./cf-purge.sh purge-all
  ./cf-purge.sh check XX
  ./cf-purge.sh check XX "https://yourdomain.com https://yourdomain.com/news"

For example, to purge the WordPress sample page and hello-world post only when their cache age is greater than 5 seconds:

./cf-purge.sh purge-url 5 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/"
build purge urls
https://domain.com/sample-page/ 2084
https://domain.com/1/hello-world/ 2446

purging:https://domain.com/sample-page/ age:2084
purging:https://domain.com/1/hello-world/ age:2446
purge status:true

To just check and print which of the defined URLs have a cache age greater than 5 seconds:

./cf-purge.sh check 5 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/"
build purge urls
https://domain.com/sample-page/ 1994
https://domain.com/1/hello-world/ 2357

I use cache age so that I can purge cached assets on CF free plans when my desired cache TTL is below the minimum the plan allows. The minimums for the CF free, pro and business plans are 2 hours, 1 hour and 30 minutes respectively :slight_smile:
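
The script's source isn't shown here, so the following is only a guess at what a cache-age check could look like. It is a Python sketch that assumes the age comes from the standard HTTP `Age` response header (the number of seconds a response has sat in a cache). `parse_age`, `fetch_age` and `select_stale` are hypothetical names.

```python
import urllib.request


def parse_age(headers) -> int:
    """Read the Age header in seconds; a missing or malformed value counts as 0."""
    value = headers.get("Age")
    try:
        return max(int(value), 0)
    except (TypeError, ValueError):
        return 0


def fetch_age(url: str, timeout: float = 10) -> int:
    """Issue a HEAD request and return the cache age reported for the URL."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_age(response.headers)


def select_stale(ages: dict[str, int], min_age: int) -> list[str]:
    """Keep only the URLs whose cache age is greater than min_age seconds."""
    return [url for url, age in ages.items() if age > min_age]
```

A caller would build `ages` with `fetch_age` for each URL of interest and pass the result of `select_stale` on to the purge step.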

That purge script sounds like it looks at the Age header in the HTTP response and purges accordingly. I expect OP needs to check timestamps on local files and purge anything modified less than x minutes/hours ago.

Yes, you can use the local file’s modification time as a replacement for the cache age, reverse the criterion to less than XX, and build a URL from the domain name and the local file’s URL path.
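
A rough Python sketch of that reversed criterion, for a file-based static site. It assumes the site root maps directly onto URL paths and that `index.html` is served at its directory URL; both are assumptions about your setup, and `recently_modified_urls` is a hypothetical name.

```python
import time
from pathlib import Path
from urllib.parse import quote


def recently_modified_urls(site_root, base_url, max_age_seconds, now=None):
    """Return URLs for files under site_root modified less than max_age_seconds ago."""
    now = time.time() if now is None else now
    root = Path(site_root)
    urls = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Skip files whose last modification is too old to belong to this update.
        if now - path.stat().st_mtime >= max_age_seconds:
            continue
        rel = path.relative_to(root).as_posix()
        # Assumption: index.html is served at the URL of its directory.
        if rel == "index.html":
            rel = ""
        elif rel.endswith("/index.html"):
            rel = rel[: -len("index.html")]
        # Percent-encode characters such as spaces; "/" is left as-is.
        urls.append(f"{base_url.rstrip('/')}/{quote(rel)}")
    return urls
```

For example, `recently_modified_urls("/var/www/site", "https://yourdomain.com", 3600)` would list the URLs for everything changed in the last hour (the path here is made up).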

I did something similar when I wanted to selectively purge my WordPress Cache Enabler full HTML page caches, both the uncompressed and the pre-gzip compressed versions, based on each local cache file’s modification time. I used the stat command to get the file size and the modification time in epoch seconds:

stat -c "%s %Y" filename
3371 1600794568

Yes, the cf-purge option is me integrating the Cloudflare cache purge into it too, so purge-all purges both the local Cache Enabler HTML cache files and Cloudflare’s CDN cache :slight_smile:

./cache-enabler-purge.sh 

Usage:
./cache-enabler-purge.sh ce-purge {http|https} domain.com mins
./cache-enabler-purge.sh cf-purge {http|https} domain.com mins
./cache-enabler-purge.sh purge-all {http|https} domain.com mins

Purge local cached HTML files older than 1 minute, then re-cache them by pre-fetching the URLs afterwards:

./cache-enabler-purge.sh ce-purge http msdomain.com 1
------------------------------------------------------
file_path: /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/1/hello-world/http-index.html
file_name: http-index.html
file_modifcation_age: 2355
file_size: 17095
removing cache file: http-index.html
rm -f /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/1/hello-world/http-index.html
------------------------------------------------------
file_path: /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/1/hello-world/http-index.html.gz
file_name: http-index.html.gz
file_modifcation_age: 2355
file_size: 5705
removing cache file: http-index.html.gz
rm -f /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/1/hello-world/http-index.html.gz
------------------------------------------------------
file_path: /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/http-index.html
file_name: http-index.html
file_modifcation_age: 2356
file_size: 12253
removing cache file: http-index.html
rm -f /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/http-index.html
------------------------------------------------------
file_path: /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/http-index.html.gz
file_name: http-index.html.gz
file_modifcation_age: 2356
file_size: 4359
removing cache file: http-index.html.gz
rm -f /home/nginx/domains/msdomain.com/public/wp-content/cache/cache-enabler/msdomain.com/http-index.html.gz
------------------------------------------------------
200 - http://msdomain.com/ is now re-cached
200 - http://msdomain.com/1/hello-world/ is now re-cached

Edit: added checking and purging by last-modified age.

Here is a comparison of check (cache age) with check-modage (last-modified age) in my script, so you can build the URLs to be purged by either criterion :slight_smile:

./cf-purge.sh check 5 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/"
build purge urls
https://domain.com/sample-page/ 5615 7830
https://domain.com/1/hello-world/ 5615 8161

./cf-purge.sh check-modage 8000 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/" 
build purge urls
https://domain.com/1/hello-world/ 5634 8180
./cf-purge.sh check 5 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/"          
build purge urls
https://domain.com/sample-page/ 5954 8170
https://domain.com/1/hello-world/ 5954 8500

./cf-purge.sh purge-url-modage 8500 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/"      
build purge urls
https://domain.com/1/hello-world/ 5975 8520

purging:https://domain.com/1/hello-world/ age:5975 lastmod:8520
purge status:true

./cf-purge.sh check 5 "https://domain.com/2/sample-page/ https://domain.com/1/hello-world/"
build purge urls
https://domain.com/sample-page/ 6222 8438
https://domain.com/1/hello-world/ 8 8769

With all this effort to improve cache hit rates, one thing to think about is whether purging all CF cached assets is really that bad. At most ~210 visitors (1 from each CF datacenter) will be affected by a cache miss on first visit before the cache is re-populated :slight_smile:

Thanks for the input and great discussion! :+1::slightly_smiling_face:

That’s a good idea!

Do I understand that API doc correctly when I say that purging by URL is limited to 30 URLs at a time? Or does the ‘max length constraint’ apply to the payload sent to the API?

Thanks for your examples and discussion, it already gave me ideas for my own script. :slightly_smiling_face:

True and I get your reasoning. Don’t put time in optimisation when it’s not needed, right?

But I also cache every HTML page on Cloudflare’s edge. My smallest website has 440 pages, and 440 pages times 210 POPs means about 92k visits before the cache is fully re-populated (440 × 210 = 92,400, if I calculate correctly; it seems too high to me).

Anyhow, a lot of my visitors come from various continents, so there’s also geographical latency I want to reduce by not purging the cache when it isn’t needed.

Plus if I spend 5 hours coding a script that only purges the essentials from the cache, and I can use that script for years to come on my two websites, then that’s not a bad use of my time. :slightly_smiling_face:

I believe it’s 30 files max per API call, given the description of CF error 1094:

1094 Exceeded maximum amount of files that can be purged on a single request for your plan type.
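
If the limit really is 30 files per call (that figure is the reading of the error above, not something confirmed here, and it may differ per plan), a script can simply split its URL list into batches and make one purge call per batch. A small Python sketch, where `batches` is a hypothetical helper name:

```python
def batches(urls, size=30):
    """Yield consecutive slices of at most `size` URLs each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```

Each yielded batch would then go into its own purge-by-URL request.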

+1
