I am trying to get Cloudflare Workers to cache static HTML pages, and it does work. The only problem is, as soon as I try it on another computer, there seems to be a separate cache for that PC.
So what I am trying to achieve is: if one person hits a specific URL (e.g. https://example.com/link/1/2/3), that page is saved to the Cloudflare cache for the next 3 minutes, and no matter who tries to open that URL, it is served from the Cloudflare cache during those 3 minutes. After the 3 minutes are over, the page should be loaded from the origin again and cached for another 3 minutes.
My problem is that every hit on the URL produces pretty heavy database load, so no matter how many people request it, the server should get a maximum of 1 request per 3 minutes.
Here is my code:
```javascript
async function handleRequest(event) {
  let request = event.request
  let cacheUrl = new URL(request.url)
  //let cacheKey = new Request(cacheUrl, request)
  // the Cache API expects a string URL or a Request as the key
  let cacheKey = cacheUrl.toString()
  let cache = caches.default

  // Try this zone's cache first
  let response = await cache.match(cacheKey)
  if (!response) {
    // not in cache, grab it from the origin
    response = await fetch(request)
    // must use the Response constructor to inherit all of response's fields
    response = new Response(response.body, response)
    // The Cache API respects Cache-Control headers, so with max-age=180
    // the response lives in this cache for at most 3 minutes
    response.headers.append('Cache-Control', 'public, max-age=180')
    // store the fetched response under cacheKey;
    // waitUntil lets the put() finish without delaying the response
    event.waitUntil(cache.put(cacheKey, response.clone()))
  }
  try {
    response = new Response(response.body, response)
    response.headers.set('X-Debug-stack', JSON.stringify(cacheKey))
  } catch (err) {
    return new Response(err.stack || err)
  }
  return response
}

addEventListener('fetch', event => {
  try {
    event.respondWith(handleRequest(event))
  } catch (e) {
    event.respondWith(new Response('Error thrown ' + e.message))
  }
})
```
The problem is that the data should never be older than 3 minutes.
But a single specific URL can easily get 10-100 hits/sec, and because of database performance I would like to cache it for 3 minutes.
And the minimum I can set for "Browser Cache Expiration" under the Caching tab in the dashboard is 30 minutes, which is way too long.
The computers do not use the same internet connection. But that would be the whole point of the Workers script, so that everybody uses the same cache.
But the 180 seconds you specify in the headers is something you can also set on your origin. I am still not sure why you need to use workers for this case.
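Setting it at the origin could look roughly like this (a sketch, not the poster's actual origin; the header values are assumptions). `s-maxage` targets shared caches like the Cloudflare edge, while `max-age=0` keeps browsers revalidating, so users never see data older than the edge copy:

```javascript
// Sketch: the origin (or a Worker) sets a 3-minute shared-cache TTL.
// s-maxage applies to shared caches (the Cloudflare edge);
// max-age=0 forces browsers to revalidate on every load.
const headers = new Headers({
  'Cache-Control': 'public, max-age=0, s-maxage=180',
  'Content-Type': 'text/html',
})
const response = new Response('<html>...</html>', { headers })
```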
Anyhow, if you are not using the same Internet connection there is a chance the requests hit different datacentres and hence different caches.
Ok, thanks. So there is no way to “distribute” the cache between the datacenters so that there is a maximum of 1 hit per 3 minutes, no matter which Cloudflare datacenter answers the request? I thought maybe that’s what the Workers Cache API is doing.
The individual datacentres do not have a shared cache; every resource is cached per datacentre.
You could store everything in the KV database - as that is being shared - but I am not sure if you want to go that route. I believe you would need to look into Workers Sites in that case.
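The KV route could be sketched roughly like this, assuming a KV namespace bound to the Worker as `PAGE_CACHE` (the binding name is made up for this example). KV is replicated globally, so every datacentre reads the same entry, and `expirationTtl` evicts it automatically. Bear in mind KV is eventually consistent, so some edges may briefly serve a slightly older copy:

```javascript
// Sketch: cache rendered HTML in Workers KV, shared across all datacentres.
// Assumes a KV binding named PAGE_CACHE (hypothetical for this example).
const TTL_SECONDS = 180

// derive a cache key from the request URL (path + query string)
function cacheKeyFor(url) {
  const u = new URL(url)
  return u.pathname + u.search
}

async function handleRequest(event) {
  const key = cacheKeyFor(event.request.url)
  let body = await PAGE_CACHE.get(key) // global lookup, returns string or null
  if (body === null) {
    const originResponse = await fetch(event.request)
    body = await originResponse.text()
    // keep for 3 minutes (KV's minimum expirationTtl is 60 seconds)
    event.waitUntil(PAGE_CACHE.put(key, body, { expirationTtl: TTL_SECONDS }))
  }
  return new Response(body, { headers: { 'Content-Type': 'text/html' } })
}
```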
Thanks so much Sandro! So as Cloudflare has 194 datacenters the theoretical maximum is 194 hits per 3 minutes right?
I’ll take a look at Workers Sites. Thanks
Roughly yes, it depends what gets cached and for how long. For example Cloudflare does not necessarily cache for the specified timeframe. I am not sure you’ll get the exact three minutes everywhere, neither with the plain cache nor with KV. It would be rather an approximation, but if you want to store data globally KV (respectively AFAIK Workers Sites) might be your best bet.
This would be possible with tiered caching, which is the ability to use another Cloudflare data center as the upstream for a cache before going back to the origin. I don’t believe this is configurable to the level required here without an Enterprise plan though.
Thanks Zack,
so with Argo enabled for the site when I call “cache.put(cacheKey, response.clone())” in one CF datacenter, requests to the same URL in other CF datacenters would not reach out to my server but to the CF datacenter where the URL is already cached instead?
No, the Cache API only controls the local colo’s cache; it does not interact with Argo’s tiered cache feature.
But I don’t think you need to use the Cache API at all for this use case. If you just need to set the edge cache TTL, you can use the cacheTtl feature on fetch(), like so:
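Something along these lines, roughly (a sketch; the 180 s value is taken from this thread, and `cacheEverything` is included because Cloudflare does not cache HTML by default):

```javascript
// Sketch: set the edge cache TTL via fetch()'s cf options instead of
// the Cache API. cacheEverything makes Cloudflare cache the HTML response;
// cacheTtl overrides the edge TTL for this request (180 s = 3 minutes).
const CF_CACHE = { cf: { cacheTtl: 180, cacheEverything: true } }

async function handleRequest(request) {
  return fetch(request, CF_CACHE)
}
```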
fetch() does read through the tiered cache, so that should reduce your origin load. However, as Zack pointed out, it’s probably not sufficiently configurable (without an Enterprise plan) to reduce your origin load all the way down to 1 request per 3 minutes per URL.