As I understand it, caches.default is not a global object; its contents are stored per colo. If I can’t purge caches.default using the standard “custom purge” workflow, is there any other way to purge caches.default across all locations?
I can’t prove this, but I’m very nearly certain that purge-by-URL used to work for the Cache API.
The fact that it doesn’t now breaks our intended workflow. We were using a two-tiered caching system: first check whether a page is present in caches.default; if not, fall back to a KV-based cache; and only if it isn’t there either, hit the origin.
However, we need to regularly purge individual pages, and if we can’t globally purge entries in the Cache API, then we can’t use it for this purpose at all, since there’s no longer any way to know whether something in the Cache API should have been invalidated. The alternative is fetch(), which does respect purges, but we’d want it to fetch only when the page is already in the cache, and not fall through to the origin otherwise. As far as I know there is no way to do this, so we can’t insert the KV cache as a middle layer between the global cache and origin calls.
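Roughly, the lookup we wanted looks like the sketch below (simplified, with the cache interfaces passed in as parameters rather than our exact Worker code; `edgeCache` stands in for caches.default and `kvCache` for the KV namespace):

```javascript
// Two-tier lookup sketch: edge cache, then central KV cache, then origin.
// Only a miss in both caches reaches the origin.
async function tieredLookup(url, edgeCache, kvCache, fetchOrigin) {
  // Tier 1: per-colo edge cache (caches.default in a real Worker).
  const edgeHit = await edgeCache.match(url);
  if (edgeHit) return edgeHit;

  // Tier 2: KV-based "central" cache, shared across all locations.
  const kvHit = await kvCache.get(url);
  if (kvHit !== null) return kvHit;

  // Tier 3: only now hit the origin.
  return fetchOrigin(url);
}
```

The problem described above is that entries served from tier 1 no longer reflect purge-by-URL, while tier 1 can’t be skipped conditionally via fetch().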
This puts us in the unfortunate position of having to choose between the KV-based cache and the fetch()-based cache, when we used to be able to use both.
Hi @weirdgloop, I believe you’re correct about purge-by-URL previously working for the Cache API. The change in behavior likely has the same root cause as the issue described in this thread. However, I can’t guarantee that whatever fix is implemented will fix both cases.
I am curious whether you have evaluated the performance difference between a purely KV-based cache and the two-tier Cache API/KV system you have. I would expect the KV-only cache to perform almost as well – specifically, slightly more overhead on the happy path and significantly less on the sad path (because it removes one round-trip to the cache disks).
Thanks for the info, @harris – do you know if that related issue has been resolved?
For our case at least, the two-tier cache was significantly better (~140 ms average page load vs ~270 ms with KV only). I think this is because a large enough percentage of our KV pairs aren’t popular enough to stay cached at the POPs, and have to be read from wherever they are stored centrally.
Hi @weirdgloop, indeed a fix was released for the related issue.
Thanks for sharing the latency figures – that’s a significant difference, more than I expected. I think I overestimated how long KV itself caches values at the edge: it turns out it only caches them for a few seconds, so you are correct that the latency to the central storage is the cause.
Out of curiosity, how important is it to avoid hitting the origin? I ask because of all the possible strategies, the simplest and most performant would be to just use something like fetch(url, { cf: { cacheTtl: <ttl> } }). But if avoiding the origin is paramount for some reason, that of course won’t work.
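For concreteness, a minimal sketch of what I mean (the helper name and TTL value are just illustrative; in a real Worker you would pass the init object straight to fetch()):

```javascript
// Hypothetical helper that builds the fetch() init object. The
// cf.cacheTtl field asks Cloudflare's edge to cache the origin
// response for that many seconds, and purge-by-URL is honored for
// responses cached this way.
function edgeCachedInit(ttlSeconds) {
  return { cf: { cacheTtl: ttlSeconds } };
}

// In a Worker handler:
//   return fetch(request.url, edgeCachedInit(86400));
```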
Thanks! I’ve confirmed this fixed our problem as well.
Avoiding origin hits isn’t essential, but hitting the origin is a huge step down in latency from either Cloudflare caching or even KV’s central storage. Regular Cloudflare caching with a cacheTtl is definitely best when the assets don’t get evicted, but because Cloudflare has so many POPs (and our assets are fairly long-tail), we end up with a poor cache hit rate per POP. By using KV as a “central” store, we significantly increase the hit rate, since one hit from anywhere in the world puts the page into our central KV cache for everyone.
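In sketch form, the central-store effect is roughly this (simplified; `kvCache` stands in for our KV namespace and the TTL is illustrative – a miss anywhere writes through to KV so every other location can hit it afterwards):

```javascript
// Central-store sketch: after an origin miss anywhere in the world,
// the page is written into KV, so subsequent requests from any
// location hit KV instead of the origin.
async function lookupWithCentralStore(url, kvCache, fetchOrigin, ttl = 3600) {
  const cached = await kvCache.get(url);
  if (cached !== null) return cached; // hit from central KV storage

  const body = await fetchOrigin(url); // long-tail miss: go to origin
  // Write-through: one miss populates the cache for all locations.
  await kvCache.put(url, body, { expirationTtl: ttl });
  return body;
}
```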