Hi, we have a site with a large amount of traffic.
Our URLs serve different content for different countries, but the URL itself doesn’t change.
At the moment, we use Workers to intercept the request and append the country as a query param, along with a token that we store in KV as a cache-busting mechanism.
We store the token using the URL path in KV: /page-1 => 111-222-333
/page-1 becomes /page-1?c=AU&t=111-222-333
The next time the page is requested, we check KV for a token for that URL and country and append it, so the request is served from the CF edge cache.
To purge cache we delete the token from KV and a new one will be generated next time.
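To make the scheme concrete, here is a rough sketch of the flow described above. It is not our actual Worker code: TokenStore is a hypothetical synchronous stand-in for a KV namespace, and tokenKey, buildCacheUrl, and purge are illustrative names. Keying the token by path plus country is an assumption.

```typescript
// Hypothetical stand-in for a KV namespace (synchronous so the sketch runs anywhere).
interface TokenStore {
  get(key: string): string | undefined;
  put(key: string, value: string): void;
  delete(key: string): void;
}

class MemoryTokenStore implements TokenStore {
  private data = new Map<string, string>();
  get(key: string) { return this.data.get(key); }
  put(key: string, value: string) { this.data.set(key, value); }
  delete(key: string) { this.data.delete(key); }
}

// Assumption: one token per path and country.
function tokenKey(path: string, country: string): string {
  return `${path}|${country}`;
}

// Appends the country and the current token, creating a token if none exists.
function buildCacheUrl(
  requestUrl: string,
  country: string,
  store: TokenStore,
  newToken: () => string,
): string {
  const url = new URL(requestUrl);
  const key = tokenKey(url.pathname, country);
  let token = store.get(key);
  if (token === undefined) {
    token = newToken();
    store.put(key, token);
  }
  url.searchParams.set("c", country);
  url.searchParams.set("t", token);
  return url.toString();
}

// Purging only deletes the token; the next request gets a new token and so a new cache entry.
function purge(path: string, country: string, store: TokenStore): void {
  store.delete(tokenKey(path, country));
}
```

Note that the old cached copy is never removed explicitly; it just stops being requested once the token changes.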
In the end the HTML caching is performed by CF Edge caching as we set it to cache everything.
During DDoS attacks we realised the CF edge cache seemed to have a limit. Maybe it is not so resilient, or maybe we reached a quota; I’m not sure which.
My question is, as we are already accessing KV to fetch the token, would it be better to store the HTML directly in KV, where the key is the URL and the value is the HTML? What’s faster and more resilient?
I don’t have anything to add about your question as to what is faster or more resilient. But, I’ve been thinking about building a similar mechanism for a while. Have you seen BetterKV | Flareutils, which writes to both KV and Cache, which allows you to only read from KV when the cache is empty?
This article is also insightful: How Kinsta used Workers and Workers KV to improve cache hit rates by 56% (cloudflare.com)
Right now, I’m leaning towards avoiding KV for caching as much as possible, primarily because of the limitations on querying/searching KV, the lack of a bulk read API, and the costs associated with KV. I’m probably going to use CF Workers to query a backend database as needed to generate/update what then gets stored in the Cache. Client browsers will then read directly from the Cache as much as possible - probably through a Worker which can do some other things like logging, HTML rewriting, retrieving required cache keys, etc.
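Roughly, the read path I have in mind looks like the sketch below. This is only an illustration under assumptions: HtmlCache is a hypothetical interface standing in for the edge cache, render stands in for whatever queries the backend and builds the HTML, and readThrough is a made-up name.

```typescript
// Hypothetical async cache interface; an in-memory version is used here for illustration.
interface HtmlCache {
  get(key: string): Promise<string | undefined>;
  put(key: string, html: string): Promise<void>;
}

class MemoryHtmlCache implements HtmlCache {
  private data = new Map<string, string>();
  async get(key: string) { return this.data.get(key); }
  async put(key: string, html: string) { this.data.set(key, html); }
}

// Serve from the cache when possible; otherwise render once and store the result.
async function readThrough(
  cacheKey: string,
  cache: HtmlCache,
  render: (key: string) => Promise<string>,
): Promise<{ html: string; hit: boolean }> {
  const cached = await cache.get(cacheKey);
  if (cached !== undefined) {
    return { html: cached, hit: true };
  }
  const html = await render(cacheKey); // assumed to hit the backend database
  await cache.put(cacheKey, html);
  return { html, hit: false };
}
```

The backend is only touched on a miss, so the per-request cost of a slower store is paid once per cache key rather than on every request.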
Is this helpful at all? I’m happy to continue brainstorming about this if you like.
What did you see that made you think that edge cache had a limit?
That’s interesting. What alternatives are you considering?
During the DDoS attacks, many requests got through the CF cache to the origin. I imagine that paying CF a small fee doesn’t entitle me to millions of requests; at some point they will let traffic pass.
As I described, I’m leaning towards just storing KV-type data in my origin server - it could be in anything really, be it MySQL, Redis, Elasticsearch, etc. And then just generate the HTML as needed and store it in Cloudflare Cache with unique cache keys, as Kinsta seems to do from that article.
There’s no cache limit. Requests will often get through if the attack is distributed and hitting locations where resources are not cached. They may also add query strings, which get around most people’s caching.
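One way to blunt the query-string trick is to build the cache key only from parameters you expect and drop everything else. Here is a minimal sketch, assuming an allow-list approach; normalizeForCache is an illustrative name, and the parameter names c and t come from the example earlier in the thread.

```typescript
// Keeps only allow-listed query params, in a stable order, so junk params map to the same key.
function normalizeForCache(requestUrl: string, allowed: string[]): string {
  const url = new URL(requestUrl);
  const kept = new URLSearchParams();
  for (const name of [...allowed].sort()) {
    const value = url.searchParams.get(name);
    if (value !== null) {
      kept.set(name, value);
    }
  }
  url.search = kept.toString();
  return url.toString();
}

// Both of these normalize to https://example.com/page-1?c=AU
const plain = normalizeForCache("https://example.com/page-1?c=AU", ["c", "t"]);
const noisy = normalizeForCache("https://example.com/page-1?x=123&c=AU&rand=abc", ["c", "t"]);
```

This only helps with cache-busting parameters; requests landing on locations that have not cached the page yet will still reach the origin.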