Can you use KV to improve cache hit rates?

I’m just getting started with Workers and I’ve been pondering how I could use it to improve cache hit rates beyond what I’d get with Cache Everything, Argo Tiered Caching, and far-future Cache-Control headers, i.e. max-age=31536000. I’ve already seen some discussion about using KV for caching, including the Edge Cache HTML example, but I wanted to validate a few things.

Are my assumptions here valid?

In a pre-Worker world, I’ve had a few issues:

  1. It’s unclear how long stuff stays in the Cloudflare cache, generally speaking; if I ask Cloudflare to cache for a year, I doubt it will always happen, but I haven’t tried to measure it
  2. Even with Argo Tiered Caching, the cache hit rate doesn’t seem as high as it could be
  3. Without Enterprise, there’s no way to partially clear the cache e.g. with tags

Solutions and questions:

  1. With KV, I control how long stuff is cached, so eviction shouldn’t be an issue
  2. Using KV as a cache should give me nearly 100% hit rate on subsequent requests
  3. With KV, could I cache items in namespaces to simulate tags? Would it work to delete an entire namespace and recreate it to clear everything in it, effectively resetting the KV namespace I’m using as a cache? (See the sketch just after this list.)
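To make the tag idea concrete, here’s a rough sketch of what I’m imagining; CACHE_KV, the tag: prefix, and the two function names are just things I made up, not an official pattern:

// Store a value under a tag-prefixed key, with a TTL we control ourselves.
// CACHE_KV is a Workers KV namespace binding (hypothetical name).
async function cacheWithTag(tag, key, body, ttlSeconds) {
  await CACHE_KV.put(`tag:${tag}:${key}`, body, { expirationTtl: ttlSeconds })
}

// "Purge by tag" by listing every key under the prefix and deleting them.
async function purgeTag(tag) {
  let cursor
  do {
    const page = await CACHE_KV.list({ prefix: `tag:${tag}:`, cursor })
    await Promise.all(page.keys.map(k => CACHE_KV.delete(k.name)))
    cursor = page.list_complete ? undefined : page.cursor
  } while (cursor)
}

The deletes happen one key at a time, so clearing a large tag would be slow and would count against KV operation limits.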

A. Besides cost and the 2 MB limit for KV, is there any downside to using it for caching instead of the Cache API?
B. If I was using the Cache API, and I have Argo Routing on, would I benefit from tiered caching, or does that only work with fetch?

Ultimately, I’m trying to weigh if I should be using fetch, the Cache API, KV, or some combination of these for the best performance. I’ve read the documentation around these already, but wanted to see if there’s anything I’m missing.

Thanks!

If you’re looking to improve cache hit rates, ensure the content you want to cache is static, not changing every day, week, or month. In your web server configuration, you can set the headers to Cache-Control: public, max-age=x if you want Cloudflare to cache. That said, I’ve noticed Cache-Control: private will also be cached by Cloudflare. Once your web server sends cache headers, you can set a page rule for the files you’re trying to cache with mydomain.com/*.extension, where extension is the file extension (html, php, css, jpg, png, and more). You can set the cache time to whatever you wish it to be, and then save the page rule.

I’m not fully qualified to answer this as I have no clue what “KV” is, but if the tips I listed above help, please let me know. I will do more research and then get back to this if nobody else does.

I should have mentioned this and have now updated the first post, but I’m trying to see how I can improve hit rates beyond just using Cache Everything, Argo Tiered Caching, and far-future Cache-Control headers, i.e. max-age=31536000. It seems like a decent amount of traffic still makes it back to the origin more than once, even when it could have been cached. I know there are many reasons for this, but I was trying to figure out if I could do better, or if it’d even be worth it.

KV is Workers KV (key/value store). Since it’s persistent, stuff won’t fall out of cache. However, without benchmarking, it’s not clear that would even beat going to my origin performance-wise, which is usually within ~100 ms for my users, but then I pay more in bandwidth. KV costs extra money too, though.

Since posting, I’ve also learned from Cloudflare support that the Cache API doesn’t take advantage of Tiered Caching with Argo. So while I might be able to use some tricks to optimize caching better with it, it seems like my hit rates would suffer, since every PoP would be making requests back to the origin, at least with the Cache API alone. It also seems like, due to the way the Cache API works, it might add some latency: you have to check whether something exists, then fetch it from somewhere if it didn’t, instead of doing all of that in one go with fetch. It’s hard to measure time in Workers, so I’m not sure, but maybe it doesn’t matter since it’s all local and probably pretty fast.
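For reference, the check-then-fetch pattern I mean looks roughly like this (a minimal sketch using the default cache, with a plain pass-through on a miss):

addEventListener('fetch', event => {
  event.respondWith(checkThenFetch(event))
})

async function checkThenFetch(event) {
  const cache = caches.default

  // Step 1: check the local PoP cache.
  let response = await cache.match(event.request)
  if (response) return response

  // Step 2 (miss): fetch from the origin, then store a copy for next time.
  response = await fetch(event.request)
  if (response.ok) {
    event.waitUntil(cache.put(event.request, response.clone()))
  }
  return response
}

The match and the fetch happen sequentially, which is the extra step I was worried about, though at least the match stays local to the PoP.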

I’ve not used any other form of caching besides the regular Cloudflare CDN caching, and I see 100% cached requests; all of my requests are cached through Cloudflare. If your server is still receiving requests, that can be normal: Cloudflare will display the visitor’s IP in the web server logs anyway, even if the resource is cached. It could also be that the resource being requested cannot be cached because Cloudflare thinks it’s dynamic. Set your web server to send cache headers for whatever is not being cached, and then turn Origin Cache Control on in Cloudflare.

While you may receive 100% cache hits over a short period from a single location, that’s not the expected behavior of Cloudflare (like many caches). Say you set Cache-Control: max-age=31536000; in a perfect world, your server would see a request once per year for that resource, and I’d love to get closer to that. In reality, there are a bunch of reasons you’ll see many more requests at your origin if you’re using Cloudflare as your cache:

  1. There are multiple PoPs (points of presence), and each has its own separate cache. If you’re paying extra for Argo like I am, you’ll get Tiered Caching where cache misses on tier 2 Cloudflare nodes will try to ask other nearby tier 1 Cloudflare nodes for things to improve your cache hit rate, but it’s not guaranteed, and support said they can’t define exactly how this will work.
  2. Cloudflare won’t cache stuff indefinitely. They say that things that are frequently accessed should remain cached, but there are no guarantees. In this sense, max-age is merely a hint for the cache.

The tricky part is it’s hard for me to tell exactly how often I get cache misses without a bunch of analysis. I’m currently rolling out logging via Cloudflare Workers, which will help me see hit rates for individual URLs, but I do clear cache manually regularly when we push updates, so even with that data, it’s hard to do a perfect analysis.

I’ve recently implemented a cache with KV for a search API. So far it’s worked quite well, and only a small share of queries/requests reach our origin servers now. The only issue we’ve encountered is that KV unfortunately isn’t the best fit for caching, as it can add quite a lot of latency.
KV latency is only low if the key is frequently accessed in the PoP you’re hitting; KV isn’t stored in all PoPs, so it has to go through the central PoPs quite a lot, which can sometimes add about 300-500 ms.

Currently working on improving our caching solution by having distributed KeyDB/Redis nodes that basically race against CF’s KV, which hopefully reduces our latency in some regions.
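Roughly what I mean by racing, as a sketch; fetchFromRedis() is a hypothetical helper that reads our KeyDB/Redis nodes over HTTP, and CACHE_KV is the Workers KV binding:

// Race the two lookups; the first one to come back with a hit wins.
// Promise.any ignores rejections, so a miss is turned into a rejection.
async function racedLookup(key) {
  const hitOrThrow = async (lookup) => {
    const value = await lookup
    if (value === null) throw new Error('miss')
    return value
  }
  try {
    return await Promise.any([
      hitOrThrow(CACHE_KV.get(key)),   // Workers KV
      hitOrThrow(fetchFromRedis(key)), // hypothetical KeyDB/Redis helper
    ])
  } catch {
    return null // both missed; fall through to the origin
  }
}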

Which locations/countries do you see this latency in?

Thanks for sharing. My origin can return responses within 50-200 ms of most of my users, so 300-500 ms latency would not be worth it. I did see some other posts mentioning latency like that too, which was a little concerning. Would be good to know what locations you’re working with. I’m mainly in the US.

JP and GB are a couple of examples where I see those response times. One thing to note, though, is that I’ve measured it in the worker, so it may be off by a bit sometimes. Unfortunately I had to turn off the response-time logging I deployed a bit ago, as a small percentage of requests got an exception, so I’m looking into that before I can continue logging response times.

This application in particular is pretty worldwide, though most of the traffic we get is from the US, and our average response time there is quite good; no complaints there, really.

I’ve modified previous examples I found and have been using the following to cache all requests and log data to an ELK server. I have a 100% hit rate for what I want, and I control the caching TTLs via the origin server (Page Rules -> Origin Cache Control -> On):

addEventListener('fetch', event => {
  event.respondWith(noCacheOnCookie(event))
})

async function noCacheOnCookie(event) {

    const request = event.request;
    let pass = false
    
    // Check excluded pages & routes
    const url = new URL(event.request.url)
    const path = url.pathname
    const excludedPages = ['admin-ajax', 'wp-login', 'wp-admin', 'wp-json', 'admin', 'cart', 'my-account', 'wc-api', 'account', 'checkout', 'wc-ajax', 'addons', 'add-to-cart', 'remove_item', 'logout', 'lost-password']
    if (excludedPages.some(el => path.includes(el))) { pass = true }

    // Check excluded cookies
    const cookie = request.headers.get('Cookie')
    if(cookie){
        const excludedCookies = ['wordpress_logged', 'comment_', 'wordpress_sec']
        if (excludedCookies.some(el => cookie.includes(el))) { pass = true }
    }

    if (pass === true) {

        // Get non-cached URL
        const bustedRequest = new Request(request, { cf: { cacheEverything: false } })
        const response = await fetch(bustedRequest)

        // Append cache busted header for debugging
        const newHeaders = new Headers(response.headers)
        newHeaders.append('wp-cache-busted', 'true')

        // Get new response object
        const newresponse = new Response(response.body, {
            status: response.status,
            statusText: response.statusText,
            headers: newHeaders
        })

        // No user-agent filtering needed on this branch, so log & return the response immediately
        event.waitUntil(logToElk(event.request, newresponse));

        return newresponse;

    } else {

        // Get response object
        const response = await fetch(new Request(request, { cf: { cacheEverything: true } }));

        // Log data if not specified user agents
        const ua = request.headers.get('user-agent') || '' // guard: header may be absent
        if( !(ua.includes('uptime') || ua.includes('BWT')) ) {
            event.waitUntil(logToElk(event.request, response));
        }

        return response;

    }

}

async function logToElk(request, response) {
  
    // Debugging
    //console.log(response.status);
    //console.log(new Map(request.headers))
    //console.log(new Map(response.headers))

    const ray  = response.headers.get('cf-ray') || '';
    const id   = ray.slice(0, -4);
    const data = {
      
      // General request data
      'timestamp':  Date.now(),
      'status': response.status,
      'method': request.method,

      // Get client request data
      'client-url': request.url,
      'client-host': request.headers.get('host') || '',
      'client-encoding': request.headers.get('accept-encoding') || '',
      'client-referer': request.headers.get('referer') || '',
      'client-ip': request.headers.get('cf-connecting-ip') || '',
      'client-xff': request.headers.get('x-forwarded-for') || '',
      'client-xproto': request.headers.get('x-forwarded-proto') || '',
      'client-country': request.headers.get('cf-ipcountry') || '',
      'client-ua': request.headers.get('user-agent') || '',
      
      // Get CF response data
      'cf-ray': ray,
      'cf-id': id,
      'cf-colo': request.cf.colo || '',
      'cf-tlsVersion': request.cf.tlsVersion || '',
      'cf-protocol': request.cf.httpProtocol,
      'cf-cache-control': response.headers.get('cache-control'),
      'cf-content-type': response.headers.get('content-type'),
      'cf-encoding': response.headers.get('content-encoding'),
      'cf-cache-status': response.headers.get('cf-cache-status'),    
    
    };
    
    // Authorise requests only
    const url = "https://elk.example.com";
    const compiledPass = btoa('user:pass');
    const fullAuth = 'Basic ' + compiledPass;

    await fetch(url, {
        method: 'PUT',
        body: JSON.stringify(data),
        headers: new Headers({
          'Content-Type': 'application/json',
          'Authorization': fullAuth,
        })
    })
}

I’ve noticed that if you set “Cache-Control: public, max-age=0, s-maxage=31536000” on non-default cacheable objects, e.g. HTML, then Cloudflare ignores the s-maxage and always lists the status as Expired. However, if you set “Cache-Control: public, s-maxage=31536000”, then it caches as per the s-maxage directive.
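As an illustration, a Worker could set that second form itself, so the edge caches for a year while browsers always revalidate (just a sketch; I normally set this at the origin):

async function addCacheHeaders(request) {
  const upstream = await fetch(request)
  const response = new Response(upstream.body, upstream)
  // s-maxage governs shared caches like Cloudflare; max-age=0 keeps browsers revalidating.
  response.headers.set('Cache-Control', 'public, s-maxage=31536000, max-age=0')
  return response
}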

The one annoying thing is that I still can’t get the content encoding CF sends back logged from the Worker, even though the response is delivered as whatever the client requested, e.g. br or gzip. I presume this is because, at the time of logging, CF hasn’t yet decided which compression to send to the end user.

To provide this performance boost, KV caches are carefully tuned to minimize response times and maximize the probability of caching request data!

Can you elaborate on the performance boost part? I haven’t really noticed KV being that fast in my use case.

KV is good for read-heavy, write-light workloads. If you’re not requesting very often, you may not see a speed boost, but if you are, you should see much faster times. It really does depend on access patterns and volume.


It would be awesome if there were a version of KV that stores data in more DCs than just the central ones, making it useful as a fast, worldwide caching solution. Better performance at the cost of a lower storage limit and such sounds pretty nice.

Yep, we’ve heard this from several folks, and might offer it in the future. We’ll see!


I’ve been exploring similar ideas and was wondering if an additional “middle tier” CDN with origin shield support might be an alternative to (ab)using the KV store for caching.

Argo Tiered Caching doesn’t seem to be as strict as Origin Shield from other CDNs, where it’s guaranteed that PoPs will pull data from specific shields instead of the origin.

Example:
https://docs.fastly.com/en/guides/shielding

The idea would be to have: Cloudflare Edge Cache > Fastly CDN > Shield > Origin

By using custom cache keys (immutable if possible) in Fastly/Stackpath and caching requests there for as long as possible, it should be possible to achieve the holy grail of asking the origin server just once for a file.

Have you considered this as an option?

Best

Funny you mention that. The Fastly path with Origin Shield and edge scripting was what I originally imagined us moving to, until Cloudflare announced Workers. It still bothers me that you can’t clear cache via tags without Cloudflare Enterprise. This does seem more complex than is ideal, but it might actually be the best option right now, and I wonder if anybody has actually done it. I don’t see why it wouldn’t work well, aside from the extra infrastructure.

This would actually solve another issue I’ve been having with Workers. We’re migrating to using only Workers for scripting, caching, and routing to multiple origins by path, whereas before a reverse proxy with scripting, caching, and load balancing did all of those things. Previously, the reverse proxy altered Cache-Control headers before requests reached Cloudflare, which let me have Cloudflare respect origin headers, use things like s-maxage and the revalidation rules, and still respect no-cache. Since Workers run after the fetch cache, the best way to force caching is via cf.cacheTtl, but then I risk breaking the app if I override a response that had no-cache headers. I could be smarter about this using the Cache API (see the sketch below), but because that doesn’t support tiered caching, it’s at a huge disadvantage.
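What I mean by being smarter with the Cache API is something like this sketch, which only writes to the cache when the origin didn’t opt out, at the cost of losing tiered caching:

async function respectfulCache(event) {
  const cache = caches.default
  const cached = await cache.match(event.request)
  if (cached) return cached

  const response = await fetch(event.request)
  const cc = response.headers.get('Cache-Control') || ''

  // Only store the response if the origin didn't say no.
  if (response.ok && !/no-cache|no-store|private/i.test(cc)) {
    event.waitUntil(cache.put(event.request, response.clone()))
  }
  return response
}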

If you try this, let me know how it goes! Now that Fastly has Lucet, that’s exciting too.

I might actually need to implement that sooner than I thought (as mentioned in another thread, we ran into rate-limiting issues with our current caching worker in production). My Plan B would be to rip out some parts of the worker (like the Cache API, which counts as a subrequest) and leverage Fastly for that.

Our use-case might be a bit specific (we want to cache our API) but we still draw benefits from a worker:

  • 99% of our requests are POST, and virtually no CDN can (or rather wants to) cache those. With a worker we can convert them to GET
  • Many different requests can in effect result in the same origin response, hence we need a worker to calculate a custom cache key for us (based on normalized request payload)
  • Cache invalidation: we do this “entity”-based. E.g. we have 1000 entities with their lastModifiedAt timestamps stored in Workers KV, and we look up this modification date for the requested entity every time we receive a request.

Right now we use the above in combination with the Cache API with great success but unfortunately ran into those pesky rate-limiting issues.

My Plan B solution would be to keep most of the logic in the worker, but instead of using the Cache API, just forward the (now) GET request with an x-cache-key header (computed cache key + entity lastModifiedAt) to Fastly, use our custom cache key over there, and store the response to infinity.

By using an immutable cache key (sha256(normalized payload) + entity lastModifiedAt), the Fastly CDN can stay a rather dumb cache while the caching logic remains in the worker (this avoids stale-content issues).
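Sketched out, with hypothetical names (ENTITY_KV as the lastModifiedAt store, FASTLY_HOST as the middle tier, entityId as the identifying field in our payload):

async function forwardWithCacheKey(request) {
  // Normalize the POST payload so equivalent requests hash identically
  // (simplistic here: only sorts top-level keys).
  const payload = await request.json()
  const normalized = JSON.stringify(payload, Object.keys(payload).sort())

  // sha256(normalized payload)
  const digest = await crypto.subtle.digest('SHA-256',
    new TextEncoder().encode(normalized))
  const hash = [...new Uint8Array(digest)]
    .map(b => b.toString(16).padStart(2, '0')).join('')

  // Entity-based invalidation: folding lastModifiedAt into the key means
  // a changed entity produces a brand-new key that has never been cached.
  const lastModified = await ENTITY_KV.get(`entity:${payload.entityId}`)
  const cacheKey = `${hash}-${lastModified}`

  // Forward as a cacheable GET to the middle-tier CDN.
  return fetch(`https://FASTLY_HOST/lookup?key=${cacheKey}`, {
    method: 'GET',
    headers: { 'x-cache-key': cacheKey },
  })
}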

The CF edge cache would be caching the Fastly response as well.

As a nice side effect, we could use the Origin Shield feature and should achieve our goal of hitting the origin only once for a given request, until the affected “entity” changes.

If a new/uncached request is made, or the affected entity has changed since, the request would go through to our origin servers but be cached immediately by its shield, so the same request from different countries would not hit our origin server again (which is currently the case with CF edge caching alone). 🙂