High % of cache MISS

Hello there,

We have a low cache hit ratio on our website. From the docs:

A typical website that’s mostly made up of static content could easily have a cache hit ratio in the 95-99% range

From what I’m seeing in the dashboard, we would be around 84% (~16% MISS).

We already have a Page Rule set to Cache Everything on this website.

Our website is statically generated, and Gatsby generates a lot of “page-data.json” requests due to its prefetching mechanism.

But the MISS requests are not limited to JSON files:

[image]

We also set a caching rule on our response headers through the nginx config (could this be the cause of the problem?):

map $sent_http_content_type $expires {
  [...]
  # JSON responses (including Gatsby's page-data.json) expire after 1 minute
  application/json 1m;
}

Any help would be appreciated.

How much traffic are you getting? If you don’t get a lot of traffic, or your traffic is very geographically diverse, you may see lower cache hit ratios, as Cloudflare evicts the least-used assets from cache first.

Traffic is on the order of several hundred thousand visitors over the last 24 hours, so I would assume this is not the issue.

On the geographical aspect, though, pretty much all of the traffic is split between Western Europe and the United States.

Hmm, in that case I would expect a higher cache hit ratio. Could you share some of the URLs that return CF-Cache-Status: miss? In particular, check if any of them contain query parameters that could bust the cache. You may also greatly benefit from enabling Tiered Caching if you haven’t already.
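In case it helps with collecting those URLs, here is a minimal Python sketch for checking the CF-Cache-Status header in bulk and flagging URLs that carry a query string. The example URL and the User-Agent string are placeholders, not values from this thread. Keep in mind that each request can itself warm the cache in the colo that serves you, so a MISS may become a HIT on the next run.

```python
from collections import Counter
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def cache_status(url: str, timeout: float = 10.0) -> str:
    """Fetch `url` and return its CF-Cache-Status header, upper-cased.

    Returns "NONE" when the header is absent, for example when the
    response was not served through Cloudflare's cache.
    """
    # The User-Agent value is an arbitrary placeholder.
    request = Request(url, headers={"User-Agent": "cache-check-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return (response.headers.get("CF-Cache-Status") or "NONE").upper()


def has_query_string(url: str) -> bool:
    """True when the URL carries query parameters that may split the cache key."""
    return bool(urlsplit(url).query)


def summarize(statuses: dict[str, str]) -> Counter:
    """Tally statuses such as HIT, MISS, EXPIRED or REVALIDATED across URLs."""
    return Counter(status.upper() for status in statuses.values())


# Example usage (performs real network requests, so it is left commented out).
# The URL below is hypothetical; use the ones from your DevTools Network tab.
# results = {u: cache_status(u) for u in ["https://example.com/page-data/index/page-data.json"]}
# print(summarize(results))
```

Running it a few times over the same list, a minute or so apart, should show whether the MISSes cluster on specific URLs or move around.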

Could you share some of the URLs that return CF-Cache-Status: miss?

Sure, if you go to https://www.scaleway.com and open the “Network” tab in DevTools, you will see “page-data.json” requests as you hover over some links. Some of them are MISSes.

In particular, check if any of them contain query parameters that could bust the cache

I don’t see any query params on the majority of the requests that are in the MISS state.

You may also greatly benefit from enabling Tiered Caching if you haven’t already.

I agree, I feel that Tiered Caching would help with the MISSes, as they seem related to the geographical aspect.

Thank you so much for your help, we’ll dig into that.

I see 7 page-data.json requests, all are either expired/revalidated or hit…

I see 7 page-data.json requests, all are either expired/revalidated or hit…

It’s hard to reproduce at times, but I definitely see some of them if I try hard enough.

Request URL: https://www.scaleway.com/en/docs/page-data/identity-and-access-management/secret-manager/how-to/create-secret/page-data.json

You may try https://www.scaleway.com/en/docs/ (expand the sections in the left menu and hover over all the links you can see; that should trigger a lot of .js and page-data.json requests).

I finally got a MISS on https://www.scaleway.com/en/docs/page-data/compute/instances/reference-content/cost-optimized/page-data.json. It became a HIT on the second load, though.

My best guess here is that the structure of the website means you have thousands of files, which doesn’t mix well with hundreds of Cloudflare colos and a very short 60s cache period. I’d personally try:

  1. what @albert suggested: enabling tiered caching (it’s free, and it also helps if you want to reduce load on the origin).
  2. increasing the cache period, by a bit if you keep the structure as is, or by a lot if you can implement a good cache-busting method (similar to what Nuxt, Astro, etc. do with hashes or dates in file names).
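To put rough numbers on why a 60s cache period hurts when requests are spread over thousands of files and hundreds of colos, here is a back-of-the-envelope sketch. It assumes requests for one file at one colo arrive at random (Poisson) with a steady rate, and that the cache timer starts on a miss and is not reset by hits. It ignores eviction, revalidation, tiered caching and uneven popularity, so treat it as an illustration rather than a prediction of Cloudflare’s behaviour. The function names are made up for this example.

```python
def per_object_rate(requests_per_day: float, files: int, colos: int) -> float:
    """Average requests per second for one file at one colo.

    Assumes traffic is spread evenly over files and colos, which real
    traffic is not: popular pages and busy colos get far more than this.
    """
    if files <= 0 or colos <= 0:
        raise ValueError("files and colos must be positive")
    return requests_per_day / 86_400 / (files * colos)


def ttl_hit_ratio(requests_per_second: float, ttl_seconds: float) -> float:
    """Expected hit ratio for one cached object under a fixed TTL.

    Each miss is followed by a window of `ttl_seconds` in which every
    request is a hit. With Poisson arrivals that window holds
    rate * ttl hits on average, giving rate*ttl / (1 + rate*ttl).
    """
    if requests_per_second < 0 or ttl_seconds < 0:
        raise ValueError("rate and TTL must be non-negative")
    hits_per_miss = requests_per_second * ttl_seconds
    return hits_per_miss / (1.0 + hits_per_miss)


# Hypothetical illustration: a file requested about once a minute at a colo.
rate = 1 / 60
for ttl in (60, 600, 3600):
    print(f"TTL {ttl:>4}s -> expected hit ratio {ttl_hit_ratio(rate, ttl):.2f}")
```

Under these assumptions, a file requested about once a minute at a given colo gets only about a 50% hit ratio with a 60s TTL, and about 98% with a one-hour TTL.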

One note: tiered caching won’t help reduce load times, as these requests only preload pages that users may or may not visit next, and the extra hops through the cache tiers won’t reduce latency either.
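On the cache-busting idea in point 2, the usual approach is to put a short hash of the file’s content into its name, so that a changed file gets a new URL and the old one can safely be cached for a long time. A minimal sketch of that idea follows; the function names and the header values are illustrative choices, and whether Gatsby can be made to emit page-data.json under hashed names is not something established in this thread.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the file extension.

    "docs/page-data.json" becomes "docs/page-data.<hash>.json". The name
    changes whenever the content changes, so a stale copy is never served
    under the new URL.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def cache_control_for(is_hashed: bool) -> str:
    """Pick a Cache-Control value: long-lived for hashed names, short otherwise."""
    if is_hashed:
        # One year; safe only because the URL changes with the content.
        return "public, max-age=31536000, immutable"
    # Unhashed URLs keep a short lifetime, here the 60s used in this thread.
    return "public, max-age=60"


name = hashed_name("docs/page-data.json", b'{"title": "Instances"}')
print(name, "->", cache_control_for(is_hashed=True))
```

The pages that link to these files still need a short cache period (or a purge on deploy), since they are what points visitors at the new hashed names.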

Thank you for your help and time @matteo. Yes, we will try those two approaches and see how that goes.

One note: tiered caching won’t help reduce load times, as these requests only preload pages that users may or may not visit next, and the extra hops through the cache tiers won’t reduce latency either.

Interesting. I would have assumed it would have helped with the load times, as more users would potentially hit the cache. Duly noted.

No worries, I hit that site sometimes before logging in, so I’m glad it will get better :)

It depends on the location and speed of the origin. If your origin is in Paris, and I were in Paris, going through multiple hops makes no sense and would slow me down, or at least not improve anything. Of course, if I were in Auckland, things would be different. If you had more origins, it would be different too, but I presume your origins are in Paris and the EU…
