Asset unexpectedly removed from the edge cache

Prerequisites:

  1. I have a Flutter PWA web application and I intend to use the Cloudflare CDN. I use the free plan for development.

  2. Such an application needs one index.html and one main.js; instead of hundreds of .css, .js and .png assets, it uses thousands of binaries (passed as parameters to Flutter widgets).

  3. All my binaries are stored in one large file and are retrieved via HTTP range requests. This solution (one large binary file instead of thousands of small ones) simplifies installing the PWA for offline use.
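
To illustrate the layout described in item 3, here is a minimal Python sketch of how a client could turn byte spans inside the single large file into a `Range` header. The `ASSET_INDEX` table and its entry names are hypothetical; the thread does not show how the application locates entries. The offsets are chosen to reproduce the ranges used in the curl command further down.

```python
def build_range_header(spans):
    """Build an HTTP Range header value from (offset, length) pairs."""
    parts = []
    for offset, length in spans:
        if offset < 0 or length <= 0:
            raise ValueError("offset must be >= 0 and length must be > 0")
        # Range end positions are inclusive: the last byte is offset + length - 1.
        parts.append(f"{offset}-{offset + length - 1}")
    return "bytes=" + ",".join(parts)


# Hypothetical index: entry name -> (offset, length) inside the large file.
ASSET_INDEX = {
    "entry_a": (30000000, 101),
    "entry_b": (38000000, 101),
}

header = build_range_header([ASSET_INDEX["entry_a"], ASSET_INDEX["entry_b"]])
print(header)  # bytes=30000000-30000100,38000000-38000100
```

Requesting several spans in one header is what produces a `multipart/byteranges` response; a single span yields a plain `206 Partial Content` body.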

There is a caching problem with this file: it is unexpectedly removed from the edge cache if no request is made for it for several hours.

Configuration

  1. I do not use any cache-control headers on my source server.

  2. I have set this CF Page Rule:
    Browser Cache TTL: a month, Cache Level: Cache Everything, Edge Cache TTL: a month

  3. The first range request (after Purge Everything) returns cf-cache-status: MISS or EXPIRED.
    The following range requests for the same file correctly return cf-cache-status: HIT.

After a few hours without any request to this asset, it is removed from the CF edge cache. The next request for it is served from the source server again (cf-cache-status: MISS or EXPIRED).

Questions:

  1. Is this the expected behavior?
  2. Is it a limitation of the free plan?
  3. Or is it my mistake, in the headers set on the source server, in the CF configuration or in the web browser request headers?

Response dump

curl -i https://data.wikibulary.com/data/plain/fr.proto -H "Range: bytes=30000000-30000100,38000000-38000100"

Part of response:

HTTP/1.1 206 Partial Content
Date: Wed, 27 Jan 2021 12:58:47 GMT
Content-Type: multipart/byteranges; boundary=00000000000032713278
Content-Length: 460
Connection: keep-alive
Set-Cookie: __cfduid=d30256c8a83fa2a0ac67f5d343e23f9481611752327; expires=Fri, 26-Feb-21 12:58:47 GMT; path=/; domain=.wikibulary.com; HttpOnly; SameSite=Lax; Secure
Last-Modified: Wed, 27 Jan 2021 12:37:46 GMT
ETag: 0x8D8C2C0594F33D4
x-ms-request-id: ca9249a6-d01e-0048-71ab-f4c332000000
x-ms-version: 2009-09-19
x-ms-lease-status: unlocked
x-ms-blob-type: BlockBlob
Access-Control-Expose-Headers: x-ms-request-id,Server,x-ms-version,Content-Type,Content-Encoding,Content-Language,Cache-Control,Last-Modified,ETag,x-ms-lease-status,x-ms-blob-type,Content-Length,Date,Transfer-Encoding,cf-cache-status
Access-Control-Allow-Origin: *
Cache-Control: max-age=2678400
CF-Cache-Status: HIT
Age: 386
cf-request-id: 07e585dfab0000412bd6b36000000001
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Report-To: {"group":"cf-nel","endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report?s=3DGPosxWBJBe0NTpAx0PPNTlAj%2BNgD9bJ04eznQnva8l2KxB1VEQ0lpe%2FwCYijoaTTEufxb954GFR%2BDBCj1wP%2B44d128RJwOIS6ZG2hHgb29A8Fz"}],"max_age":604800}
NEL: {"report_to":"cf-nel","max_age":604800}
Server: cloudflare
CF-RAY: 6182a5ac4eaa412b-PRG


--00000000000032713278
Content-Type: application/x-protobuf
Content-Range: bytes 30000000-30000100/90417402

This is normal. Regardless of what you set Edge Cache TTL to, unused assets will be evicted to make room for others. Edge Cache TTL is more like a “How long will it be ok to cache a file so it doesn’t mess anything up?” setting.
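
The difference between a TTL and actual retention can be shown with a toy model. The sketch below is a plain least-recently-used cache with a fixed capacity; it is an assumption made for illustration only and is not Cloudflare's actual eviction algorithm. The class name `ToyEdgeCache` and the status strings are hypothetical, borrowed from the cf-cache-status values for readability.

```python
from collections import OrderedDict


class ToyEdgeCache:
    """Capacity-bound LRU cache where the TTL is only an upper bound."""

    def __init__(self, capacity, ttl):
        self.capacity = capacity
        self.ttl = ttl
        self._items = OrderedDict()  # key -> time the object was stored

    def get(self, key, now):
        stored_at = self._items.get(key)
        if stored_at is None:
            # Not cached: fetch from origin, store, then make room if needed.
            self._items[key] = now
            self._evict()
            return "MISS"
        if now - stored_at > self.ttl:
            # Cached but older than the TTL: refresh from origin.
            self._items[key] = now
            self._items.move_to_end(key)
            return "EXPIRED"
        # Fresh entry: mark it as most recently used.
        self._items.move_to_end(key)
        return "HIT"

    def _evict(self):
        # Drop least recently used entries until the cache fits its capacity.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)


MONTH = 30 * 24 * 3600
cache = ToyEdgeCache(capacity=2, ttl=MONTH)
print(cache.get("big-file", now=0))      # MISS
print(cache.get("big-file", now=10))     # HIT
print(cache.get("other-1", now=20))      # MISS
print(cache.get("other-2", now=30))      # MISS, pushes out "big-file"
print(cache.get("big-file", now=40))     # MISS again, long before the TTL
```

In this model the one-month TTL never comes into play: the file is dropped because other objects were requested more recently, which matches the behavior described in the question.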

+1 to this. We physically cannot store all of our users’ cached assets at all times on all servers in all of our data centers, so this process of cache eviction for lesser-used assets is normal. When the assets heat up (i.e., when more people try to use your PWA at the same time), they won’t be evicted. While I can’t remember the exact details of the ordering of cache eviction when range requests are at play, I wouldn’t be surprised if part of the reason why it’s being evicted from the cache is precisely because it’s one big file. You might find that if you split it up into individual files, the caching system might be able to better prioritize assets for keeping/removal and, as such, see lower eviction rates.

A much simpler alternative possibility: you’re just not hitting the same data centers/servers or are just having some bad upstream routing luck, so when you look for the cached file a few hours later, you hit a server that doesn’t have it cached.
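
One way to check this possibility is to record which data center answered each probe. The Python sketch below parses the header text printed by `curl -i` and extracts the cache status, the `Age` value and the data center code, which is the suffix of the `CF-RAY` header (`PRG` in the response dump above). The function name `summarize_cache_headers` is hypothetical, and the sample is abbreviated from the dump in the question.

```python
def summarize_cache_headers(raw_headers):
    """Extract cache status, age and data center code from raw header text."""
    headers = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skips the status line and blank lines
            headers[name.strip().lower()] = value.strip()
    ray = headers.get("cf-ray", "")
    # CF-RAY has the form "<request id>-<data center code>".
    colo = ray.rsplit("-", 1)[1] if "-" in ray else None
    age = headers.get("age")
    return {
        "cache_status": headers.get("cf-cache-status"),
        "age": int(age) if age is not None else None,
        "colo": colo,
    }


sample = """HTTP/1.1 206 Partial Content
Date: Wed, 27 Jan 2021 12:58:47 GMT
CF-Cache-Status: HIT
Age: 386
CF-RAY: 6182a5ac4eaa412b-PRG"""

print(summarize_cache_headers(sample))
# {'cache_status': 'HIT', 'age': 386, 'colo': 'PRG'}
```

If a MISS after a few idle hours comes with a different data center code than the earlier HIT, the request was simply routed elsewhere; if the code is the same, eviction is the more likely explanation.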

Argo Tiered Caching can also substantially increase cache hit rates across the board, so you might give that a try depending on how much this bothers you.

Hope this helps,
Jon

Thank you very much for the detailed and logical explanation. Unfortunately, splitting one big file into small ones is not feasible for me.
My application aggregates dozens of different open source language dictionaries into one. It contains tens of millions of small assets. The probability that 2 people on each of the 200 Cloudflare servers will request the same entry (within an interval of a few hours) is negligible.
It’s a pity that Cloudflare can’t solve my problem, because the response speed for autocomplete and filtering is incredible.

It sounds like what you need for that file is a more traditional CDN, like BunnyCDN. You’d be able to set up a ‘cdn’ subdomain set to grey cloud (DNS only, not proxied through Cloudflare) and let Bunny’s global network handle the file(s).