"Cache Everything" – json expiring at 00h (UTC) everyday

Hi There,

I have a JSON endpoint on a specific path that I need to keep cached for as long as I can, because it’s very database intensive. It generates thousands of URLs as requested by the users, and each one should then be cached for 14 days.

But then, every day at 00h (UTC), Cloudflare clears the cache ONLY for this path, for no apparent reason:

In the graph you can see that, when I was using APO, it wasn’t holding the cache for this path. I was using the same page rule and removing all cookie headers.

With APO: requests hit the origin all day, and were served from Cloudflare only from 00h to 4am.
With Page Rule: cache is served from Cloudflare all day, and requests hit the origin from 00h until the cache is full again.

I had to create a new database only for this endpoint. Precisely at 00h, the load spikes to 40 (for a 2 vCPU managed DB).

I’m 100% sure that there’s nothing on our end purging this cache. We don’t have any code in place to clear this specific path. As I said, this path generates a few thousand URLs. There’s no way to clear them one by one manually, and I’m not purging the entire site:

There are no traffic spikes or attacks. This JSON is not a public path; it’s used only to generate some widgets in my theme:

Any ideas?

Update Jul/15: In this specific case, it wasn’t a bug in the cache. Our script had a date string that changed every day. It should have been static, but a former dev made it like this and didn’t tell us. In the end, though, this topic led to other interesting points about caching :slight_smile:

Edge Cache TTL means “each should be cached for no more than 14 days”. TTL is actually a Max setting. Cloudflare will evict anything it deems to be “wasting space” in far less than your TTL.

Now the question is why Cloudflare is singling out just this one resource for eviction. Maybe @yevgen has some insight into the differing behavior.

I knew that CF could purge cache that is not being used, to save space. But it makes no sense to purge an entire path at once. Those URLs are being accessed, as we can see from the huge spikes I get in my database.

If it’s a matter of saving space, it should be deleting specific URLs that are not accessed frequently…

Here we go again…

This database is dedicated to this specific path, and is receiving only 70% of the requests. I’m redirecting the other 30% to another DB to try to balance the load.

Please raise a support ticket and we will pass it to the Cache team.

Hi Yevgen,

I tried, but I couldn’t. Yesterday I waited for more than an hour on this screen before creating the topic here in the Community.

It says that it’s running some diagnostics, and the “Next” button is disabled.

I’ll try again later today…

Just do it via email from the address associated with your account: support AT cloudflare DOT com and post the ticket number here when you get a reply.

Done. Ticket number 2205315

When did the issue start to happen? Later today I will roll out a fix for the APO integration with Cache Everything that could make things better.

I’ve rolled out the change, please let me know if it made any difference.

Oh cool! I’ll test it right now.

This issue got mixed up with a previous ticket. I wasn’t using APO anymore. Now I am, and it seems to be working, with one problem: it’s not respecting the “Origin Cache Control” rule anymore.

content-type: application/json; Charset=utf-8
cf-ray: 66ef2e8ff826f1d2-GRU
age: 8021
cache-control: Public, max-age=600 , stale-if-error=172800
link: <https://tecnoblog.net/wp-json/>; rel="https://api.w.org/"
strict-transport-security: max-age=31536000; includeSubDomains; preload
vary: Accept-Encoding
cf-cache-status: HIT
access-control-allow-headers: Authorization, X-WP-Nonce, Content-Disposition, Content-MD5, Content-Type
access-control-expose-headers: X-WP-Total, X-WP-TotalPages, Link
cf-apo-via: tcache
cf-edge-cache: cache,platform=wordpress
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
ngx-cache-status: BYPASS
x-content-type-options: nosniff
x-robots-tag: noindex
report-to: {"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v3?s=nnaEbbtK%2BunuVZCFScXPS63uk8dbsBJnGYWBwQ%2BLwhFUFk25gB%2BjZFNNWV3qANUWON3YoqYZ3JP9qdXWRE1Svi%2B8VBGwtAIAZsHEw07RAbFJ%2Bt17RpjaZceKl7YfaW4%3D"}],"group":"cf-nel","max_age":604800}
nel: {"report_to":"cf-nel","max_age":604800}
server: cloudflare
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400, h3=":443"; ma=86400

The setup: JSON + Cache Everything + Origin Cache Control + APO.
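
For anyone who wants to check the same thing on their own responses, here is a minimal Python sketch of the comparison I did by eye. It takes the `age` and `cache-control` values from the response headers above and checks whether a cached object is being served past the origin’s `max-age`. The function names are my own (hypothetical helpers, not a Cloudflare API), and it only looks at `max-age`, so treat it as a rough check rather than a full HTTP caching model.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument-or-None}, lowercased."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def served_past_max_age(headers: dict) -> bool:
    """True if the response's Age exceeds the max-age sent by the origin."""
    cc = parse_cache_control(headers.get("cache-control", ""))
    max_age = cc.get("max-age")
    if max_age is None:
        return False  # no max-age to compare against
    return int(headers.get("age", "0")) > int(max_age)


# Values copied from the response headers above.
response = {
    "age": "8021",
    "cache-control": "Public, max-age=600 , stale-if-error=172800",
    "cf-cache-status": "HIT",
}

# 8021 seconds of age against a 600-second max-age: the HIT outlived the origin TTL.
print(served_past_max_age(response))
```

With Origin Cache Control respected, I would expect this to stay false for a HIT, since the object should be revalidated after 600 seconds.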

About the supposed bug mentioned in the title

I owe you guys an apology. The URLs had a maxDate string that was updating every day. The dev responsible for the code hasn’t worked with us for more than a year. At the time, he only mentioned that the date was static, that the script was heavy, and that a long cache (7-day TTL) should be enough. Even he didn’t realise that, with a dynamic string, each URL was only going to be cached for one day… :roll_eyes:
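
To show why the changing string mattered, here is a minimal Python sketch. The endpoint path, the `category` parameter, the date format of `maxDate`, and the helper names are all assumptions for illustration (this is not our real code). The only point is that a cache keyed on the full URL sees a brand-new URL every day, so the previous day’s copies are never hit again no matter how long the TTL is.

```python
from datetime import date
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical endpoint; the real path is not shown in this thread.
BASE = "https://example.com/wp-json/widgets/v1/items"


def build_url(category: str, today: date) -> str:
    """Build the URL the way the old script did: maxDate follows the current day."""
    return BASE + "?" + urlencode({"category": category, "maxDate": today.isoformat()})


def strip_param(url: str, name: str) -> str:
    """Return the URL without the given query parameter (a stable cache key)."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != name]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment))


# Two arbitrary consecutive days, chosen only for the example.
day1 = build_url("phones", date(2021, 7, 14))
day2 = build_url("phones", date(2021, 7, 15))

# Different URLs on consecutive days, so yesterday's cached copy is never reused.
print(day1 == day2)
# With the date removed (or made static), both days map to the same key.
print(strip_param(day1, "maxDate") == strip_param(day2, "maxDate"))
```

Making the date static in the script has the same effect as stripping it here: the URL stops changing at midnight, and the long TTL finally applies.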

Putting this “string issue” aside (which was not related to APO, since I wasn’t using it anymore), there really was an issue with caching JSON when APO was enabled, and it is now fixed (after @yevgen’s update). But now I can’t control the cache using Origin Cache Control anymore…

At the moment, APO’s 30-day caching is applied; we allow overriding it with an Edge Cache TTL page rule (see “Page Rule integration with APO” in the Cloudflare Automatic Platform Optimization docs). I’m not sure Origin Cache Control should take precedence in a setup where APO is in place.

I thought the default behavior of APO would be to bypass JSON requests, which would then be handled by the Page Rule.

I tried the new cdn-cache-control header, and it didn’t work either. =/

Since it’s JSON, there are some URLs where I need short TTLs (like 10 min) and others that I can cache for a day. I can’t do that with Edge Cache TTL.

So for now, the only option would be to disable APO, right?
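
To make the per-URL TTL requirement concrete, this is roughly what the origin does when it picks its Cache-Control header, written as a small Python sketch. The path prefixes and the default are made up for illustration (the real routes are not listed here); only the 10-minute and 1-day values come from what I described above.

```python
# Hypothetical path prefixes; only the TTL values (10 min, 1 day) come from the post.
TTL_RULES = [
    ("/wp-json/widgets/v1/latest", 600),     # short-lived data: 10 minutes
    ("/wp-json/widgets/v1/archive", 86400),  # stable data: 1 day
]
DEFAULT_TTL = 600  # assumed fallback for anything not listed


def cache_control_for(path: str) -> str:
    """Pick the Cache-Control value the origin would send for a given path."""
    for prefix, ttl in TTL_RULES:
        if path.startswith(prefix):
            return f"public, max-age={ttl}"
    return f"public, max-age={DEFAULT_TTL}"


print(cache_control_for("/wp-json/widgets/v1/latest?category=phones"))
print(cache_control_for("/wp-json/widgets/v1/archive/2020"))
```

A single Edge Cache TTL value on the page rule can’t express this, which is why I depend on the origin header being honored.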

I see your point. I will make sure that when APO + Cache Everything are in place, we bypass the APO caching rules and the default logic of Origin Cache Control applies.

In the future, it would be great if APO TTL could be controlled using the new cdn-cache-control and cloudflare-cdn-cache-control headers. It would simplify the process in cases where so many specific rules are needed :slight_smile:

I am planning to work on a change that will apply the following precedence, from the most important to the least:

edge TTL → cloudflare-cdn-cache-control → cdn-cache-control → default APO TTL (30 days)

In case of APO + Cache Everything we won’t apply default APO TTL so the precedence will be:

edge TTL → cloudflare-cdn-cache-control → cdn-cache-control → cache control

It means that:

Resources that match a Cache Everything Page Rule are still not cached if the origin web server sends a Cache-Control header of max-age=0, private, no-cache, or an Expires header with an already expired date. Include the Edge Cache TTL setting within the Cache Everything Page Rule to additionally override the Cache-Control headers from the origin web server.
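
As a rough illustration of the planned precedence above, here is a Python sketch that resolves a TTL from the first source that provides one. It is a simplified model, not Cloudflare’s implementation: the function names are hypothetical, it reads only `max-age` from each header, and it does not model the "not cached" cases (`private`, `no-cache`, expired `Expires`) described in the quoted paragraph.

```python
from typing import Optional

APO_DEFAULT_TTL = 30 * 24 * 3600  # default APO TTL of 30 days, in seconds


def max_age(value: Optional[str]) -> Optional[int]:
    """Extract max-age (seconds) from a cache-control style header value, if present."""
    if not value:
        return None
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name.strip().lower() == "max-age" and arg.strip().isdigit():
            return int(arg.strip())
    return None


def effective_ttl(edge_ttl: Optional[int], headers: dict, cache_everything: bool) -> Optional[int]:
    """Return the TTL from the highest-precedence source that sets one."""
    candidates = [
        edge_ttl,                                              # Edge Cache TTL page rule
        max_age(headers.get("cloudflare-cdn-cache-control")),
        max_age(headers.get("cdn-cache-control")),
    ]
    if cache_everything:
        # APO + Cache Everything: fall back to the origin's Cache-Control.
        candidates.append(max_age(headers.get("cache-control")))
    else:
        # APO alone: fall back to the default APO TTL.
        candidates.append(APO_DEFAULT_TTL)
    for ttl in candidates:
        if ttl is not None:
            return ttl
    return None


origin = {"cache-control": "Public, max-age=600 , stale-if-error=172800"}
print(effective_ttl(None, origin, cache_everything=True))   # origin max-age wins
print(effective_ttl(None, origin, cache_everything=False))  # default APO TTL wins
```

Under this model, a response with only an origin `cache-control` header keeps its 600-second TTL when Cache Everything is in place, and gets the 30-day default when only APO applies.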

That’s great!

In both cases, the priorities are only to control the TTL, right? APO would still be in front of everything, serving the cache.

correct.

Hi Yevgen, sorry to bother you again, but do you have an ETA for this implementation?

Just asking to prevent the topic from closing tomorrow.