I have a JSON endpoint on a specific path that I need to keep cached for as long as I can, because it’s very database-intensive. It generates thousands of URLs as users request them, and each one should be cached for 14 days.
But then, every day at 00h (UTC), Cloudflare clears the cache ONLY for this path, for no apparent reason.
In the graph you can see that, when I was using APO, it wasn’t holding the cache for this path. I was using the same page rule and removing all cookie headers.
With APO: requests were hitting Origin all day, and Cloudflare only from 00h to 4am.
With Page Rule: Cache is served from Cloudflare all day, and from origin from 00h till it’s full again.
I had to create a new database only for this endpoint. Precisely at 00h, the load spikes at 40 (for a 2 vCPUs managed DB).
I’m 100% sure that there’s nothing on our end purging this cache. We don’t have any code in place to clear this specific path. As I said, this path generates a few thousand URLs. There’s no way to clear them one by one manually, and I’m not purging the entire site.
There are no traffic spikes or attacks. This JSON is not a public path; it’s used only to generate some widgets in my theme.
Update Jul/15: In this specific case, it wasn’t a bug in the cache. Our script had a date string that changed every day. It should have been static, but a former dev built it this way and didn’t tell us. In the end, though, this topic led to other interesting points about caching.
Edge Cache TTL means “each should be cached for no more than 14 days”. TTL is actually a Max setting. Cloudflare will evict anything it deems to be “wasting space” in far less than your TTL.
Now the question is why Cloudflare is singling out just this one resource for eviction. Maybe @yevgen has some insight into the differing behavior.
I knew that CF could purge cache that is not being used, to save space. But it makes no sense to purge an entire path at once. Those URLs are being accessed, as we can see from the huge spikes I get in my database.
If it’s a matter of saving space, it should be deleting specific URLs that are not accessed frequently…
Here we go again…
This database is dedicated to this specific path, and is receiving only 70% of the requests. I’m redirecting the other 30% to another DB to try to balance the load.
Please raise a support ticket and we will pass it to the Cache team.
I tried, but I couldn’t. Yesterday I waited for more than an hour on this screen before creating the topic here in the Community.
It says that it’s running some diagnostics, and the “Next” button is disabled.
I’ll try again later today…
Just do it via email from the address associated with your account:
support AT cloudflare DOT com and post the ticket number here when you get a reply.
Done. Ticket number
When did the issue start happening? I will roll out a fix later today for the APO integration with Cache Everything, which could make things better.
I’ve rolled out the change, please let me know if it made any difference.
Oh cool! I’ll test it right now.
This issue got mixed up with a previous ticket. I wasn’t using APO anymore. Now I am, and it seems to be working, with one problem: it’s not respecting the “Origin Cache Control” rule anymore.
content-type: application/json; Charset=utf-8
cache-control: Public, max-age=600 , stale-if-error=172800
link: <https://tecnoblog.net/wp-json/>; rel="https://api.w.org/"
strict-transport-security: max-age=31536000; includeSubDomains; preload
access-control-allow-headers: Authorization, X-WP-Nonce, Content-Disposition, Content-MD5, Content-Type
access-control-expose-headers: X-WP-Total, X-WP-TotalPages, Link
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400, h3=":443"; ma=86400
The setup: JSON + Cache Everything + Origin Cache Control + APO.
About the supposed bug mentioned in the title
I owe you guys an apology. The URLs had a maxDate string that was updated every day. The dev responsible for the code hasn’t worked with us for more than a year. At the time, he only mentioned that the date was static, that the script was heavy, and that a long cache (7-day TTL) should be enough. Even he didn’t realise that, with a dynamic string, it would effectively only be cached for one day…
Putting this “string issue” aside (which was not related to APO; I wasn’t using it anymore), there really was an issue with caching JSON when APO was enabled, and it has now been fixed (after @yevgen’s update). But now I can’t control the cache using Origin Cache Control anymore…
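To illustrate why a daily-changing query string defeats a long TTL, here is a minimal Python sketch. It assumes the cache key is the full URL including the query string; the endpoint path, the `category` parameter, and the fixed date value are hypothetical, and only the `maxDate` name comes from the post above.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical endpoint; the real path is not given in this thread.
BASE = "https://example.com/wp-json/widgets/v1/items"


def widget_url(category: str, today: date, static_date: bool) -> str:
    """Build the URL a widget would request."""
    # Assumption: maxDate is sent as a query parameter. With static_date=False
    # it follows the calendar, so the URL changes at midnight.
    max_date = "2100-01-01" if static_date else today.isoformat()
    return f"{BASE}?{urlencode({'category': category, 'maxDate': max_date})}"


def cache_keys(days, static_date: bool) -> set:
    """Distinct URLs (assumed to equal cache keys) requested over several days."""
    return {widget_url("phones", d, static_date) for d in days}


days = [date(2021, 7, 1), date(2021, 7, 2), date(2021, 7, 3)]
print(len(cache_keys(days, static_date=False)))  # one new key per day
print(len(cache_keys(days, static_date=True)))   # a single key reused every day
```

With the dynamic date, every URL requested after midnight is new to the cache, so the first request for each one goes to the origin no matter how long the TTL is.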
At the moment, APO’s 30-day caching is applied, and we allow overriding it with an Edge Cache TTL page rule: https://developers.cloudflare.com/automatic-platform-optimization/reference/page-rule-integration. I’m not sure Origin Cache Control should take precedence in a setup where APO is in place.
I thought the default behavior of APO would be to bypass JSON requests, which would then be handled by the Page Rule.
I tried the new cdn-cache-control header, and it didn’t work either. =/
Since it’s JSON, there are some URLs that need short TTLs (like 10 min) and others that I can cache for a day. I can’t do that with Edge Cache TTL.
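As a rough sketch of what “different TTLs per URL” could look like on the origin side, the Python below picks a Cache-Control value by path prefix. The prefixes and the rule table are hypothetical; the 10-minute and 1-day values and the stale-if-error=172800 directive come from this thread. It only has an effect if the edge actually honours the origin’s Cache-Control header.

```python
# Hypothetical path prefixes; the real routes are not given in this thread.
TTL_RULES = [
    ("/wp-json/widgets/v1/prices", 600),    # changes often: 10 minutes
    ("/wp-json/widgets/v1/specs", 86400),   # changes rarely: 1 day
]
DEFAULT_TTL = 600  # assumption: fall back to the short TTL


def cache_control_for(path: str) -> str:
    """Return the Cache-Control value the origin would send for this path."""
    ttl = DEFAULT_TTL
    for prefix, rule_ttl in TTL_RULES:
        if path.startswith(prefix):
            ttl = rule_ttl
            break
    return f"public, max-age={ttl}, stale-if-error=172800"


print(cache_control_for("/wp-json/widgets/v1/specs/123"))
print(cache_control_for("/wp-json/widgets/v1/prices/123"))
```

A single Edge Cache TTL page rule applies one value to every URL it matches, which is why per-URL headers are needed here.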
So for now, the only option would be to disable APO, right?
I see your point. I will make sure that, when APO + Cache Everything are in place, we bypass the APO caching rules so that the default logic of Origin Cache Control applies.
In the future, it would be great if the APO TTL could be controlled using the new cloudflare-cdn-cache-control headers. It would simplify the process in cases where so many specific rules are needed.
I am planning to work on a change that will apply the following precedence, from the most important to the least:
edge TTL → cloudflare-cdn-cache-control → cdn-cache-control → default APO TTL (30 days)
In case of APO + Cache Everything we won’t apply default APO TTL so the precedence will be:
edge TTL → cloudflare-cdn-cache-control → cdn-cache-control → cache control
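The two precedence chains above can be sketched as a small resolver. This is only an illustration of the planned order as described in this post, not Cloudflare’s implementation; the function names, the lowercase header keys, and the max-age-only parsing are assumptions.

```python
import re
from typing import Optional

APO_DEFAULT_TTL = 30 * 24 * 3600  # default APO TTL (30 days), in seconds


def max_age(header_value: Optional[str]) -> Optional[int]:
    """Extract max-age (seconds) from a Cache-Control style header, if present."""
    if not header_value:
        return None
    match = re.search(r"max-age\s*=\s*(\d+)", header_value, re.IGNORECASE)
    return int(match.group(1)) if match else None


def effective_ttl(edge_ttl: Optional[int], headers: dict,
                  cache_everything: bool) -> Optional[int]:
    """Resolve the TTL using the precedence described above.

    Assumes header names in `headers` are lowercase.
    """
    if edge_ttl is not None:  # Edge Cache TTL page rule wins
        return edge_ttl
    for name in ("cloudflare-cdn-cache-control", "cdn-cache-control"):
        ttl = max_age(headers.get(name))
        if ttl is not None:
            return ttl
    if cache_everything:
        # APO + Cache Everything: fall back to the origin's Cache-Control
        return max_age(headers.get("cache-control"))
    return APO_DEFAULT_TTL  # APO alone: default 30 days


origin = {"cache-control": "Public, max-age=600 , stale-if-error=172800"}
print(effective_ttl(None, origin, cache_everything=True))   # origin max-age
print(effective_ttl(None, origin, cache_everything=False))  # APO default
```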
It means that:
Resources that match a Cache Everything Page Rule are still not cached if the origin web server sends a Cache-Control header of max-age=0, private, no-cache, or an Expires header with an already expired date. Include the Edge Cache TTL setting within the Cache Everything Page Rule to additionally override the Cache-Control headers from the origin web server.
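A quick way to check an origin response against the rule quoted above is sketched below in Python. It covers only the directives named in that quote, assumes lowercase header names, and treats an unparseable Expires value as already expired (an assumption in line with how HTTP caches handle invalid dates). It is not Cloudflare’s own logic.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

BLOCKING = {"private", "no-cache", "max-age=0"}


def blocks_caching(headers: dict, now: datetime) -> bool:
    """True if the origin headers would prevent caching under Cache Everything
    (with no Edge Cache TTL override), per the rule quoted above."""
    raw = headers.get("cache-control", "")
    # Normalise "max-age = 0" style spacing and case before comparing
    directives = {d.strip().lower().replace(" ", "") for d in raw.split(",")}
    if directives & BLOCKING:
        return True
    expires = headers.get("expires")
    if expires:
        try:
            when = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return True  # assumption: invalid date counts as expired
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        return when <= now
    return False


now = datetime(2021, 7, 15, tzinfo=timezone.utc)
print(blocks_caching({"cache-control": "Public, max-age=600 , stale-if-error=172800"}, now))
print(blocks_caching({"cache-control": "private, max-age=600"}, now))
```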
In both cases, the priorities are only to control the TTL, right? APO would still be in front of everything, serving the cache.
Hi Yevgen, sorry to bother you again, but do you have an ETA for this implementation?
Just asking to prevent the topic from closing tomorrow.