Page rule: match specific file extensions on all subfolders


#1

Is it possible to write a page rule that matches several specific file extensions in all subfolders?

For example, I want to match *.gz, *.zip, *.bz2, and *.jar in all subfolders of domain.com/.


#2

Nope, page rules only support wildcards, not regex.

You would either have to create a page rule for each subfolder

http://example.com/foo/*

Or for each extension

http://example.com/*.zip

Note that you can buy extra page rules if you need more of them to cover this.


#3

Thank you. Does it mean that example.com/*.zip would match zip files in any subfolder or just in the root of the domain?

Perhaps an option is to match on cookie or header. Is that possible?


#4

The wildcard will trigger with subfolders as well.
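To illustrate what that means in practice, here is a rough Python sketch of wildcard matching where `*` stands for any run of characters, slashes included. It is only a model of the behaviour described above, not Cloudflare's actual matcher; the function names and the `PATTERNS` list are made up for the example, with one pattern per extension as suggested earlier.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    # Assumption: "*" matches any sequence of characters, including "/".
    # Everything else in the pattern is treated as a literal.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(literal_parts) + "$")

def matches_any(url: str, patterns: list[str]) -> bool:
    # True if at least one wildcard pattern matches the whole URL.
    return any(wildcard_to_regex(p).match(url) for p in patterns)

# Hypothetical rule set: one wildcard pattern per extension.
PATTERNS = [
    "example.com/*.zip",
    "example.com/*.gz",
    "example.com/*.bz2",
    "example.com/*.jar",
]
```

With these patterns, `matches_any("example.com/a/b/file.zip", PATTERNS)` is true even though the file sits two folders deep, while an `.html` URL does not match.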

There is a business plan-only page rule option called “bypass cache on cookie”, but that only lets you control bypassing the CF cache.

What you could do is make sure to set the Cache-Control headers properly on your server if you’re just trying to manipulate the cache.
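As a rough idea of what that could look like at the origin, here is a minimal sketch using Python's built-in `http.server`. The extension list comes from the question above; the one-day `ARCHIVE_MAX_AGE`, the handler name, and the `serve` helper are assumptions for illustration only. A real setup would more likely do this in the web server configuration.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Extensions taken from the question; the max-age value is an assumed example.
ARCHIVE_EXTENSIONS = {".gz", ".zip", ".bz2", ".jar"}
ARCHIVE_MAX_AGE = 86400  # seconds (one day), hypothetical

def cache_control_for(request_path: str):
    # Ignore any query string and look only at the final extension.
    path = urlsplit(request_path).path
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in ARCHIVE_EXTENSIONS:
        return f"public, max-age={ARCHIVE_MAX_AGE}"
    return None

class ArchiveCachingHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Add Cache-Control to every response for an archive path
        # (this sketch does not distinguish 200 from error responses).
        value = cache_control_for(self.path)
        if value is not None:
            self.send_header("Cache-Control", value)
        super().end_headers()

def serve(port: int = 8000) -> None:
    # Serves the current directory until interrupted.
    ThreadingHTTPServer(("", port), ArchiveCachingHandler).serve_forever()
```

Calling `serve()` starts a static file server in the current directory that sends the header for archive files in any subfolder.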


#5

Well, I do set a Cache-Control header, but it seems the CF edge cache didn’t cache the files anyway. These are static files such as zip, gz, bz2, and so on.


#6

Do those files show a cf-cache-status in the response headers? If so, then that’s a good start. You can ask the edge cache to hold onto that file, but it’ll get purged from that edge server if it’s rarely used.
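If you want to script that check, something along these lines should work. It is a sketch using only the Python standard library; the function names are made up, and it assumes the server answers `HEAD` requests the same way it answers `GET`.

```python
from urllib.request import Request, urlopen

def cache_status(headers):
    # Header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":
            return value.strip().upper()
    return None  # header absent

def fetch_cache_status(url: str):
    # Assumption: a HEAD request returns the same cache headers as a GET.
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return cache_status(response.headers)
```

`fetch_cache_status` returns a value such as `HIT` or `MISS` when the header is present, and `None` when it is missing.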


#7

I see that some files do. Is there a file size limit?


#8

The file size limit is 512MB. How long a resource stays cached depends on how many requests it gets and on its headers.


#9

I found a solution, I think. I made a page rule for domain.com/* with cache level: everything. It now works. I assume it follows the Cache-Control max-age HTTP header?


#10

You can always check what you get from the server to see whether the max-age remains intact. In my personal experience (maybe I did something wrong), it did cache everything, but it overrode the max-age with a much larger number than the one I had set at my origin. I don’t see the point of that (and think it’s probably a violation of some RFCs), and it pretty much made this feature useless for me, since my website’s core business is live news data that updates every 30 seconds.
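For anyone who wants to run that comparison, here is a small Python sketch that pulls `max-age` out of a Cache-Control value and flags a mismatch between what the origin sends and what arrives through the edge. The helper names are made up, and you would still need to fetch the two header values yourself (for example, once directly from the origin and once through the proxied hostname).

```python
def max_age(cache_control):
    # Return the max-age directive as an int, or None if absent or malformed.
    if not cache_control:
        return None
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            value = value.strip().strip('"')
            return int(value) if value.isdigit() else None
    return None

def max_age_overridden(origin_header, edge_header) -> bool:
    # True only when both sides carry a max-age and the values differ.
    origin, edge = max_age(origin_header), max_age(edge_header)
    return origin is not None and edge is not None and origin != edge
```

For example, `max_age_overridden("max-age=30", "public, max-age=14400")` is true, which is the kind of override described above.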

