Though Edge Cache Enabled, 17k misses for every 14 Hits

WPEngine hit me up about some massive bandwidth usage and we are trying to figure out why, since most of the site is cached.

One such issue is a page on my site that Cloudflare claims has received 17k hits today, most of them from Singapore (not sure why that is). Of these 17k hits, only 14 were served from Edge cache.

I opened the page and it seems to be properly cached by Cloudflare:

Age: 557
Cache-Control: max-age=7200, must-revalidate
Cf-Cache-Status: HIT
Cf-Edge-Cache: cache,platform=wordpress

So not sure why 99.9% of requests are Misses.

Any suggestions? I created a wildcard page rule focusing on that page, but not sure if that’s enough.

I may have to do this for multiple pages as I continue to hunt down the bandwidth, a lot of which seems to come from Singapore requests.

All feedback appreciated.

Hi,

First, make sure your server only accepts requests from Cloudflare. Otherwise your origin infrastructure is vulnerable to attacks via direct requests.

Also, check Analytics & Logs and filter those requests to see if there’s any pattern other than country of origin.

You can then craft a Firewall / Custom Rule with the Managed Challenge action, applied either to all requests from Singapore or to requests matching any pattern you may have identified.

If the excessive requests are for URLs containing random query strings (which usually lead to Cloudflare cache being bypassed), you could create a rule for when there’s a query string and it is not one of the most commonly used by your website.

Ok, wouldn’t a star (*) at the end of the rule tell Cloudflare not to let random query strings at the end bypass the cache?

Are you talking about a Page Rule?

If so, what happens is that, yes, Cloudflare would be instructed to cache every URL, including those with a query string. However, if there’s an attack with random query strings, each different query string would get its own cache entry, which would be cached on the first visit but never requested again, defeating the purpose of this rule.
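To make that concrete, here is a minimal sketch in Python of a cache keyed on the full URL, query string included. It is a toy model and not how Cloudflare is implemented; the `simulate` function and the `cb` parameter (a stand-in for a random query string) are made up for illustration.

```python
def simulate(urls):
    """Count (hits, misses) for a toy cache keyed on the full URL."""
    cache = set()
    hits = misses = 0
    for url in urls:
        if url in cache:
            hits += 1
        else:
            misses += 1
            cache.add(url)  # stored on first visit
    return hits, misses


base = "https://example.com/shop/"

# 1000 visitors requesting the same URL: one miss, then hits.
normal = [base] * 1000

# 1000 requests, each with a unique query string (hypothetical "cb" parameter).
attack = [f"{base}?cb={i}" for i in range(1000)]

print(simulate(normal))  # (999, 1)
print(simulate(attack))  # (0, 1000)
```

Every attack request is stored but never read again, which matches the kind of ratio reported in the first post.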

What you can do if that’s the case is to use a Transform Rule to remove the query string if it’s not one of those commonly used by your website (such as s=... for search, or utm_source=... for analytics, etc.)
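The decision such a rule makes can be sketched in Python as follows. This only models the logic; it is not Cloudflare's rule syntax. The `normalize` function and the `ALLOWED_PARAMS` set are assumptions for illustration, with the two parameter names taken from the examples above.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# Assumption: the only query parameters the site really uses.
ALLOWED_PARAMS = {"s", "utm_source"}


def normalize(url, allowed=ALLOWED_PARAMS):
    """Drop the query string unless it contains at least one allowed parameter."""
    parts = urlsplit(url)
    if not parts.query:
        return url  # nothing to strip
    # keep_blank_values=True so a bare "?query-a" still counts as a parameter name
    names = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    if names & allowed:
        return url  # looks like a legitimate request, leave it alone
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


print(normalize("https://example.com/shop/?query-a"))  # https://example.com/shop/
print(normalize("https://example.com/?s=shoes"))       # https://example.com/?s=shoes
```

Note that in this sketch a request that mixes an allowed parameter with junk keeps its whole query string, so it would still get its own cache entry.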

This look about right?

Also, how would I test if this is working? Is there a header to confirm?

You can just “preserve path” instead of rewriting it to the same value.

Also, you should start with a condition for when the query string is NOT empty:

AND
URI Query String does not equal ""

You should also make sure that the path in question is not part of a path used for JS or CSS files; otherwise, you need to add exceptions accordingly.

Transform Rules will not add a header automatically, though you could set up another Transform Rule for that if needed.

But you’ll see it’s working by visiting the path with different query strings, and if it’s working as expected you should see CF-Cache-Status: HIT despite changing the query string. That’s because Transform Rules run before cache, so the query string will be dropped and the requests:

https://example.com/shop/?query-a
https://example.com/shop/?query-b

etc., will all be served out of https://example.com/shop/
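If you'd rather script that check than reload the page by hand, something like the following Python sketch should do. The function names and the User-Agent string are made up for illustration, and the URL is the placeholder from above, so swap in your own page.

```python
import urllib.request


def cache_status(url):
    """Return the CF-Cache-Status response header for url, or None if absent."""
    req = urllib.request.Request(url, headers={"User-Agent": "cache-check/0.1"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("CF-Cache-Status")


def check(base_url, queries, fetch=cache_status):
    """Request base_url once per query string and collect the cache status."""
    return {q: fetch(f"{base_url}?{q}") for q in queries}


def all_hits(statuses):
    """True if every collected status is HIT."""
    return all(status == "HIT" for status in statuses.values())


# Example (performs real HTTP requests, so it is left commented out):
# statuses = check("https://example.com/shop/", ["query-a", "query-b"])
# print(statuses, all_hits(statuses))
```

A first run may still show a MISS or EXPIRED if the page is not currently cached at the edge location serving you, so run it a couple of times before drawing conclusions.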

Now, bear in mind that what started as a discussion on a way to block those requests with random query strings has shifted to a way to serve those requests out of cache, and those are two separate approaches.

When the query string is removed from the request, the request will no longer be blocked by any WAF custom rules based on the query string, as most security features also run after Transform Rules. But such requests will be served out of Cloudflare cache, preventing the hit against your origin server. They would still be blocked or challenged if you set a custom rule based on country of origin, for instance.

EDIT: fix condition for when QS is NOT empty.