Hello, curious if anyone has found a clever way to trick the Cloudflare rate limiting feature to do limits by ASN rather than IP? I’m dealing with a website that is constantly posted on Facebook, which then causes the user agent “facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)” from FB’s ASN 32934 to DDoS the site with literally tens of thousands of concurrent connections. Unfortunately the connections are spread among only a thousand or two IPv6 addresses, and each address only makes a few requests. This has made a rate limit rule of even 2 per 1 minute ineffective, because there’s still thousands that get through at the same time. That bot also does not honor robots.txt. Blocking it completely would be undesirable because there are rare occasions it behaves nicely and does what it is supposed to, which is load the thumbnail and metadata of the target page for others on FB to see.
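To make the arithmetic in the post above concrete, here is a rough sketch (not Cloudflare's actual implementation) comparing how many requests get through one rate-limit window when the counter is keyed per IP versus per ASN. The burst shape (2,000 addresses, 5 requests each), the `allowed_requests` helper, and the documentation-prefix addresses are assumptions made up for illustration.

```python
from collections import Counter

def allowed_requests(requests, limit, key):
    """Count requests admitted in one window when each bucket may pass `limit`."""
    seen = Counter()
    admitted = 0
    for req in requests:
        bucket = key(req)
        if seen[bucket] < limit:
            seen[bucket] += 1
            admitted += 1
    return admitted

# Hypothetical burst: 2,000 IPv6 addresses in one ASN, 5 requests each.
burst = [
    {"ip": f"2001:db8::{i:x}", "asn": 32934}
    for i in range(2000)
    for _ in range(5)
]

per_ip = allowed_requests(burst, limit=2, key=lambda r: r["ip"])
per_asn = allowed_requests(burst, limit=2, key=lambda r: r["asn"])
print(per_ip, per_asn)  # 4000 2
```

With a limit of 2 per window, keying on the IP still admits 4,000 requests because every address gets its own allowance, while keying on the ASN admits only 2.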
I would look at creating an IP Access rule for Facebook’s ASN to block the requests. Rate limiting sounds like what is needed, but it matches on the requested URL rather than on the source of the request, whereas IP Access rules match on the source IP or ASN. This means you wouldn’t be able to restrict the rate limiting to just Facebook.
I created a mildly functional workaround by doing a rate limiting rule on * and then a ‘bypass’ firewall rule for rate limiting against all user agents that do not match the pattern facebookexternalhit. This has had the semi-desirable result of only rate limiting the facebookexternalhit bot, and even today it turned what was going to be a 13,000 simultaneous request deluge into ~1000 simultaneous requests, but 1000 is still more than we want hitting the dynamic pages at the same time. I’d really like to avoid blocking entirely because it makes for ugly target URL posting and this site would like to have them display properly, but if there is no way to make that bot behave properly, we may have to go that route.
My understanding is that FB is just looking for the OG metadata on the page. Are you able to ensure that when Facebook makes the request, the page is served from Cloudflare’s cache? Even if the content is generally not cached, you might be able to use various CF features to generate a cacheable page just to be served to Facebook.
Are your site’s URLs cacheable? If so, you can use Cloudflare Transform Rules to rewrite the URL and strip the query string, if any, from what Cloudflare will evaluate as the request URL: https://developers.cloudflare.com/rules/transform
See some examples at https://developers.cloudflare.com/rules/transform/url-rewrite/examples
The closest example would be https://developers.cloudflare.com/rules/transform/url-rewrite/examples#rewrite-url-query-string-of-blog-visitors but instead of rewriting the query string to the static value ?sort-by=date, you’d just rewrite to static and leave the field empty. So you preserve the intended path but statically rewrite the query string to nothing. With the documentation’s example, requests to /blog/?querystring would be evaluated by Cloudflare and transformed to /blog/
That’s what I do on my forums and WordPress sites, but I target a list of known bad ASNs and specific cookies (or the absence of specific cookies) so that the Cloudflare Transform Rules that strip query strings only apply to guest visitors and not to logged-in visitors.
Using a Cloudflare Transform Rule gives you finer control over when it should apply, i.e. you can set criteria by ASN, country, hostname, continent, source IP, referer, request method, cookies, user agent and even query strings that you want to include in or exclude from the transform process. You can target ASN 32934 only to strip query strings.
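As a rough illustration of what such a rule does to the URL, here is a small Python sketch that mimics the rewrite locally. It is not Cloudflare configuration; `rewrite_for_cache` and the example.com URLs are hypothetical names used only to show the effect of stripping the query string for one ASN.

```python
from urllib.parse import urlsplit, urlunsplit

FACEBOOK_ASN = 32934  # ASN mentioned in the original question

def rewrite_for_cache(url, asn):
    """Drop the query string for requests from the targeted ASN; leave others alone."""
    parts = urlsplit(url)
    if asn == FACEBOOK_ASN and parts.query:
        parts = parts._replace(query="")
    return urlunsplit(parts)

print(rewrite_for_cache("https://example.com/blog/?fbclid=abc", 32934))
# https://example.com/blog/
print(rewrite_for_cache("https://example.com/blog/?page=2", 64500))
# https://example.com/blog/?page=2
```

Because every Facebook request for the page collapses to the same URL, they can all share a single cached copy, while other visitors keep their query strings.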
Then, once you have transformed the URL, set a Page Rule with Cache Everything and set the Edge Cache TTL to the cache duration you desire, so that the /blog/ URL is cached.
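To show why caching the rewritten URL takes the load off the dynamic pages, here is a minimal sketch of a single TTL cache sitting in front of an origin. It is a toy model under stated assumptions: `EdgeCache`, `fetch_origin`, and the 300 second TTL are hypothetical, and a real CDN has many edge locations, so the origin would see more than one request.

```python
import time

class EdgeCache:
    """Toy TTL cache: serve a stored body until it expires, then refetch once."""

    def __init__(self, ttl_seconds, fetch_origin, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.fetch_origin = fetch_origin
        self.clock = clock
        self.store = {}  # path -> (expires_at, body)

    def get(self, path):
        now = self.clock()
        entry = self.store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, origin not contacted
        body = self.fetch_origin(path)  # cache miss or expired entry
        self.store[path] = (now + self.ttl, body)
        return body

origin_hits = []

def fetch_origin(path):
    origin_hits.append(path)
    return f"<html>rendered {path}</html>"

cache = EdgeCache(ttl_seconds=300, fetch_origin=fetch_origin)
for _ in range(13000):
    cache.get("/blog/")
print(len(origin_hits))  # 1
```

In this model the 13,000 request burst from the thread reaches the origin once; the remaining requests are answered from the cached copy until the TTL runs out.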
Or you can use a Transform Rule to rewrite or add a query string for ASN-specific requests, and then set a Cache Everything rule with an Edge Cache TTL only for those rewritten query strings.
You can use
If it’s an Enterprise plan, which is totally awesome to know, as I’ve not looked that deeply into Rate Limiting on an Enterprise plan.