Referencing the following community post: Rate Limiting by User Agent Anomaly
We are an Enterprise customer currently using global rate limiting rules for our website’s navigation and product detail pages on our existing account.
We are in a similar situation as the OP from Nov 2020. For almost a year now we’ve been seeing large traffic spikes from an apparently whitelisted crawler, facebookcatalog/1.0. I can find no Facebook-sanctioned documentation for this crawler, and we’ve escalated several tickets in an attempt to get Facebook to throttle or dial down these request spikes, which seem to occur at regular 24-hour intervals (a low level of requests does continue in between). The spikes usually double or triple our normal traffic, albeit over a relatively small time span (ten minutes).

As a result we’ve enabled a blanket User Agent Blocking rule via Firewall → Tools → User Agent Blocking. We are concerned about site stability issues if the current ban we have in place for this User Agent were removed completely.
What we would like to do is institute rate limiting for this specific User Agent. Looking at the settings for creating a new rate limiting rule, the rule states: "If traffic matching the url (http|https) (.mysite.com/) from the same IP address exceeds (10) requests per (second|minute|hour)…"
These requests are coming in from many different IP addresses, most from the same country and some from overseas. We’ve confirmed that these are, in fact, Facebook-owned IP addresses, so it appears to be legitimate crawler traffic.
Further down in the Create a Rate Limiting Rule interface there is a conditional HTTP Response Header(s) check, with a default of Cf-Cache-Status Not Equals HIT.
…
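For context, here is roughly what a rule of that shape corresponds to in the zone-level Rate Limiting API, as we understand it; the URL pattern, threshold, period and timeout below are placeholders rather than our production values:

```json
{
  "description": "Placeholder example of our current global rate limit shape",
  "match": {
    "request": {
      "schemes": ["HTTP", "HTTPS"],
      "url": "*.mysite.com/*"
    },
    "response": {
      "headers": [
        { "name": "Cf-Cache-Status", "op": "ne", "value": "HIT" }
      ]
    }
  },
  "threshold": 10,
  "period": 60,
  "action": { "mode": "ban", "timeout": 600 }
}
```

As far as we can tell, the request side of that match only takes methods, schemes and a URL pattern, which is why we can’t find anywhere to put a User Agent condition.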
What we really need is not a Response Header check against our origin servers but a Request Header check, something similar to ClientRequestUserAgent Equals facebookcatalog/1.0.
Is there some workaround or method of rate limiting by a specific User Agent on the request, instead of checking the origin response? Something halfway between a User Agent block and a rate limiting rule?
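To make the ask concrete, the request-side condition we have in mind would look something like the following, written in the Firewall Rules expression language; what we can’t find is a way to attach a threshold (e.g. requests per minute per IP) to it rather than an outright block or challenge:

```
(http.user_agent eq "facebookcatalog/1.0")
```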
To add: we don’t want to impact all bot traffic with Super Bot Fight Mode. We also do not currently subscribe to Bot Management. Thanks.