At the time of writing, Cloudflare has successfully identified a hundred ‘good’ bots — these will never be presented with a JS challenge, but, naturally enough, you can block them with your own rules.
However, this is not the case if you use a third-party API, especially if it’s being provided from a cloud, which can have thousands (or millions…) of IP addresses that will legitimately attempt to connect to your system — and get promptly blocked by the JS challenge.
Bot Fight Mode, alas, at least for the free service, has the following fundamental characteristics:
You can either turn it on or off; there is no middle ground. Either you protect your whole domain, or you protect none of it.
Bot Fight Mode always takes precedence over your WAF rules. This means that if Bot Fight Mode is enabled, you can, at best, block some of the ‘good’ bots with additional rules. What you cannot do is allow traffic from certain well-known sources to bypass Bot Fight Mode: such bypasses are simply not part of its design.
In theory, based on my understanding of the explanations posted here, if you have access to the IP Access Rules, you might be able to place a rule there to allow some servers to go through, bypassing Bot Fight Mode:
Bot Fight Mode keeps blocking my Next.js app calls to the backend API:
That’s the theory, but… read on!
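If IP Access Rules do work for this, the manual step could in principle be scripted against Cloudflare’s v4 API, which exposes a zone-level IP Access Rules endpoint. A minimal sketch; the zone ID, token and address are placeholders, and the endpoint and field names reflect the v4 API as documented at the time of writing:

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"

def allow_rule_payload(ip: str, note: str) -> dict:
    """Build the request body for an IP Access Rule allowlisting one address."""
    return {
        "mode": "whitelist",  # 'Allow' in the dashboard UI
        "configuration": {"target": "ip", "value": ip},
        "notes": note,
    }

def create_allow_rule(zone_id: str, token: str, ip: str) -> dict:
    """POST the rule to the zone-level IP Access Rules endpoint."""
    req = urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/firewall/access_rules/rules",
        data=json.dumps(allow_rule_payload(ip, "third-party API client")).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A matching DELETE call would be needed to retire stale addresses — and, as the rest of this article argues, keeping such a list current is exactly where the approach falls apart.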
Of course, legitimate bots — those from legitimate crawling and search engines — should be allowed to access one’s websites without any kind of restriction in place. This is where things get tricky: how do you identify a bot as being legitimate? Obviously, you cannot rely upon the User-Agent header, which is easily forged.
From what I understand of Cloudflare’s system, their ‘benign bot’ validation mechanism relies on two principles. Firstly, for any ‘new’ bot beyond those that are well-known, Cloudflare will characterise its activity through pattern-matching with some sort of artificial intelligence. The idea is to capture the signature of a legitimate bot, and, thanks to having access to gazillions of data points — logs of bots coming in from the same range of IP addresses, with the same User-Agent header, retrieving content in a predictable way, and respecting the robots.txt file (if it exists) — I can imagine that it’s not too hard to determine precisely whether a certain bot is, in fact, what it seems to be, based on its behaviour. In a sense, what Cloudflare’s data-crunching AI is doing is to reverse-engineer the crawling algorithm used by a ‘benign bot’, thus being able to figure out if a certain request for data is, indeed, legitimate.
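One of those behavioural signals, respecting robots.txt, is easy to verify programmatically; a well-behaved crawler does essentially this check before every fetch. A minimal sketch using Python’s standard-library parser, with a made-up policy and URLs:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, as a crawler would have fetched it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt policy permits this agent to fetch this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

A bot that keeps requesting URLs under /private/ despite this answer being False is, behaviourally, not a ‘benign bot’ — which is presumably one of the signals Cloudflare’s classifier picks up on.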
Considering that the JS Challenge mechanism is built on similar assumptions, I can imagine that Bot Fight Mode uses something similar to automatically let legitimate bots through the firewall.
The second principle, of course, is to examine a freshly submitted bot by placing it in a restricted environment and figuring out whether it behaves in a way consistent with the algorithm of a ‘benign bot’. Some posts here on the community forums tend to imply that such requests for a new bot to be allowed through CF’s own firewall are often simply ignored; I’d claim, however, that each request may, indeed, be honoured, but it will require a ‘quarantine period’ during which CF runs its tests.
Because Cloudflare is so sure that the overall mechanism works, they don’t even consider a few exceptions — some of which are sadly quite frequent.
Providing APIs behind Cloudflare protection
Here is my use-case scenario: how to successfully provide a web services API on a single server behind the Cloudflare protections, when the connections to it can come from any IP address in the world, and not just a limited set?
So, I have several domains (almost all of them registered with the Cloudflare Registrar), each of which may have more than one server — usually many more — even if they all (possibly) point to the same physical machine. From the perspective of an external client, this is irrelevant: there are many different servers, each with its own FQDN. In fact, since clients will only see Cloudflare’s IP addresses — and possibly never gain access to the real server’s IP address — they will not even know which server their request is being reverse-proxied to, and that’s exactly how we want Cloudflare to work.
But clients can come from any possible number of IP addresses. Consider an API consumed by residential or mobile users — all of whom get IP addresses randomly assigned from a pool, which often may not even be known in advance. Especially if you wish your API to be accessible from everywhere, not just from one provider or even one country.
In the specific case where I stumbled upon the overeager Bot Fighting issue, things were even more complicated. The requests actually come from a virtual server out of a pool, managed via AWS. At any time, Amazon might switch the IP address where that particular instance is running — and from within the instance I might not be able to know what that instance’s specific IP address is, in advance of contacting my own server, hidden behind Cloudflare’s services. In essence, I’m making a request coming from a server that is virtually spread among a cloud, to a final destination which is actually a server under my control, but which, in turn, is also virtualised by Cloudflare’s cloud as well.
In such circumstances, the client never knows the IP address of the remote connection in advance; conversely, my physical server has no idea of the real IP address of the next request. It only knows what Amazon tells it to use. Although it’s possible for the client to make a quick local request to learn what IP address it’s running on, that address is useless from the perspective of someone setting up a WAF on my server: you will only see requests coming from Amazon’s cloud, not from the ‘real’ server. And because such servers are not real but virtual, created on demand, it’s not even guaranteed that any correlation between the current IP address (as reported by the local operating system) and the IP address currently assigned to a cloud instance will hold up in subsequent requests, since Amazon may assign a different IP address the next time the instance is launched.
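That ‘quick local request’ on AWS goes through the instance metadata service (IMDSv2): first obtain a short-lived session token, then read the public IPv4. A sketch of that lookup, with the caveat from above that the answer is of limited use for WAF purposes and may be stale by the next launch:

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"

def imds_request(path: str, token: str) -> urllib.request.Request:
    """Build an IMDSv2 GET request carrying the session token header."""
    return urllib.request.Request(
        f"{IMDS}/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )

def current_public_ip() -> str:
    """Fetch this instance's public IPv4 via the metadata service (IMDSv2)."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        method="PUT",
    )
    with urllib.request.urlopen(token_req, timeout=2) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(
        imds_request("meta-data/public-ipv4", token), timeout=2
    ) as resp:
        return resp.read().decode()
```

Note that `current_public_ip()` only works from inside an EC2 instance; everywhere else the metadata address is unreachable, which is itself a reminder of how ephemeral this information is.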
What this means is that one cannot even create a ‘dynamic’ configuration (via the Cloudflare API) where this correlation is somehow ‘translated’ into a well-formed WAF rule that can be added selectively, in a fully automatic way. Somewhere in this process, a human will have no choice but to intervene and manually change whatever rule is in place.
This can be done (to a degree) but it’s not deployable beyond testing purposes.
Different concepts, different views
The culprit, IMHO, is in the priorities given to the many layers of filtering. At the WAF level, you can specify several ways to filter requests and mark them as allowed — by examining the headers and checking for specific markers, including the name (or URI) of the server to be contacted. You can certainly create a rule to allow requests through to a single server.
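For instance, a custom rule in Cloudflare’s Rules language can match a single hostname (api.example.com and the path are placeholders) and let it through the ordinary WAF checks:

```
(http.host eq "api.example.com") and (http.request.uri.path contains "/api/")
```

With the action set to Allow (or Skip), this exempts that FQDN from the WAF rules below it — but, as the next paragraph explains, it still cannot exempt it from Bot Fight Mode.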
However, WAF rules cannot override the filtering done by the Bot Fight Mode. The rules for Bot Fight Mode will always override whatever the WAF rules say.
On the other hand, Bot Fight Mode is either turned on or off for the whole domain (misrepresented in the current documentation as the server’s name). There are a few more options (filter by some headers, for example, or add further headers to the request, etc.), but what you cannot do is restrict Bot Fight Mode to a single FQDN. It might be possible with Super Bot Fight Mode, but most definitely not with the ‘simple’ Bot Fight Mode.
This essentially means that if you have one web server providing API services to an unknown number of clients, coming from an unlimited pool of possible IP addresses, you basically have to turn Bot Fight Mode off for all web servers under that domain. There is no alternative — at least, as far as I can see and understand from the documentation.
A proposal to Cloudflare
As a consequence of the way the many rule systems work at the different layers, for such sites the only solution, for now, is to shut Bot Fight Mode off — which also deprives Cloudflare of data worth analysing (data on which future generations of its machine-learning models could be trained, so as to recognise legitimate traffic better).
There are a few solutions that come to mind:
- Introduce a quarantine mode, like what some complex email filtering systems do (e.g. SPF, DKIM, but also spam-fighting tools such as Rspamd, SpamAssassin…). This would allow Cloudflare’s machine-learning engine to benefit from being trained with the extra traffic from legitimate APIs providing web services, while at the same time not blocking such traffic at the domain level.
- Create a special rule at the WAF level that can override Bot Fight Mode. That would be the best solution, especially if it could be deployed alongside the first proposal: e.g. instruct the WAF to enter quarantine mode for some FQDNs, leave others in full active mode, and on some, turn it off. The complexity — and number — of rules allowed would depend on the user level (Free, Pro, Enterprise…).
An additional advantage of this approach would be the ability to check, via WAF rules, whether a request carries an expected header (known to exist for legitimate requests only); if not, it would be very easy to block any simple-minded bot checking for WordPress vulnerabilities, for example.
- Allow Bot Fight Mode to be activated per FQDN as opposed to the whole domain. That would work, too, although it’s worth noting that WAF rules are finer-grained and more useful.
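As an illustration of the second proposal: the ‘expected header’ check can already be written in today’s Rules language — only the ability to override Bot Fight Mode with it is missing. The hostname, header name and value here are all hypothetical:

```
(http.host eq "api.example.com") and not any(http.request.headers["x-client-token"][*] == "expected-value")
```

With the action set to Block, anything hitting the API host without the agreed-upon header — such as a simple-minded bot scanning for WordPress vulnerabilities — would be dropped before ever reaching the origin.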