Hello,
Currently I have the “Known Bots” allow rule active in Cloudflare WAF.
I also have another rule blocking Tor, Unknown States, China and Russia.
However, after checking the events for the allow known bots rule, I noticed that I want to block 99% of them, because they are useless and just waste bandwidth on my DO droplet.
So what is the best or most practical method to ALLOW Googlebot but block most other bots?
You can edit your Firewall Rule to add the Google bot user agent, so the logic would be something like:
If Known-Bots
AND
User Agent contains "Googlebot"
Then ALLOW
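In Cloudflare's expression editor, the condition above would look roughly like `(cf.client.bot and http.user_agent contains "Googlebot")` with the action set to Allow. To make the logic easy to reason about, here is a minimal Python sketch of the same condition. The `Request` class and the `allow_rule_matches` function are hypothetical names for illustration only, and `known_bot` merely stands in for Cloudflare's Known-Bots flag.

```python
from dataclasses import dataclass


@dataclass
class Request:
    known_bot: bool   # stands in for Cloudflare's Known-Bots flag (assumption)
    user_agent: str   # raw User-Agent header


def allow_rule_matches(req: Request) -> bool:
    """Known-Bots AND User Agent contains "Googlebot" (case-sensitive)."""
    return req.known_bot and "Googlebot" in req.user_agent


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(allow_rule_matches(Request(True, GOOGLEBOT_UA)))    # verified Googlebot
print(allow_rule_matches(Request(False, GOOGLEBOT_UA)))   # spoofed user agent
print(allow_rule_matches(Request(True, "SomeOtherBot")))  # other known bot
```

Requiring both conditions matters: the user agent alone is trivial to spoof, and the Known-Bots flag alone lets every other verified bot through.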
However, it’s my personal opinion that bots — good or bad — should not be allowed based on their identity. Good bots can also be made to perform bad requests, and if you allow them, no other Firewall Rule will stop the request.
Instead, you should add the Known-Bots operator, together with the UA restriction for Googlebot, only to the Firewall Rules that you think it’s OK for Googlebot to bypass.
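As a rough illustration of that idea, the Python sketch below models two blocking rules: a country rule that carries a Googlebot exception, and a probe rule that has none. Everything here is an assumption made for the example: the `Visit` class, the function names, the country codes (`T1` for Tor and `XX` for unknown are how I understand Cloudflare labels them) and the probe paths are not taken from your actual configuration.

```python
from dataclasses import dataclass

# Hypothetical values, for illustration only.
BLOCKED_COUNTRIES = {"CN", "RU", "T1", "XX"}
PROBE_PATHS = ("/.env", "/wp-login.php")


@dataclass
class Visit:
    country: str
    known_bot: bool   # stands in for Cloudflare's Known-Bots flag
    user_agent: str
    path: str


def is_verified_googlebot(v: Visit) -> bool:
    return v.known_bot and "Googlebot" in v.user_agent


def country_rule_blocks(v: Visit) -> bool:
    # Googlebot is exempted from this rule only.
    return v.country in BLOCKED_COUNTRIES and not is_verified_googlebot(v)


def probe_rule_blocks(v: Visit) -> bool:
    # No exemption: even a verified Googlebot is blocked here.
    return v.path.startswith(PROBE_PATHS)


def is_blocked(v: Visit) -> bool:
    return country_rule_blocks(v) or probe_rule_blocks(v)
```

With this layout a verified Googlebot crawling `/` from a blocked country gets through, while the same bot requesting `/.env` is still stopped, because the exception lives inside one rule rather than in a global Allow.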
Here’s an example of a recent visit by Googlebot to a website, where it was (rightfully) blocked. I don’t know for sure how Googlebot got that URL, but I suspect it was a hacker creating links to several websites to probe for vulnerabilities.
Well-behaved bots can always be asked to leave you alone using robots.txt. The Verified Bots Policy requires such bots to respect your policy. Just make sure you are not blocking bots from reading your robots.txt.
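If you go the robots.txt route, it can help to check the policy before deploying it. The sketch below uses Python's standard `urllib.robotparser` to confirm that a sample policy admits Googlebot and turns every other crawler away. The policy text, the `can_crawl` helper and the `ExampleBot` name are made up for illustration, so adapt them to your site.

```python
from urllib.robotparser import RobotFileParser

# Sample policy (assumption): Googlebot may crawl everything, all others nothing.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(agent: str, url: str) -> bool:
    """Return True if the sample policy lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(can_crawl("Googlebot", "/some/page"))   # allowed by the first group
print(can_crawl("ExampleBot", "/some/page"))  # falls under the catch-all group
```

Keep in mind that this only affects bots that choose to honor robots.txt. Anything that ignores it still has to be handled by the Firewall Rules.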