Firewall Priority Order

I am looking to block bots that imitate GoogleBot, but not actually block Google itself.

In the firewall tab, if I block the user agent but then allow Google's IP addresses, would this work?

Would the firewall ignore the user agent block because the IP is allowed, or would it still block it because of the UA block?

Why not make that a single rule? That should allow for the proper logic.


FYI, Cloudflare already does this (I think for all plans, but it might be Pro+ only if it’s part of the WAF)

$ curl -H 'user-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' https://judge.sh 


<h1 data-translate="block_headline">Sorry, you have been blocked</h1>

We do not have that feature on the Professional plan. I am working on this rule, but I may be having trouble detecting Google… Google references this, but I’m not sure whether it is the right approach: https://support.google.com/webmasters/answer/80553?hl=en

(http.user_agent contains "Googlebot" and not http.host contains "google.com" and not http.host contains "googlebot.com")

Turns out it’s enabled via firewall rules in the “Cloudflare Specials” group.

@jrichard Your solution checks the Host header, which is the hostname of your own site and not anything about the client. That support article recommends running DNS lookups to validate that the rDNS for an IP belongs to Googlebot, which is something Firewall rules don’t do. You’ll need to use the WAF rule above to block fake Googlebots.
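
For anyone who wants to run the rDNS check from that support article themselves (say, on the origin server), here is a rough Python sketch. It is not something Cloudflare runs for you; the function names are made up for illustration, and I am assuming the hostname suffixes googlebot.com and google.com. It does a reverse lookup on the IP, checks the hostname suffix, then does a forward lookup and confirms it resolves back to the same IP.

```python
import socket

# Assumed hostname suffixes for genuine Googlebot reverse DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the hostname sits under one of the assumed Google domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-then-forward DNS check for a client IP.

    The lookup callables are hypothetical hooks so the logic can be
    exercised without network access; by default they use the socket
    module (IPv4 only for the forward lookup).
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse DNS, IP -> hostname.
    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the hostname must belong to a Google domain.
    if not is_google_hostname(hostname):
        return False

    # Step 3: forward DNS, hostname -> IPs, which must include the original IP.
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

Calling is_real_googlebot(client_ip) with the defaults performs live DNS queries, so you would probably want to cache the results and only run the check for requests whose user agent claims to be Googlebot.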