Block expression: (cf.client.bot and cf.verified_bot_category eq "AI Crawler")
The GPTBot user agent is showing up in my origin logs despite being blocked by the above rule. For now, the rule blocks all known bots in the AI Crawler category.
I checked the bot’s IP in my origin logs, and it isn’t a genuine GPTBot address. As with any fake search engine bot, a fake GPTBot UA lets the visitor find out whether the site blocks legitimate crawlers by IP: the fake version gets a 200 status code, as it did on my site.
The fix can easily be adding another rule to block requests with GPTBot in the UA, but that feels clunky to me, as does needing multiple rules to deal with the same UA string.
Question: Surely a declared UA without the right “known bot” IP address is likely a bad bot, yet Bot Fight Mode isn’t catching it. Can I fix that with Super Bot Fight Mode?
Any suggestions on dealing with fake GPT bots beyond what I’ve mentioned above? Would Super Bot Fight Mode catch these? I’m still on the fence about AI bots, but I have had no traffic at all from OpenAI’s efforts, just more direct competition and higher bandwidth numbers, so… help?
Also: I assume the rule blocks Google Bard crawling, which is likely to tank my rankings in Google SGE results? Do people use Bard to dissect pages the way they do with OpenAI’s tools?
I know this is a different question, but I’d like to consider it while fixing the original issue.
You can shorten this expression to:
cf.verified_bot_category eq "AI Crawler"
Every verified bot of category X is already a verified bot.
I don’t believe GPTBot is matched by the above rule. While the documentation would certainly indicate that it is, Cloudflare Radar actually lists GPTBot in the category “other”.
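If you would rather keep everything in a single rule instead of adding a second one, an expression along these lines might work. Treat it as a sketch rather than a tested rule: it assumes you want to block anything that claims to be GPTBot, verified or not, and it uses lower() because the contains operator is case-sensitive.

```
(cf.verified_bot_category eq "AI Crawler") or (lower(http.user_agent) contains "gptbot")
```

If you only want to stop spoofed requests and let the real crawler through, the second clause could instead be (lower(http.user_agent) contains "gptbot" and not cf.client.bot). That variant assumes Cloudflare treats the genuine GPTBot as a verified bot, so check the security events log before relying on it.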
Bot Fight Mode does not block; it issues a JS challenge. If you find that a bad bot has reached your origin, you can check the Cloudflare security events log to see whether it was issued a challenge. Also make sure that the bot reached your server through Cloudflare and not directly.
Possibly. Super Bot Fight Mode lets you Allow/Block/Challenge “Definitely automated” and “Likely automated” bots. It is the superior choice if you want some control over which bots can reach your server.
A Pro subscription is not terribly expensive, so you could just give it a try.
Thanks for that response, Laudian. Very much appreciated, informative, and right on!