Best Bot Firewall Setting

Rule 1
This rule blocks bot traffic whose user agent contains the strings “crawl,” “bot,” or “spider,” plus a few specific custom user agents.
(http.user_agent contains "Yandex") or (http.user_agent contains "muckrack") or (http.user_agent contains "Qwantify") or (http.user_agent contains "Sogou") or (http.user_agent contains "BUbiNG") or (http.user_agent contains "knowledge") or (http.user_agent contains "CFNetwork") or (http.user_agent contains "Scrapy") or (http.user_agent contains "SemrushBot") or (http.user_agent contains "AhrefsBot") or (http.user_agent contains "Baiduspider") or (http.user_agent contains "python-requests") or (http.user_agent contains "crawl" and not cf.client.bot) or (http.user_agent contains "Crawl" and not cf.client.bot) or (http.user_agent contains "bot" and not http.user_agent contains "bingbot" and not http.user_agent contains "Google" and not http.user_agent contains "Twitter" and not cf.client.bot) or (http.user_agent contains "Bot" and not http.user_agent contains "Google" and not cf.client.bot) or (http.user_agent contains "Spider" and not cf.client.bot) or (http.user_agent contains "spider" and not cf.client.bot)

========================================================================

Rule 2
Random bots
(http.user_agent contains "360Spider") or (http.user_agent contains "acapbot") or (http.user_agent contains "acoonbot") or (http.user_agent contains "ahrefs") or (http.user_agent contains "alexibot") or (http.user_agent contains "attackbot") or (http.user_agent contains "backdorbot") or (http.user_agent contains "becomebot") or (http.user_agent contains "blackwidow") or (http.user_agent contains "blekkobot") or (http.user_agent contains "blowfish") or (http.user_agent contains "bullseye") or (http.user_agent contains "bunnys") or (http.user_agent contains "butterfly") or (http.user_agent contains "careerbot") or (http.user_agent contains "casper") or (http.user_agent contains "checkpriv") or (http.user_agent contains "cheesebot") or (http.user_agent contains "chinaclaw") or (http.user_agent contains "choppy") or (http.user_agent contains "cmsworld") or (http.user_agent contains "copyrightcheck") or (http.user_agent contains "datacha") or (http.user_agent contains "demon") or (http.user_agent contains "discobot") or (http.user_agent contains "dotbot") or (http.user_agent contains "dotnetdotcom") or (http.user_agent contains "dumbot") or (http.user_agent contains "emailcollector") or (http.user_agent contains "emailsiphon") or (http.user_agent contains "emailwolf") or (http.user_agent contains "exabot") or (http.user_agent contains "extract") or (http.user_agent contains "eyenetie") or (http.user_agent contains "feedfinder") or (http.user_agent contains "flaming") or (http.user_agent contains "foobot") or (http.user_agent contains "g00g1e") or (http.user_agent contains "gigabot") or (http.user_agent contains "go-ahead-got") or (http.user_agent contains "gozilla") or (http.user_agent contains "grabnet") or (http.user_agent contains "harvest") or (http.user_agent contains "httrack") or (http.user_agent contains "jetbot") or (http.user_agent contains "jikespider") or (http.user_agent contains "kmccrew") or (http.user_agent eq "leechftp") or (http.user_agent contains "linkextractor") or (http.user_agent contains "linkscan") or (http.user_agent contains "linkwalker") or (http.user_agent contains "loader") or (http.user_agent contains "masscan") or (http.user_agent contains "miner") or (http.user_agent contains "majestic") or (http.user_agent contains "mechanize") or (http.user_agent contains "netmechanic") or (http.user_agent contains "netspider") or (http.user_agent contains "ninja") or (http.user_agent contains "octopus") or (http.user_agent contains "pagegrabber") or (http.user_agent contains "planetwork") or (http.user_agent contains "postrank") or (http.user_agent contains "proximic") or (http.user_agent contains "purebot") or (http.user_agent contains "pycurl") or (http.user_agent contains "python") or (http.user_agent contains "queryn") or (http.user_agent contains "queryseeker") or (http.user_agent contains "radiation") or (http.user_agent contains "realdownload") or (http.user_agent contains "rogerbot") or (http.user_agent contains "scooter") or (http.user_agent contains "seekerspider") or (http.user_agent contains "siclab") or (http.user_agent contains "sindice") or (http.user_agent contains "sitebot") or (http.user_agent contains "siteexplorer") or (http.user_agent contains "sitesnagger") or (http.user_agent contains "smartdownload") or (http.user_agent contains "sosospider") or (http.user_agent contains "spankbot") or (http.user_agent contains "spbot") or (http.user_agent contains "sqlmap") or (http.user_agent contains "stackrambler") or (http.user_agent contains "stripper") or (http.user_agent contains "sucker") or (http.user_agent contains "suzukacz") or (http.user_agent contains "suzuran") or (http.user_agent contains "teleport") or (http.user_agent contains "telesoft") or (http.user_agent contains "true_robots") or (http.user_agent contains "turingos") or (http.user_agent contains "vampire") or (http.user_agent contains "webwhacker") or (http.user_agent contains "woxbot") or (http.user_agent contains "xaldon") or (http.user_agent contains "yamanalab") or (http.user_agent contains "zmeu")

Which one do you think is better to have? I believe the 1st rule is better.

From my perspective, neither. They are not very well optimized :thinking:

You have multiple entries containing “spider”. You could save characters by collapsing all of them into a single contains “spider” check :wink:
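As a rough sketch of that consolidation (assuming the not cf.client.bot exemption from Rule 1 should apply here as well), two case-sensitive clauses would already cover “360Spider”, “jikespider”, “netspider”, “seekerspider” and “sosospider” from Rule 2:

```
(http.user_agent contains "spider" and not cf.client.bot) or (http.user_agent contains "Spider" and not cf.client.bot)
```

contains is a case-sensitive substring match, which is why both spellings appear.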

  • Each Firewall Rule can contain up to 4096 characters as far as I know (at least that is what I could use on a Free plan), and you get 5 Firewall Rules, so make sure it’s an optimized …

If you want to allow only Googlebot and Bingbot, remove all the entries that have “bot” in their name and change the check to something like contains “bot” and not “Googlebot”, or contains “bot” and not “bingbot” :wink:
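A minimal sketch of that idea, assuming Googlebot and Bingbot are the only crawlers you want to exempt. When both exemptions live in the same rule they have to be chained with and; joining the two checks with or would match Googlebot again through the Bingbot branch:

```
(http.user_agent contains "bot" and not http.user_agent contains "Googlebot" and not http.user_agent contains "bingbot")
```

User agent strings can be spoofed, so pairing this with cf.client.bot, as Rule 1 does, is probably the safer variant.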

Helpful ones already posted here:


Thanks

Can you show us the setup rule for this?

Would it be something like this?

Or would you use a rule here?

If it’s a Verified Bot, AND it’s one of these bots, then Allow.

  1. ALLOW (cf.client.bot and (http.user_agent contains "UptimeRobot" or http.user_agent contains "DuckDuckBot" or http.user_agent contains "Googlebot" or http.user_agent contains "bingbot"))

And any other Verified Bots get blocked.

  2. BLOCK (cf.client.bot)
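For comparison, here is a sketch of roughly the same policy as a single rule with the Block action. It assumes nothing else relies on the separate Allow rule; an Allow action also exempts the request from later firewall rules, which this version does not do:

```
(cf.client.bot and not (http.user_agent contains "UptimeRobot" or http.user_agent contains "DuckDuckBot" or http.user_agent contains "Googlebot" or http.user_agent contains "bingbot"))
```

This only decides what happens to Verified Bots. Unverified traffic still needs its own rule.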

Will this work?

Most of this has already been covered, either in the threads @fritex linked to or in this one, which is very similar to yours:

Thank you. Would this work?

@PC_Repair
Yes, it will work, but use bingbot, not Bingbot.
I also added lower to match both bot and Bot, as @sdayman said, and it worked.
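For reference, the lower() approach could look like the sketch below. lower() converts the field to lowercase before the comparison, so every string it is compared against has to be written in lowercase too. The exemption list here is an assumption for illustration, not the exact rule that was tested:

```
(lower(http.user_agent) contains "bot" and not lower(http.user_agent) contains "googlebot" and not lower(http.user_agent) contains "bingbot" and not cf.client.bot)
```

This replaces the separate “bot” and “Bot” clauses from Rule 1 with one clause.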


I apologize, I’m not getting what you are saying.

Is this what I should have?

@PC_Repair
No, you can’t add bot / Bot like that.
If you don’t know how to use expressions, just add another rule at the end with Bot.


@PC_Repair you might need a paid plan (at least Business) to use the regular expression feature in Firewall Rules.
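On a plan that supports the matches operator, the bot / Bot pair could in principle be folded into one regular expression. This is a sketch only, untested here, and it assumes you want the same cf.client.bot exemption that Rule 1 uses:

```
(http.user_agent matches "[Bb]ot" and not cf.client.bot)
```

On plans without regular expression support, lower() or a second clause with “Bot” gets the same result.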

Source:

Helpful article:
