Add 7G firewall list to WAF

Hi there.

I was hoping someone could show me how to add this 7G firewall list to the WAF:

(360Spider|acapbot|acoonbot|ahrefs|alexibot|asterias|attackbot|backdorbot|becomebot|binlar|blackwidow|blekkobot|blexbot|blowfish|bullseye|bunnys|butterfly|careerbot|casper|checkpriv|cheesebot|cherrypick|chinaclaw|choppy|clshttp|cmsworld|copernic|copyrightcheck|cosmos|crescent|cy_cho|datacha|demon|diavol|discobot|dittospyder|dotbot|dotnetdotcom|dumbot|emailcollector|emailsiphon|emailwolf|exabot|extract|eyenetie|feedfinder|flaming|…

Keep in mind that I use RSS for Google instant indexing of my news content. I read somewhere that this list can also block some RSS.

I also need to allow third-party SEO tools (ahrefs, moz, sem). Will this firewall list block them too?

I found some docs, but nothing on exactly how to set it up correctly.

Greetings,

Thank you for asking.

If you are using Cloudflare with a proxied :orange: DNS hostname (domain.com, www.domain.com …), this can be achieved by creating a Firewall Rule in the Cloudflare dashboard that blocks all requests to your domain/website where the User-Agent contains one of the strings you want to block.

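Furthermore, we might have to modify the expression a bit, as Regular Expressions can only be used on a higher paid plan. The list above is written as a regular expression, so on other plans each entry has to be rewritten as a separate "contains" condition.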

We are also limited to 4096 characters per Firewall Rule. Therefore, we might have to split the list into two or more rules; we'd have to see.
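To avoid typing every condition by hand, the conversion and the splitting can be scripted. Below is a minimal sketch in Python, assuming the list is pasted as a pipe-separated string like the one in the question; the function name `build_rule_expressions` and the constant `MAX_RULE_CHARS` are made up for this example and are not part of any Cloudflare tooling. It only prints text that you would then paste into the expression editor.

```python
MAX_RULE_CHARS = 4096  # per-rule limit mentioned in this thread


def build_rule_expressions(pattern, max_chars=MAX_RULE_CHARS):
    """Turn a pipe-separated user-agent list into one or more rule expressions.

    Each returned string stays within max_chars, so it fits in one rule.
    """
    # Drop the surrounding parentheses and split on the "|" separators.
    tokens = [t.strip() for t in pattern.strip("()").split("|") if t.strip()]
    expressions, current = [], ""
    for token in tokens:
        clause = f'(http.user_agent contains "{token}")'
        candidate = clause if not current else f"{current} or {clause}"
        if len(candidate) > max_chars and current:
            # The current rule is full: store it and start a new one.
            expressions.append(current)
            current = clause
        else:
            current = candidate
    if current:
        expressions.append(current)
    return expressions


if __name__ == "__main__":
    # Only the first few entries of the list are used here as a sample.
    sample = "(360Spider|acapbot|acoonbot|ahrefs|alexibot)"
    for number, expression in enumerate(build_rule_expressions(sample), start=1):
        print(f"Rule {number} ({len(expression)} characters):")
        print(expression)
```

Each printed expression would go into its own Firewall Rule with the action "Block". The sketch assumes that no entry contains a double quote, which holds for the part of the list shown in the question.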

May I point you to the step-by-step instructions, linked below, on how to manage Firewall Rules in the Cloudflare dashboard (the referenced page includes pictures for better understanding and navigation):

Example of a Firewall Rule that blocks requests to your website where the User-Agent contains a given string, or part of it, shown in the picture:

Example expression (not full):
(http.user_agent contains "360Spider") or (http.user_agent contains "acapbot") or (http.user_agent contains "acoonbot") or (http.user_agent contains "ahrefs") or (http.user_agent contains "alexibot") or (http.user_agent contains "asterias") or (http.user_agent contains "attackbot") or (http.user_agent contains "backdorbot") or (http.user_agent contains "becomebot") or (http.user_agent contains "binlar") or (http.user_agent contains "blackwidow") or (http.user_agent contains "blekkobot") or (http.user_agent contains "blexbot") or (http.user_agent contains "blowfish") or (http.user_agent contains "bullseye") or (http.user_agent contains "bunnys") or (http.user_agent contains "butterfly") or (http.user_agent contains "careerbot") or (http.user_agent contains "casper") or (http.user_agent contains "checkpriv") or (http.user_agent contains "cheesebot") or (http.user_agent contains "cherrypick") or (http.user_agent contains "chinaclaw") or (http.user_agent contains "choppy")

Make sure this rule is the first one at the top of the Firewall Rules list.

That said, good examples already exist. Consider blocking some of the known “bad user-agents”, “crawlers” or “bad ASNs” using the posts below:

Yes, the Firewall Rule above would block them.

However, we can create another rule with the action “Allow” and place it above the Firewall Rule that blocks the bad bots, or simply remove those tools from the “blocking” Firewall Rule.
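For the second option (removing the tools from the blocking rule), here is a small sketch in Python. The helper names `remove_allowed` and `block_expression` are invented for this example, and the allow-list holds only "ahrefs" because that is the only SEO tool visible in the part of the list quoted in the question; add other names only after checking that they really appear in your copy of the list.

```python
def remove_allowed(tokens, allowed):
    """Drop tokens that match an allow-listed name, ignoring case."""
    allowed_lower = {name.lower() for name in allowed}
    return [t for t in tokens if t.lower() not in allowed_lower]


def block_expression(tokens):
    """Join the remaining tokens into a single 'contains' expression."""
    return " or ".join(f'(http.user_agent contains "{t}")' for t in tokens)


if __name__ == "__main__":
    # First entries of the list quoted in the question; the rest is omitted.
    tokens = ["360Spider", "acapbot", "acoonbot", "ahrefs", "alexibot"]
    # Hypothetical allow-list for this example.
    keep_allowed = ["ahrefs"]
    print(block_expression(remove_allowed(tokens, keep_allowed)))
```

The printed expression no longer has a condition for "ahrefs", so the blocking rule would leave that tool alone.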

As for “good bots”, there is a list of known bots that we can allow to bypass the blocking rule by using the cf.client.bot field, as described in the article below:

So can I just copy this into the expression section and select Block, without typing all of this in manually?

I am a little confused about which one to use here.

Which one can I copy into the expression that won't block ahrefs, moz, etc.?

I would also like to block RSS scrapers, but since I use RSS for Google News and Bing News instant indexing, I can't block those fetchers either.

Sorry, I'm not a tech engineer or programmer, just a tech journalist trying to protect my asset. Thanks.

Thank you for asking.

Yes, you can.
Add the others individually as needed.
Or, if you need help creating it, would you like me or someone else to build those Firewall Rules from the mentioned 7G Firewall list, split into multiple rules (as there might be a lot of bots), post them here, and explain what to click and where, so you can be sure? :thinking:

If you find the tools or feed fetchers you want to keep in the expression, kindly remove them before using “Block” as the action for the RSS scrapers and bad bots in that one Firewall Rule.

I’d even suggest splitting them: one rule listing the “good” ones with the action “Allow”, then a second rule listing the “bad” ones with the action “Block”.
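As a rough illustration of that two-rule layout, here is a Python sketch that prints both expressions in the order they should appear in the rules list. The function names `contains_any` and `plan_rules` and the sample tokens are assumptions made for this example; the only fields used are http.user_agent and cf.client.bot, both mentioned earlier in this thread.

```python
def contains_any(tokens):
    """Build an 'or' chain of user-agent 'contains' conditions."""
    return " or ".join(f'(http.user_agent contains "{t}")' for t in tokens)


def plan_rules(good_tokens, bad_tokens):
    """Return rules in priority order: the Allow rule first, the Block rule second."""
    # Known good bots are matched through the cf.client.bot field.
    good_parts = ["(cf.client.bot)"]
    if good_tokens:
        good_parts.append(contains_any(good_tokens))
    # Anything explicitly allowed is kept out of the Block rule.
    bad_only = [t for t in bad_tokens if t not in good_tokens]
    return [
        {"action": "allow", "expression": " or ".join(good_parts)},
        {"action": "block", "expression": contains_any(bad_only)},
    ]


if __name__ == "__main__":
    # Sample tokens taken from the start of the list in the question.
    rules = plan_rules(["ahrefs"], ["360Spider", "acapbot", "ahrefs", "alexibot"])
    for position, rule in enumerate(rules, start=1):
        print(f"{position}. action={rule['action']}")
        print(f"   {rule['expression']}")
```

Keep in mind that a User-Agent string can be imitated by any client, so an “Allow” rule based only on the User-Agent also lets through anything that pretends to be that tool.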

I am not seeing these in the expression, unless I am missing something or not reading it right. Thanks very much for the information provided.
