Super Bot Fight Mode not blocking bots that are not on the verified list

I have been watching my local (live) firewall for some time now (Wordfence) and I am still getting many bots coming to my site that are not on the approved list.

It seems that Wordfence is logging these. However, it was my understanding that any bots not on the Cloudflare verified list should be getting blocked by Super Bot Fight Mode.

Example of an unverified bot landing on my site:

8/6/2022 12:58:20 PM (14 minutes ago)

IP: 188.138.1.225 Hostname: atlantic389.dedicatedpanel.com

ias-ir/3.1 (+https://www.admantx.com/service-fetcher.html)

I am not sure why Cloudflare is not stopping these. I have Super Bot Fight Mode turned on and a paid Cloudflare account.

Could it be I have not set something up correctly?

I am also seeing possibly malicious spidering aimed at odd URL strings on my website, which Wordfence is blocking for me. But as far as I am aware, I should not be seeing these local logs at all, because Cloudflare should be blocking these requests automatically; I have all my security settings turned up quite high on Cloudflare.


I have Wordfence Premium and am also running a 7G firewall locally.

And the unverified bots keep on coming (why is Cloudflare not blocking these for me?).

If I may, I would kindly like to share the article below related to Super Bot Fight Mode, including a suggestion to allow the IP :thinking:

While using the Wordfence plugin, don't forget to select the “CF-Connecting-IP” option in the settings (Use the Cloudflare “CF-Connecting-IP” HTTP header). Don't forget to save to apply the changes.

This bot might be on the verified bot list here:

Which Cloudflare Plan are you using?

Using a Firewall Rule, you can block the ones which aren't “good” and verified, if so :thinking: You can also block quite a bit more with Firewall Rules.

What score do you see in Security → WAF → Overview for those requests? :thinking:
You can try manually blocking them, if so.
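As a starting point (a minimal sketch, not the only way to do it), a rule targeting the crawler from the log excerpt above could match on its user agent string while still exempting anything Cloudflare verifies:

(http.user_agent contains "ias-ir" and not cf.client.bot)

Set the action to Block, or Managed Challenge if you would rather be gentler. The "ias-ir" string is taken from the log line posted above; adjust it to whatever actually shows in your own logs.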

A verified bot is not necessarily good or bad, meaning Cloudflare might not have to block the ias-xx bots if they aren't doing any harm :thinking:

Personally, learn to use the rules pages and you will stop quite a bit more than Wordfence does. Bot Fight Mode only samples traffic, I think, plus it breaks so many good things; just do some forum searches, it's very frowned upon unless you upgrade to Enterprise.

Here is just a fraction of my rules as an example. If anyone copies these, they may break your site, so be sure to test, but more or less they will work. It's called common sense and log review, not apply-and-forget.

(http.request.uri.path contains ".aspx") or (http.request.uri.path contains ".jsp") or (http.request.uri.path contains ".zip") or (http.request.uri.path contains ".rar") or (http.request.uri.path contains ".mysql") or (http.request.uri.path contains ".pdf") or (http.request.uri.path contains ".exe") or (http.request.uri.path contains ".asa") or (http.request.uri.path contains ".ascx") or (http.request.uri.path contains ".axd") or (http.request.uri.path contains ".backup") or (http.request.uri.path contains ".bak") or (http.request.uri.path contains ".bat") or (http.request.uri.path contains ".cmd") or (http.request.uri.path contains ".csproj") or (http.request.uri.path contains ".dat") or (http.request.uri.path contains ".dbf") or (http.request.uri.path contains ".dll") or (http.request.uri.path contains ".ini") or (http.request.uri.path contains ".shtml") or (http.request.uri.path contains ".php") or (http.request.uri.path contains ".sql") or (http.request.uri.path contains ".7z") or (http.request.uri.path contains ".arj") or (http.request.uri.path contains ".gz") or (http.request.uri.path contains ".tbz") or (http.request.uri.path contains ".debug") or (http.request.uri.path contains ".txt" and not http.request.uri.path contains "ads.txt" and not http.request.uri.path contains "robots.txt") or (http.request.uri.path contains ".tar.gz")

Example: block any unused request methods.

(http.request.method in {"PUT" "OPTIONS" "DELETE" "PATCH" "PURGE"})

Why not challenge old Firefox user agents that scrapers forget to update?

(http.user_agent contains "Firefox/2" and not cf.client.bot) or (http.user_agent contains "Firefox/3" and not cf.client.bot) or (http.user_agent contains "Firefox/4" and not cf.client.bot) or (http.user_agent contains "Firefox/5" and not cf.client.bot) or (http.user_agent contains "Firefox/6" and not cf.client.bot) or (http.user_agent contains "Firefox/7" and not cf.client.bot and not ip.geoip.asnum in {54183}) or (http.user_agent contains "Firefox/8" and not cf.client.bot)

Do not add 9: "Firefox/9" would also match the Firefox 9x long-term ESR releases used on Linux and in enterprise, while mainstream builds are in the 100s.

Same for Chrome:

(http.user_agent contains "Chrome/2" and not cf.client.bot) or (http.user_agent contains "Chrome/3" and not cf.client.bot and not http.user_agent contains "[Hybrid Advertising]") or (http.user_agent contains "Chrome/4" and not cf.client.bot) or (http.user_agent contains "Chrome/5" and not cf.client.bot) or (http.user_agent contains "Chrome/6" and not cf.client.bot) or (http.user_agent contains "Chrome/7" and not cf.client.bot) or (http.user_agent contains "Chrome/8" and not cf.client.bot) or (http.user_agent contains "Chrome/9" and not cf.client.bot)

Outright block commonly abused user agents; user-agent listing websites and your own logs are a treat for building these.

(http.user_agent contains "wp_is_mobile") or (http.user_agent contains "Siege") or (http.user_agent contains "Nmap") or (http.user_agent contains "nmap") or (http.user_agent contains "localhost" and not cf.client.bot) or (http.user_agent contains "Office" and not cf.client.bot) or (http.user_agent contains "office" and not cf.client.bot) or (http.user_agent contains "Word") or (http.user_agent contains "word") or (http.user_agent contains "Scrapy") or (http.user_agent contains "scrapy") or (http.user_agent contains "Ruby") or (http.user_agent contains "ruby") or (http.user_agent contains "HTTrack") or (http.user_agent contains "python" and not cf.client.bot) or (http.user_agent contains "Python" and not cf.client.bot) or (http.user_agent contains "Go-http" and not cf.client.bot) or (http.user_agent contains "Apache" and not cf.client.bot) or (http.user_agent contains "apache" and not cf.client.bot) or (http.user_agent contains "curl") or (http.user_agent contains "wget") or (http.user_agent contains "Wget") or (http.user_agent contains "Curl") or (http.user_agent contains "CURL") or (http.user_agent contains "WGET") or (http.user_agent contains "simplepie") or (http.user_agent contains "SimplePie") or (http.user_agent contains "newspaper") or (http.user_agent contains "Statically") or (http.user_agent contains "statically") or (http.user_agent contains "https://statically.io/screenshot/") or (http.user_agent contains "BlackWidow") or (http.user_agent contains "AvantGo") or (http.user_agent contains "RSS") or (http.user_agent contains "rss") or (http.user_agent contains "feed") or (http.user_agent contains "Feed") or (http.user_agent contains "FEED") or (http.user_agent contains "Rss") or (http.user_agent contains "News") or (http.user_agent contains "news") or (http.user_agent contains "cURL") or (http.user_agent contains "Snoopy") or (http.user_agent contains "urlgrabber") or (http.user_agent contains "Zend") or
(http.user_agent contains "PHP") or (http.user_agent contains "read") or (http.user_agent contains "Read") or (http.user_agent contains "READ") or (http.user_agent contains "Akregator") or (http.user_agent contains "Liferea") or (http.user_agent contains "Reeder") or (http.user_agent contains "Snarfer") or (http.user_agent contains "Winds") or (http.user_agent contains "TMM Crawler") or (http.user_agent contains "MauiBot") or (http.user_agent contains "Monibot") or (http.user_agent contains "DnBCrawler") or (http.user_agent contains "Headless") or (http.user_agent contains "HEADLESS") or (http.user_agent contains "headless") or (http.user_agent contains "Diffbot")

I have so many more rules; literally, I eat most of these for breakfast. Yes, I can still be hacked, but with a little bit of research and log review, can I stop a massive amount of scrapers, malicious scanners and useless third-party services wasting my bandwidth using a few rules on Cloudflare's rules page? Yes.

Also, using CleanTalk and reviewing ASN ranges is a big one too: challenge or block, then allow certain passthrough if required. Gee-whiz, it's fun watching the bots hit the wall.
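For anyone wanting the shape of an ASN rule, it is the same idea as the user-agent ones, using the ip.geoip.asnum field. The ASNs below are placeholders from the documentation-reserved range (64496–64511), so this exact rule will never match real traffic; substitute the hosting ASNs you find in your own log review:

(ip.geoip.asnum in {64496 64511} and not cf.client.bot)

Pair it with a Managed Challenge first, watch the logs for false positives, then escalate to Block once you are confident.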

If anyone uses the few samples I posted, remember to review and trim them, as what works on my site may cause some unwanted blocking/challenging on yours. Common sense and reviewing the logs; more than likely you will add more to these anyway.