Bytespider is bypassing CF despite rules being applied. What’s also interesting is that the analytics show no activity, yet the host, Kinsta, says the site was indeed accessed by Bytespider.
What steps have you taken to resolve the issue?
Reached out to support to confirm it was Bytespider. It was indeed.
I’d suggest replacing the “equals” operator with “contains” in your Custom Firewall Rule. The “user-agent” string might hold more than just “Bytespider”, since the bot imitates a real user with a real Web browser (even a headless version supporting JS, etc.). For example:
Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.1511.1269 Mobile Safari/537.36; Bytespider
Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.7997.1233 Mobile Safari/537.36; Bytespider
Furthermore, I don’t know enough to say whether it’s actually doing any harm (beyond “only” crawling, scraping, etc., which is itself a “grey” area) that would lead Cloudflare’s ML to block or challenge it by default.
Hey Fritex, thanks. My other concern was why was there no activity showing? I’ve had this rule in place for months and it’s always been set to zero. Thanks
My 2 cents: the user-agent string contained more than just “Bytespider”, so the Custom Firewall Rule with “equals” couldn’t catch it. That rule only matches an HTTP request from a tool, device, or bot whose user-agent is exactly “Bytespider”, with no other parts or characters.
I could be wrong, but in my experience this is usually the key to blocking them and to setting up more accurate Firewall Rules for better security and protection.
Once you successfully catch it, you can see the IPs the requests come from and what they are trying to dig out of your Website. The next step would be to determine the ASNs from those IPs and block them entirely, which would cut the traffic coming from them to your Website.
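As a rough sketch of that follow-up step, assuming you have exported the matched requests as (IP, user-agent) pairs, you could tally the hits per IP like this. The sample entries are made up (the IPs are from reserved documentation ranges), the helper name is hypothetical, and mapping each IP to its ASN is left out because it needs an external data source.

```python
from collections import Counter

# Hypothetical sample of logged requests as (client IP, user-agent) pairs.
# The IPs come from reserved documentation ranges, not from real traffic.
REQUESTS = [
    ("192.0.2.10", "Mozilla/5.0 (Linux; Android 8.0) Mobile Safari/537.36; Bytespider"),
    ("192.0.2.10", "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0) Mobile Safari/537.36; Bytespider"),
    ("198.51.100.7", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"),
    ("203.0.113.5", "Mozilla/5.0 (Linux; Android 8.0) Mobile Safari/537.36; Bytespider"),
]


def count_ips(requests, needle="bytespider"):
    """Count requests per IP whose user-agent contains `needle` (case-insensitive)."""
    return Counter(ip for ip, ua in requests if needle in ua.lower())


# Print the busiest IPs first.
for ip, hits in count_ips(REQUESTS).most_common():
    print(ip, hits)
```

The IPs with the most hits are the ones worth looking up first when deciding which ASNs to block.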
Some articles which I’ve found in the meantime after writing a reply:
Not sure I follow. What do you mean by the name string containing more than just bytespider? All that’s in there is bytespider for its rule, bytedance for its rule, and so on. That’s the proper way to block multiple bots at the same time without having to write 100 rules.
Look at the user-agent example below. It does contain Bytespider (you can see it at the end). When a request with this user-agent visits your website, a Firewall Rule using “equals” instead of “contains” will never catch it, so it always passes through. As a result, you’d never see any “bytespider” being blocked or challenged by your Firewall Rule with the “equals” operator:
Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.1511.1269 Mobile Safari/537.36; Bytespider
So, the HTTP request coming to your Website passes the rule.
If you change it to “contains”, then any request with “bytespider” anywhere in such a long user-agent string will be blocked, as we want.
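To make the difference concrete, here is a small Python sketch of the two comparisons. It only mimics the logic of an “equals” and a “contains” check on a string; it is not Cloudflare’s rule engine, and the function names are made up for illustration.

```python
# User-agent string taken from the example above.
USER_AGENT = (
    "Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.1511.1269 "
    "Mobile Safari/537.36; Bytespider"
)


def matches_equals(user_agent: str, value: str) -> bool:
    """Mimics an "equals" rule: the whole string must be identical."""
    return user_agent == value


def matches_contains(user_agent: str, value: str) -> bool:
    """Mimics a "contains" rule: the value may appear anywhere in the string."""
    return value in user_agent


print(matches_equals(USER_AGENT, "Bytespider"))    # False
print(matches_contains(USER_AGENT, "Bytespider"))  # True
```

The “equals” check fails because the full string is much longer than “Bytespider”, while the “contains” check finds the name at the end.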
Correct, just try using “contains” as the operator.
In my example, I catch a lot of bots with “contains”, so a lot of combinations are covered.
Otherwise, I’d have to know all of the combinations and full user-agent names as they exist now or will be created in the future.
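The same idea as a short Python sketch: one list of substrings checked with “contains” logic covers many full user-agent strings. The list holds only the two names mentioned in this thread, the helper name is made up, and the lower-casing is my own assumption so that “bytespider” and “Bytespider” both match.

```python
# Substrings to look for; only the two names mentioned in this thread.
BLOCKED_SUBSTRINGS = ["bytespider", "bytedance"]


def is_blocked(user_agent: str, substrings=BLOCKED_SUBSTRINGS) -> bool:
    """Return True if any listed substring appears in the user-agent.

    The comparison is done in lower case (an assumption of this sketch).
    """
    ua = user_agent.lower()
    return any(s in ua for s in substrings)


samples = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/44.0.7997.1233 Mobile Safari/537.36; Bytespider",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/120.0",
]

for ua in samples:
    print(is_blocked(ua), ua[:40])
```

Adding another bot then means adding one more substring to the list rather than a new full user-agent string.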