We noticed a sudden spike of traffic in the past couple of weeks. We thought it was legitimate until we noticed it was only hitting the homepage and all of it was coming from the US. While it would be nice to have this amount of traffic, we felt it was artificial and could be a bot attack.
Our hosting partner confirmed it was coming from the US, and one of our analytics partners told us the user agent was always Windows running Chrome.
But there’s one piece of evidence that told us it was indeed some sort of bot. We checked Google Analytics and, under the Audience by Technology section, there were screen resolutions of 0x0 while the user agents contained Windows and Chrome.
Is there a way to block visitors via the WAF if they have a screen resolution of 0x0, maybe combined with additional rules (e.g. UA contains Windows, Chrome)?
As a temporary solution, we CAPTCHA US traffic going to that page with a UA that contains Chrome. A downside is that users who don’t have the patience for CAPTCHAs might not return.
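For reference, a Cloudflare firewall rule roughly matching that temporary CAPTCHA could use an expression like the one below, with the action set to Challenge. The homepage path `/` is assumed from the description above:

```
(ip.geoip.country eq "US"
 and http.request.uri.path eq "/"
 and http.user_agent contains "Windows"
 and http.user_agent contains "Chrome")
```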
We first filtered all the screen resolutions of 0x0, and all of them have user agents containing Windows and Chrome. Google Analytics does not expose IP addresses for privacy reasons. The spikes in traffic are completely random. If you look at the last graph I posted, there’s a shadow graph behind it; that represents the previous week’s traffic.
We’ll try the JS Challenge option tomorrow. If we still see screen resolutions of 0x0, then we’ll have to go back to the CAPTCHA.
My worry with the JS Challenge is that the bot might be using a JS-capable headless browser like headless Chrome:
you need to look at your server logs and find info about these requests… it may be a legitimate bot you don’t want to block. your server probably has nginx or apache installed and probably has request logging turned on.
if you don’t have logs you can use a CF Worker to send logs containing all the relevant info and analyze them
Alright, I am afraid they don’t seem to have much in common that could establish a hard block. There is neither a common pattern in the user agents (apart from Chrome, of course) nor in the origin of the requests (IPs and ASes differ wildly).
What I noticed, though, is that these connections seem to come from actual ISPs and not datacentres. That would speak against a traditional crawler or scraper and more for some sort of botnet.
If they still visited subpages I’d even consider assuming they might be genuine visitors with some sort of privacy extension installed that fakes the screen resolution; however, if you say they only request the main page and stop there, we can probably rule that out.
If all of that does not work, you could fall back to some server-side implementation where you actually check for the resolution.
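A minimal sketch of that idea, assuming a hypothetical `/report-resolution` endpoint on your own server (the endpoint and function names are illustrative, not an existing API). The client beacons its screen size back; headless clients that never run the JS simply never report:

```javascript
// Server side: a real browser reports a non-zero screen size, so 0x0
// (or a missing/garbage value) is treated as suspicious.
function isSuspiciousResolution(width, height) {
  return !(Number.isInteger(width) && Number.isInteger(height) &&
           width > 0 && height > 0);
}

// Client side: beacon the actual screen size back to the server
// (guarded so this snippet also loads outside a browser).
if (typeof navigator !== "undefined" && typeof screen !== "undefined") {
  navigator.sendBeacon(
    "/report-resolution",
    JSON.stringify({ width: screen.width, height: screen.height })
  );
}
```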
just to add a bit to what sandro said… there are still some things you need to check: maybe they have a common referer? maybe they have a common X-Forwarded-For? does each IP hit your site at a specific interval? how many times per minute/hour? how many IPs are actually hitting you with this behaviour? if you examine the full logs a little more you could find some pattern.
you can, for example, add a cookie in JS to visitors with a 0x0 screen and block them with the firewall by that cookie
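A rough sketch of that approach. The cookie name `susp_res` is made up — pick your own:

```javascript
// Tag 0x0-resolution visitors with a cookie so a firewall rule can
// match on it. The cookie name "susp_res" is an assumption.
function resolutionCookie(width, height) {
  if (width === 0 && height === 0) {
    return "susp_res=1; Path=/; Max-Age=86400; SameSite=Lax";
  }
  return null; // normal resolution, no cookie needed
}

// In the browser (guarded so the snippet also runs outside one):
if (typeof document !== "undefined" && typeof screen !== "undefined") {
  const cookie = resolutionCookie(screen.width, screen.height);
  if (cookie) document.cookie = cookie;
}
```

On the Cloudflare side, a firewall rule with the expression `http.cookie contains "susp_res=1"` and action Block would then catch the tagged clients on their next request.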