One of my websites is being (sort of) DDoS-ed by thousands of requests per minute coming from “2a06:98c0:3600::103” with a User-Agent equal to that of Google’s bot. Because of that, I can’t block it based on User-Agent.
Also, the website is proxied behind Cloudflare, so I can’t block it through iptables or by routing it to a blackhole.
I tried blocking it with WAF rules matching the IP, e.g. (ip.src eq 2a06:98c0:3600::103) or (http.x_forwarded_for eq "2a06:98c0:3600::103")
Unfortunately, that’s not working. My server is still receiving all the abusive traffic. Any idea how I could block all traffic from that IP?
Yes, I’m at Nginx level (using set_real_ip_from). E.g. set_real_ip_from 2a06:98c0::/29;
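For anyone following along, here is a minimal Python sketch that checks whether the offending address falls inside the prefix from the set_real_ip_from line above. It only tests prefix membership with the standard ipaddress module; the function name is my own, and it says nothing on its own about who is actually sending the traffic.

```python
import ipaddress

# Prefix copied from the set_real_ip_from line quoted above.
TRUSTED_PREFIX = ipaddress.ip_network("2a06:98c0::/29")


def is_in_prefix(address: str, network=TRUSTED_PREFIX) -> bool:
    # True when the address belongs to the network; an address of a
    # different IP version (e.g. IPv4) is simply reported as not a member.
    return ipaddress.ip_address(address) in network


if __name__ == "__main__":
    # The address reported in this thread.
    print(is_in_prefix("2a06:98c0:3600::103"))
```

If this prints True, the address is one that Nginx already treats as a trusted proxy under that set_real_ip_from line, which would be consistent with the requests originating inside Cloudflare's own network rather than from an external client.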
All requests come with this User-Agent: (Linux; Android 6.0.1; Nexus 5X Build/MMB29P). If I filter the logs by that User-Agent, I can see traffic from Google’s IPs.
I’ve tried the following rule: (ip.src eq 2a06:98c0:3600::103) or (http.x_forwarded_for eq "2a06:98c0:3600::103") or (cf.worker.upstream_zone ne ""), i.e. catch all traffic that’s coming from a Worker; however, that has no effect. The aggressive crawler (or whatever it is) is still hitting my website hard.
Is there a more specific rule that I can use to block all traffic from workers (given that I’m not using them)?
Hey @ncano, could you please be more specific about how I can block that IP using cf.worker.upstream_zone? As I shared earlier, I tried cf.worker.upstream_zone ne "" (with the idea of blocking ALL Worker zones); however, that doesn’t have any effect.
I would suggest ditching ip.src and all the other conditions and blocking based on a filter similar to this instead: (cf.worker.upstream_zone != "" or cf.worker.upstream_zone != "[MYZONENAME.COM]")
Got a site with the same exact symptoms: same IP address, same Android Nexus user agent, same claim that it is GoogleBot. And it is very aggressive. It sometimes accounts for 20% of the traffic on the site.
Great question. In a nutshell, a zone can be a subdomain or can cover multiple domains, so in some situations the word “domain” doesn’t describe this concept well enough.
If you do NOT require any Cloudflare Workers to send HTTP requests to your site, use the expression:
cf.worker.upstream_zone ne ""
If you need Cloudflare Workers, from your own domain / zone, e.g. “example.com”, to be able to access your site, use the expression:
cf.worker.upstream_zone ne "" and cf.worker.upstream_zone ne "example.com"
If you need Cloudflare Workers from three different zones, e.g. “example.com”, “example.net” and “example.org”, to be able to access your site, use the expression:
cf.worker.upstream_zone ne "" and not cf.worker.upstream_zone in {"example.com" "example.net" "example.org"}
Using one of these expressions as your WAF rule, with the action set to BLOCK, should block undesired access from Cloudflare Workers (or, at least, from those that are not your own).
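To make the difference between these expressions easier to see, here is a small Python sketch that models them as plain boolean functions. It is only an illustration of the logic, not of how Cloudflare evaluates rules, and it assumes that cf.worker.upstream_zone is an empty string for requests that do not come from a Worker. The function names and the zone "other.example" are made up for the example. The last function models the earlier suggestion in this thread that joined two "not equal" conditions with "or".

```python
def blocks_all_workers(upstream_zone: str) -> bool:
    # Models: cf.worker.upstream_zone ne ""
    return upstream_zone != ""


def blocks_foreign_workers(upstream_zone: str, allowed_zones: set) -> bool:
    # Models: cf.worker.upstream_zone ne "" and not cf.worker.upstream_zone in {...}
    return upstream_zone != "" and upstream_zone not in allowed_zones


def blocks_with_or(upstream_zone: str, own_zone: str) -> bool:
    # Models: cf.worker.upstream_zone != "" or cf.worker.upstream_zone != own_zone
    # A value cannot equal both "" and own_zone, so one side is always true.
    return upstream_zone != "" or upstream_zone != own_zone


if __name__ == "__main__":
    allowed = {"example.com", "example.net", "example.org"}
    # "" stands for a request that did not come from a Worker (assumption).
    for zone in ["", "example.com", "other.example"]:
        print(
            repr(zone),
            blocks_all_workers(zone),
            blocks_foreign_workers(zone, allowed),
            blocks_with_or(zone, "example.com"),
        )
```

Under that assumption, the "or" form matches every request, including ordinary visitors, while the "and" forms match only Worker traffic from zones outside the allowed set.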
Seems to me like @ncano accidentally shared the expression with:
Note that the 418 return code is from custom code we added to the application, to detect that IP address, and return a “I am a teapot” error. That way, we can parse the logs for that specific pesky crawler.
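As a rough idea of what that log parsing could look like, here is a minimal Python sketch that counts responses with a given status code per client address. It assumes access log lines in the common "combined" layout (client address first, quoted request line, then the status code); the regular expression and the function name are my own and would need adjusting for a custom log format.

```python
import re
from collections import Counter

# Assumed layout: ip - user [time] "request" status bytes ...
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) '
)


def count_status_by_ip(lines, status="418"):
    # Count, per client address, the lines whose status matches.
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("status") == status:
            counts[match.group("ip")] += 1
    return counts
```

It can be fed any iterable of lines, for example an open log file object, and the resulting Counter's most_common() method lists the busiest addresses first.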
Considering the above test, which was done through the Cloudflare Workers Playground, it seems to me that the rule(s) are working perfectly fine for blocking Workers.
That brings me to another question:
Are you able to share all of your rules, including the order of those?