So it should be covered by Known Bots, but I think that only includes a few Google bots. I need a few others, but I can’t just match on a bot name or user agent since those are easily spoofed.
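On the spoofing point: Google documents a reverse-then-forward DNS check for confirming that an IP really belongs to one of its crawlers. Below is a minimal sketch of that check in Python. The function name and the injectable `reverse_lookup` / `forward_lookup` parameters are my own assumptions for illustration (they make it testable offline), and the default forward lookup only handles IPv4.

```python
import socket

# Hostname suffixes Google documents for its crawlers' reverse DNS records.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_google_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a reverse-then-forward DNS check.

    `reverse_lookup` and `forward_lookup` are hypothetical hooks so the
    check can be exercised without network access; by default they use
    the standard library resolver (IPv4 only for the forward lookup).
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: the IP must reverse-resolve to a Google-owned hostname.
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Step 2: that hostname must resolve back to the same IP.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_verified_google_crawler("66.249.66.1")` performs real DNS lookups, so the result depends on your resolver. A user agent string alone never enters the check, which is the point.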
Also, I was not seeing any of them get through in the firewall logs before I allowed AS15169 (Google’s ASN) through the WAF.
Some Google bots are still getting blocked: the same bot will be blocked, allowed through a second later, then blocked again, and so on.
I would like to know whether this could be a security issue. I assume not, since the traffic is coming straight from Google’s managed servers.
Also, are there any disadvantages to allowing Google’s ASN through? I am seeing more than 16,000 Google bot requests bypassing the firewall; does this mean Argo is serving the bots as well? For some reason, our Argo usage went from an average of 4GB/day to 10GB/day literally overnight, but this was a month or so after I set the ASN firewall rule. Could Google bots have caused that?
You will have to check your firewall logs because some bots might be blocked. You will see the first rule has a few exceptions (they use regex, so change them if your account is not allowed to use it) for bitly, zapier, feedburner, and feedly. Also, I have a rule before those to allow things like updown.io (using a custom user agent) and other use cases. Make sure to allow known bots (cf.client.bot) before blocking with these rules.
Again, use with caution. Your use may be different than anyone else’s.
Regex (“Regular Expression”) is the part of the rule that tests the user agent. You can remove it from the rule I’ve listed above, because only Business and Enterprise plans can use the matches operator in a rule expression. Instead, you can add those user agents to the Allow rule you’d have created for known bots:
(http.user_agent contains "bitlybot") or (http.user_agent contains "zapier") or (cf.client.bot)
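If the list of allowed user agents grows, writing the expression by hand invites copy-paste slips such as a repeated clause. Here is a small sketch that builds the expression from a list instead. The helper name `build_allow_expression` is hypothetical, and it assumes the match should be made against `http.user_agent` (since the goal is to test the user agent) with the `contains` operator, which is case-sensitive.

```python
def build_allow_expression(ua_substrings, include_known_bots=True):
    """Build an 'or'-joined rule expression from user agent substrings.

    Hypothetical helper: it only produces a string to paste into the
    rule editor and does not talk to any API.
    """
    unique = []
    for substring in ua_substrings:
        if '"' in substring:
            # Keep the sketch simple: no escaping of embedded quotes.
            raise ValueError("substring must not contain a double quote")
        if substring not in unique:  # drop duplicates, keep first-seen order
            unique.append(substring)

    clauses = ['(http.user_agent contains "{}")'.format(s) for s in unique]
    if include_known_bots:
        clauses.append("(cf.client.bot)")
    return " or ".join(clauses)


print(build_allow_expression(["bitlybot", "zapier", "zapier"]))
```

The duplicate "zapier" entry is dropped, so the printed expression has one clause per user agent followed by the known bots check.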
One more question about this, regarding some things I have been reading.
Why do you block those servers, apart from known good bots? Wouldn’t that block legitimate traffic from humans? Or are those servers mainly used by bad or unknown bots, or for other malicious purposes, that the firewall may not block without those rules?