I’ve been using the “known bots” rule to make sure that we don’t inadvertently block known, friendly bots (in particular, Google).
Does Cloudflare tie known bots to their respective known IP address ranges? I’m guessing they must, otherwise an attacker could simply spoof their user-agent as “Googlebot” or “Yandex”, etc., and get around firewall rules easily.
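To make the question concrete, here is a rough sketch of the kind of check I have in mind. I don’t know how Cloudflare actually implements this; the names `KNOWN_BOT_RANGES`, `claimed_bot` and `is_verified_bot` are made up for illustration, and the IP ranges are reserved documentation ranges, not real crawler ranges.

```python
import ipaddress

# Hypothetical table mapping a bot name to the IP ranges it is allowed to
# come from. These are reserved documentation ranges (RFC 5737 / RFC 3849),
# NOT the real ranges of any crawler.
KNOWN_BOT_RANGES = {
    "googlebot": ["192.0.2.0/24", "2001:db8::/32"],
    "yandex": ["198.51.100.0/24"],
}


def claimed_bot(user_agent):
    """Return the known bot name the user-agent claims to be, or None."""
    ua = user_agent.lower()
    for name in KNOWN_BOT_RANGES:
        if name in ua:
            return name
    return None


def is_verified_bot(user_agent, client_ip):
    """True only if the user-agent claims a known bot AND the client IP
    falls inside one of the ranges listed for that bot."""
    name = claimed_bot(user_agent)
    if name is None:
        return False
    ip = ipaddress.ip_address(client_ip)
    return any(
        ip in ipaddress.ip_network(cidr) for cidr in KNOWN_BOT_RANGES[name]
    )


# A request with a spoofed user-agent from an unlisted address is not verified.
print(is_verified_bot("Mozilla/5.0 (compatible; Googlebot/2.1)", "203.0.113.5"))
# The same user-agent from an address inside the listed range is verified.
print(is_verified_bot("Mozilla/5.0 (compatible; Googlebot/2.1)", "192.0.2.10"))
```

With a check like this, the user-agent alone is never enough, but everything depends on the range table being correct and up to date.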
But in addition to that, what if a bot’s IP address range changes (or the operator introduces new IP ranges)? How do the folks at Cloudflare stay on top of this in a reliable way?
I guess what I’m worried about is waking up one morning to find that all our websites have vanished from Google because Googlebot has been inadvertently blocked.