Requesting Assistance with Googlebot Crawling Issue due to Firewall Security Measures

Dear Cloudflare Community,

I hope this message finds you well. I am reaching out to you regarding an issue I am experiencing with my website, which is protected by Cloudflare.

Recently, my website came under a DDoS attack, and to mitigate the impact, I implemented several firewall security measures. These measures include blocking direct traffic and challenging certain user agents and IP addresses that were identified as suspicious or malicious.

While these security measures have been effective in protecting my site, I have encountered an unintended consequence. Googlebot, the crawler used by Google to index web pages, is being blocked and cannot crawl my website, even though I placed a rule for Google’s ASN first in the priority order of my firewall rules.

Upon investigation, I found that even with the Google ASN rule at the highest priority, Googlebot is still being blocked by the IP-blocking rule that sits at second priority. As a result, my website’s search engine visibility and indexing are suffering.

I understand the importance of maintaining robust security measures to safeguard my website against malicious activities. However, it is equally crucial for my site to be accessible to legitimate search engine crawlers like Googlebot.

I kindly request your assistance in resolving this issue so that Googlebot can successfully crawl my website while maintaining the necessary security measures against DDoS attacks. Any guidance, suggestions, or adjustments to my firewall rules that would allow Googlebot to access my site without compromising security would be greatly appreciated.

Please let me know if there are any further details or logs I can provide to help address this matter effectively. I am eager to work collaboratively to find a solution that ensures both the security of my website and its accessibility to legitimate search engine crawlers.

Thank you for your attention to this matter, and I look forward to your prompt response.

What rules do you currently have in effect?

I’d start by reviewing whatever rule you have based on Google’s ASN. Google does not have, to the best of my knowledge, an ASN linked exclusively to Googlebot, and allowing a Google ASN makes it easy for hackers to use Google services (such as Google Translate) to attack your website.

In general, rules based on behavior, such as requests for /non-existing-plugin.php or with a query string like /?eval(some), should apply regardless of origin, while rules based on the visitor’s profile (from country XYZ, or from such-and-such IPs/ASNs) should contain an exclusion for “known bots”:

When incoming requests match:
...
...AND Known-Bots OFF
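
To make the distinction concrete, here is a minimal Python sketch of the logic described above. It is only a model of the idea, not Cloudflare's rule engine: the `Request` fields, the example country code, the ASN, and the `decide` function are all made up for illustration, and `known_bot` merely stands in for Cloudflare's "Known Bots" signal (which, as far as I know, is exposed in rule expressions as `cf.client.bot`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified view of an incoming request."""
    path: str
    query: str
    country: str
    asn: int
    known_bot: bool  # stand-in for Cloudflare's "Known Bots" signal


# Illustrative values only; substitute whatever your own rules use.
SUSPICIOUS_PATHS = {"/non-existing-plugin.php"}
BLOCKED_COUNTRIES = {"XY"}   # placeholder country code
BLOCKED_ASNS = {64500}       # ASN from the range reserved for documentation


def matches_behavior_rule(req: Request) -> bool:
    # Behavior-based: what the request asks for, regardless of who sends it.
    return req.path in SUSPICIOUS_PATHS or "eval(" in req.query


def matches_profile_rule(req: Request) -> bool:
    # Profile-based: who the visitor is, with an exclusion for known bots.
    profile_hit = req.country in BLOCKED_COUNTRIES or req.asn in BLOCKED_ASNS
    return profile_hit and not req.known_bot


def decide(req: Request) -> str:
    if matches_behavior_rule(req):
        return "block"
    if matches_profile_rule(req):
        return "challenge"
    return "allow"
```

With these example values, a verified crawler arriving from a blocked ASN is allowed through by `decide`, an ordinary visitor from the same ASN is challenged, and a request for /non-existing-plugin.php is blocked even if it carries the known-bot flag.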
