Using a WAF to block robots.txt from humans

I use this approach myself; it works fine, no issues.

However, remember that bots might sniff around, find your sitemap.xml file, and grab and crawl URLs from it.

You might also consider allowing only Googlebot, Bingbot, and your own IP address to access the .xml files.

Can’t say in general, but I use this practice on news site portals, since I’ve caught fake Googlebots, from my own country and others, trying to grab and steal the content.

No, just make sure to write the firewall rule correctly so it doesn’t block the real Googlebot from accessing your robots.txt and sitemap-related files, which would cause issues with crawling, indexing, and SEO.
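For illustration, here is a minimal sketch of such a rule as a Cloudflare WAF custom-rule expression (assuming Cloudflare; the example IP `203.0.113.10` is a placeholder for your own address). It blocks requests to robots.txt and the sitemap unless they come from a verified bot (Cloudflare's `cf.client.bot` field covers verified crawlers such as Googlebot and Bingbot) or your own IP:

```
(http.request.uri.path in {"/robots.txt" "/sitemap.xml"})
and not cf.client.bot
and not ip.src in {203.0.113.10}
```

Set the rule action to Block (or Managed Challenge). Using the verified-bot field rather than matching on the User-Agent string is what keeps fake Googlebots out while letting the real one through.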
