How to block (serve as 404) but still allow Google to crawl the page

Does anyone know the rule needed to block access to my site's /robots.txt file? Making it public seems like giving negative-SEO efforts a blueprint for a workaround or a new attack. Does hiding this page prevent Google from crawling it? I am new :)

Here's what I use to allow only Googlebot to access sitemap.xml and robots.txt. Sharing my post below, which contains a firewall expression and a screenshot of a Firewall Rule allowing only Googlebot + Bingbot to reach robots.txt and the sitemap:

A good read here, too:

This is awesome! Should I put the / in the URL path field?

I just implemented it, and it didn’t work. Is this not compatible with the Yoast SEO plugin? Or maybe I did something wrong?

Yoast shouldn’t impact anything. Maybe start with something simpler such as:

URI Path equals /robots.txt
User agent does not contain Googlebot
Then take your chosen action (e.g. Block).
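In Cloudflare's expression editor, that simple rule might look like the following (a sketch, assuming the standard Rules-language field names and Block as the action):

```
(http.request.uri.path eq "/robots.txt" and not (http.user_agent contains "Googlebot"))
```

Once a request matching this expression is blocked for a non-Googlebot user agent, you know the basic rule is firing and can layer on the stricter clauses.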

That won’t stop people spoofing Googlebot, since it doesn’t check the Google ASN, but it’s a good start. Once you’ve got that working, you can add in the extra clauses if you still feel the need.
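To catch spoofers, the user-agent check can be tied to Google's ASN (AS15169, a public allocation). A sketch, assuming the `ip.geoip.asnum` field is available on your plan:

```
(http.request.uri.path eq "/robots.txt" and not (http.user_agent contains "Googlebot" and ip.geoip.asnum eq 15169))
```

With this, a request claiming to be Googlebot is only allowed through if it also originates from Google's network.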

Unfortunately, removing the (AS Num) didn’t resolve the issue. However, I played around with the URL setting (URL Path, URL, and URL Full), and URL Full seemed to work. Any issues with using that instead of the others?
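For reference, here's how those three UI labels map onto the underlying Rules-language fields for a hypothetical request (`example.com` is a placeholder):

```
# Request: https://example.com/robots.txt?v=1
# URL Path → http.request.uri.path = "/robots.txt"
# URL      → http.request.uri      = "/robots.txt?v=1"
# URL Full → http.request.full_uri = "https://example.com/robots.txt?v=1"
```

This may explain the differing behavior: an `equals "/robots.txt"` test can never match the full URI, while a broad `contains` test against URL Full can match far more requests than intended.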

This is what I came up with. Do you see any issue with this?

After testing, it looks like (URL Full) blocks the entire site. After (URL Path) didn’t work either, I implemented (URL) with the commands, and it seems to finally work. I was wondering where you happened to get the (AS Num) codes for Google and Bing?
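For what it's worth, those ASNs are public allocations: Google announces from AS15169 and Microsoft (Bing) from AS8075, which you can confirm with any whois/ASN lookup service. A sketch of the combined rule using them (assuming the same Rules-language fields as above):

```
(http.request.uri.path in {"/robots.txt" "/sitemap.xml"} and not (ip.geoip.asnum in {15169 8075}))
```

This allows only requests originating from Google's or Microsoft's networks to reach robots.txt and the sitemap, regardless of the claimed user agent.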

This topic was automatically closed 15 days after the last reply. New replies are no longer allowed.