Robots.txt fetch failed in Google Webmaster Tools


#1

Error:
Network unreachable: robots.txt unreachable
We were unable to crawl your Sitemap because we found a robots.txt file at the root of your site but were unable to download it. Please ensure that it is accessible or remove it completely.

Google Webmaster Tools is unable to fetch robots.txt. I have kept the firewall off for now, but the problem still exists. I even deleted the robots.txt file from the root, but the error persists… please help.


#2

Can you fetch the sitemap and robots files directly from your origin?

curl --header "Host: www.example.com" http://192.0.2.123/robots.txt
curl --header "Host: www.example.com" http://192.0.2.123/sitemap_location.xml

Is there anything in your robots.txt which would block Google from fetching the sitemap? You could explicitly allow your sitemap by putting this line in your robots.txt:
Sitemap: http://example.com/sitemap_location.xml

(replace the hostname with yours, the IP address with your origin's IP address, and the sitemap file with your file's location)
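
If your origin serves multiple sites, an alternative to the Host-header trick (a sketch, reusing the placeholder hostname and IP from above) is curl's --resolve option, which pins the DNS lookup instead:

curl --resolve www.example.com:80:192.0.2.123 http://www.example.com/robots.txt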


#3

Thanks for your reply, sir.

Yes, I can fetch the sitemap and robots files directly from the origin.
In robots.txt I just added Disallow:
and User-agent: *


#4

Can you paste in the content of your robots.txt? If it contains:

User-agent: *
Disallow: /

you have told GoogleBot (and all other well-behaved indexers) not to crawl your site.

The snippet below will allow all indexers (the absence of the trailing / is the important bit):

User-agent: *
Disallow:

I’m not sure what the result of reversing those two lines, as you describe, would be:

Disallow:
User-agent: *

Try validating your robots.txt using the Robots Testing Tool.
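
For reference, a minimal robots.txt that allows all crawlers and also declares the sitemap (the hostname and file name below are placeholders carried over from the earlier posts) would look like:

User-agent: *
Disallow:
Sitemap: http://www.example.com/sitemap_location.xml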


#5

To resolve this issue:

  1. Log in and open your public_html folder
  2. Upload a new file called robots.txt

Add these two lines:

User-agent: *
Disallow:
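
If you have shell access to the server, the same file can be created in one step (a minimal sketch; it assumes public_html is your document root):

printf 'User-agent: *\nDisallow:\n' > public_html/robots.txt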

#6

I just added the 2 lines:

User-agent: *
Disallow:

but the problem is still the same.
When I click on the robots.txt Tester it says: You have a robots.txt file that we are currently unable to fetch. In such cases we stop crawling your site until we get hold of a robots.txt, or fall back to the last known good robots.txt file. Learn more.


#8

But when I remove the two Cloudflare nameservers from my host:

ali.ns.cloudflare.com
west.ns.cloudflare.com

then everything works fine.


#9

@praptirehab Are you using WordPress or any other platform?


#10

No, I am using Apache.


#11

Okay, then you should check the HTTP response headers after emulating Googlebot.
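
For example, with curl (a minimal sketch; replace www.example.com with your hostname):

curl -I -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" http://www.example.com/robots.txt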


#12

I checked it. It is showing 200 OK.


#13

Can you paste in a complete copy of your robots.txt file, or provide the name of your site?

When you say you added two lines, you may have something like the following, which blocks all robots. The last two lines will be ignored.

User-agent: *
Disallow: /

User-agent: *
Disallow:


#14

Maybe try the suggestion at this article:

If all methods fail, I would recommend opening a ticket with Cloudflare Support.


#15

Thank you very much for all your replies. At present I can see that Google is fetching the robots.txt, and no error shows when testing sitemap.xml.

I removed the following lines from my .htaccess file, and fetching has been working ever since.

# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary

I am not sure, but I think this was the reason for the error.
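
To verify a change like this, one can check whether the Vary header is still being sent (www.example.com standing in for the real hostname):

curl -sI http://www.example.com/robots.txt | grep -i vary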

Once again, thanks to all for your support.

