Wrong File Ban - Bot Fight Mode

Should these files be banned by any bot prevention method?

/robots.txt
/ads.txt

If you enable Bot Fight Mode with static file protection, do these files also get blocked? Should these files ever be blocked by any protection?

Those files are typically meant to be fetched by legitimate bots. Unless the client behaves poorly, there is no reason to stop them from checking those pages.

That's what I am saying. Why does Cloudflare's bot prevention block those two files?

Do you have the logs of those blocks?

Bot Fight Mode is designed to block IaaS providers e.g. AWS, Azure and GCP. Are you expecting bad bots from them to crawl your robots.txt too?

Suppose the Amazon ad bot wants to read robots.txt. I have stated in robots.txt that it should not scan the site. But since it is not able to read robots.txt, it will keep hitting my website (the Cloudflare server). Some bots do follow the robots.txt file.

An Example

Can you disable BFM for static resources?

That doesn’t appear to be a legitimate Amazon AdBot instance, so it’s blocked. Bot Fight Mode blocks definite bots, that is, malicious bots. I don’t think we classify Amazon’s legit bot as malicious…

Yes, you do. The legitimate Amazon AdBot is blocked by Bot Fight Mode.

According to Amazon’s AdBot documentation, a lookup of that IP address should return a value under the AmazonAdBot subdomain. It does not.

→ Amazon AdBot

$ host 72.21.217.115.amazonbot.com
72.21.217.115.amazonbot.com has address 72.52.10.14

So it seems it is a genuine Amazon AdBot that is being blocked by Cloudflare.

C:\>nslookup 72.21.217.115.amazonbot.com
Server: dns.google
Address: 2001:4860:4860::8888

Non-authoritative answer:
Name: 72.21.217.115.amazonbot.com
Address: 72.52.10.14

Not sure whether you really know what you are doing. The documentation already shows you the example usage of the command:

Also,

This is not an Amazon IP address. This IP address belongs to Akamai.

I am not talking about this IP.

Of course I’m talking about that IP, because the query is returning an Akamai IP.

The instructions asked you to execute host 72.21.217.115 but you added amazonbot.com at the end instead.
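For reference, here is a minimal Python sketch of what a reverse lookup such as `host 72.21.217.115` actually queries. The helper name `ptr_query_name` is hypothetical and only for illustration. A reverse (PTR) lookup asks about a name under `in-addr.arpa`, not about a name built by appending a domain to the IP address.

```python
import ipaddress


def ptr_query_name(ip: str) -> str:
    """Return the DNS name that a PTR (reverse) lookup of `ip` queries."""
    return ipaddress.ip_address(ip).reverse_pointer


# The octets are reversed and placed under in-addr.arpa.
print(ptr_query_name("72.21.217.115"))  # 115.217.21.72.in-addr.arpa
```

Querying `72.21.217.115.amazonbot.com` is an ordinary forward lookup of a hostname that merely contains the IP as text, so its answer says nothing about who operates the address.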

These are the instructions.

I think even Cloudflare does not return results for reverse PTR queries.

A forward DNS query returns this.

nslookup crawler-72.21.217.115.amazonbot.com
Server: dns.google
Address: 2001:4860:4860::8888

Non-authoritative answer:
Name: crawler-72.21.217.115.amazonbot.com
Address: 72.52.10.14

The failure of the PTR lookup means this is not a valid Amazon bot.

dig totally-fake-crawler-72.21.217.115.amazonbot.com +short
72.52.10.14
dig crawler-1.1.1.1.amazonbot.com +short
72.52.10.14

Both the totally fake crawler entry and an entry using 1.1.1.1, which Amazon doesn’t own or manage, return results, so a forward lookup of such a name proves nothing. The first step in Amazon’s verification is to perform a successful PTR lookup. Since that lookup fails, no other steps are required.

If you are confident that address is a bot for Amazon, please contact them and let them know their bot is misconfigured.
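The verification order described above (PTR lookup first, then a forward lookup to confirm) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `verify_crawler`, `system_reverse`, and `system_forward` are hypothetical names, and the allowed domain suffix is a parameter because the exact domain must come from the bot operator's documentation. It is not how Cloudflare or Amazon implement the check.

```python
import socket
from typing import Callable, List, Optional


def system_reverse(ip: str) -> Optional[str]:
    """Return the PTR hostname for `ip`, or None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def system_forward(hostname: str) -> List[str]:
    """Return the IPv4 addresses `hostname` resolves to (empty on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def verify_crawler(
    ip: str,
    allowed_suffix: str,  # assumption: taken from the operator's documentation
    reverse: Callable[[str], Optional[str]] = system_reverse,
    forward: Callable[[str], List[str]] = system_forward,
) -> bool:
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    # Step 1: the PTR lookup must succeed; if it fails, stop here.
    hostname = reverse(ip)
    if hostname is None:
        return False
    # Step 2: the PTR hostname must sit under the operator's domain.
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith("." + allowed_suffix.lower().lstrip(".")):
        return False
    # Step 3: the hostname must resolve back to the original IP.
    return ip in forward(hostname)
```

The resolvers are passed in as parameters so the logic can be exercised without network access. A wildcard record like the one shown above would satisfy a forward lookup for any name, which is why the check starts from the PTR record and never from a name built out of the IP.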

I think the discussion is going in the wrong direction.

I am asking: even if it is a fake bot, should it be allowed to read the robots.txt and ads.txt files?

Are you asking for an opinion, or are you asking if Cloudflare’s bot protection extends to those two files?

There is no reason for a fake bot to be able to read your pages at all. However, it wouldn’t surprise me if some paths that are naturally harmless (such as a .txt file) were simply skipped to save computing costs.
