How to stop Cloudflare from hitting my site?

#1

I didn’t ask for Cloudflare. My hosting company claims not to have done it either.
But Cloudflare regularly hits my site very hard, doing deep queries that chew up too much CPU.

How can I get Cloudflare to take my site off its list?

#2

Cloudflare doesn’t hit sites not signed up with Cloudflare. What signs do you see that point to Cloudflare? I’m attaching a link to their IP address ranges.

If you post your domain name, we can take a look.

1 Like
#3

Signs? Their bot’s user-agent string is pretty clear:
Mozilla/5.0 (compatible; Cloudflare-AMP/1.0; +https://amp.cloudflare.com/doc/fetcher.html) AppleWebKit/534.34

IP range: I don’t see these on your list, but they all come from that bot. It is hitting hard, 4 to 5 probes per second.
162.158.71.27
162.158.64.55
162.158.64.55
162.158.68.214
162.158.68.214
162.158.69.5
162.158.68.214
162.158.67.70
162.158.67.70
162.158.67.70
162.158.64.66
162.158.65.19
162.158.65.19
162.158.70.132
162.158.70.132
162.158.64.55
162.158.64.55
162.158.64.66
162.158.64.55
162.158.64.66
162.158.64.66
162.158.64.66
162.158.64.66
162.158.64.66
162.158.71.27
162.158.64.66
162.158.65.19
162.158.64.55
162.158.64.41
162.158.71.27
162.158.68.214
162.158.68.214
162.158.64.55
162.158.68.214
162.158.68.214
162.158.67.70
162.158.64.66
162.158.65.19
162.158.65.19
162.158.64.66
162.158.64.66
162.158.64.41
162.158.70.132
162.158.67.70
162.158.64.55
162.158.65.19
162.158.64.66
162.158.64.41

#4

Those are Cloudflare IP addresses. /15 is all of 158 and 159. And your logs are showing your domain in the request (they’re valid URLs)?

Again, if you post the domain (or a URL they’re hitting), we can take a closer look.
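
For anyone who wants to check their own log entries against that range, here is a minimal sketch using Python's standard `ipaddress` module. It assumes the block in question is 162.158.0.0/15 (which spans 162.158.x.x and 162.159.x.x). The first two sample addresses come from the log excerpt above; the last two are made-up values added only to show the edges of the range.

```python
import ipaddress

# Assumed block: the /15 that covers 162.158.x.x and 162.159.x.x.
CLOUDFLARE_BLOCK = ipaddress.ip_network("162.158.0.0/15")


def in_block(addr: str, network=CLOUDFLARE_BLOCK) -> bool:
    """Return True if the dotted-quad address falls inside the network."""
    return ipaddress.ip_address(addr) in network


# First two addresses are from the posted logs; the last two are
# illustrative only (one just inside the range, one just outside).
samples = ["162.158.71.27", "162.158.64.55", "162.159.0.1", "162.160.0.1"]
for addr in samples:
    print(addr, in_block(addr))
```

A /15 leaves 17 host bits, so the block holds 2^17 = 131,072 addresses.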

#5

The domain is
Marshalls.org

#6

I don’t see any reason Cloudflare would hit that domain. The only thing I can think of is someone has misconfigured their Cloudflare setup using your IP address.

Since you’re not a Cloudflare customer, maybe @cloonan can suggest what to do. Hopefully they can figure out why Cloudflare is hitting your IP address.

One solution that would fall to you would be to use a firewall to block 162.158.0.0/15.
But it would be nice if Cloudflare can track this down.
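
If a network firewall is not available, the same idea can be approximated inside the application. The sketch below is only an illustration and assumes the site runs as a Python WSGI application, which nothing in this thread confirms. The names `should_block` and `block_amp_crawler` are hypothetical. It rejects requests that come from 162.158.0.0/15 or that carry the `Cloudflare-AMP` token in the user-agent string.

```python
import ipaddress

# Values taken from this thread; the function names below are hypothetical.
BLOCKED_NETWORK = ipaddress.ip_network("162.158.0.0/15")
BLOCKED_AGENT_TOKEN = "cloudflare-amp"


def should_block(remote_addr: str, user_agent: str) -> bool:
    """Decide whether a request matches the crawler's network or agent."""
    try:
        addr_blocked = ipaddress.ip_address(remote_addr) in BLOCKED_NETWORK
    except ValueError:
        # Missing or malformed address: fall back to the user-agent check.
        addr_blocked = False
    return addr_blocked or BLOCKED_AGENT_TOKEN in user_agent.lower()


def block_amp_crawler(app):
    """Wrap a WSGI app so matching requests get a cheap 403 response."""
    def middleware(environ, start_response):
        remote_addr = environ.get("REMOTE_ADDR", "")
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if should_block(remote_addr, user_agent):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

A real firewall rule is still the better option where possible, because it drops the traffic before the web server spends any CPU on it. Blocking by address in the application also only makes sense while the site is not itself proxied through Cloudflare.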

1 Like
#7

I think +1 to the misconfigured IP; will investigate.

3 Likes
#8

Appears to be our AMP crawler, https://amp.cloudflare.com/doc/fetcher.html. I looked at a few tickets and would gather someone is linking to your site and hence we’re crawling. The firewall approach @sdayman suggested is the best route to block.

2 Likes
#9

Why is the AMP crawler hitting non-Cloudflare sites? From the description of the AMP Crawler, if someone with a site on Cloudflare links to a non-Cloudflare site, the crawler will still hit it.

Will the AMP crawler stop crawling a site if robots.txt tells it not to?

#10

The last I saw, it does not respect robots.txt, and I’ve not seen an update. It is also odd, as the marshalls.org zone does not show as ever having been on Cloudflare. I think the best way to kick it into a ticket is to email support AT cloudflare DOT com. The system may kick back the ticket immediately or after a few minutes, but share the ticket number and I’ll re-open it and get it to the right folks. Please reference this thread, How to stop Cloudflare from hitting my site?

3 Likes
#11

Thanks!

I have had this entry in my robots.txt for over six months, but it hasn’t helped:

User-agent: Cloudflare
Disallow: /

The crawler isn’t simply visiting my home page - that wouldn’t be an issue. But the behavior is consistent with Cloudflare having found some links to my applications on other sites (in my organization), and having the agent submit those queries. I also see that behavior with Google and some other search engines, but they have a much less aggressive hit pattern.
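
As a sanity check that the robots.txt entry itself is well formed, it can be run through Python's standard `urllib.robotparser`. This is only a sketch of how one compliant parser reads the rule. It says nothing about how the Cloudflare crawler behaves, and other parsers may match agent names differently. `ExampleBot` is a made-up agent name used for contrast.

```python
from urllib.robotparser import RobotFileParser

# The two-line entry quoted above.
ROBOTS_TXT = """\
User-agent: Cloudflare
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Python's parser treats the rule's agent name as a case-insensitive
# substring of the crawler's product token, so "Cloudflare" also
# covers "Cloudflare-AMP".
amp_allowed = parser.can_fetch("Cloudflare-AMP", "/")

# "ExampleBot" is hypothetical; no rule names it, so it is allowed.
other_allowed = parser.can_fetch("ExampleBot", "/")

print("Cloudflare-AMP allowed:", amp_allowed)
print("ExampleBot allowed:", other_allowed)
```

Under this parser the rule does exclude a `Cloudflare-AMP` agent, which suggests the entry is not the problem and the crawler is simply not consulting it.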

#12

@cloonan did mention Cloudflare does not take robots.txt into account at this point.

Nobody has added your domain to Cloudflare, so the question still is why Cloudflare would access it in the first place, but that is something for Cloudflare to clarify. In the meantime, I’d implement the firewall approach @sdayman mentioned and block that network.

#13

The AMP Crawler docs say that if a Cloudflare user’s site has AMP, and links to other sites, the crawler will follow those links to see if the destinations are AMP as well. Presumably to convert those links to AMP as well.

1 Like