My sites were being excessively crawled by bots


#1

My server's memory usage is spiking, so I asked my hosting company to look into it.
They told me this:

* We noticed that your sites were being excessively crawled by bots:

[domlogs]# grep -Pc "[Bb]ot" $(find . -maxdepth 1 -type f -size +300k -print)
./mixitrestaurant.slickboston.com-ssl_log:734
./akwaabaensemble.com:347
./9taste.com-ssl_log:559
./mypetfamilytree.com:463
./frankanthonysmarket.slickboston.com-ssl_log:506
./icmechanicalservices.slickboston.com-ssl_log:538
./terramia.slickboston.com-ssl_log:898
./newmedicineonline.slickboston.com-ssl_log:4850
./marcmaccini.com:607
./vasirefrigeration.slickboston.com-ssl_log:746
./antico.slickboston.com-ssl_log:907

How can I block these bot attacks? Please help.
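The grep count from the host only says how many log lines mention a bot, not which bots they are. A minimal sketch for breaking that down by user agent follows. It assumes the logs use the Apache "combined" format, where the user agent is the last double-quoted field; the helper name `top_bot_agents` is made up for illustration.

```shell
#!/bin/sh
# Sketch: tally bot user agents in one access log.
# ASSUMPTION: Apache "combined" log format, i.e.
#   IP - - [date] "request" status bytes "referer" "user agent"
# Check the server's LogFormat before trusting the field positions.

# top_bot_agents FILE [N]   (hypothetical helper name)
# Prints the N most frequent user agents containing "bot" (any case),
# most frequent first, each prefixed with its request count.
top_bot_agents() {
    file=$1
    limit=${2:-10}
    # Splitting each line on double quotes puts the user agent in field 6.
    awk -F'"' 'tolower($6) ~ /bot/ { print $6 }' "$file" |
        sort | uniq -c | sort -rn | head -n "$limit"
}
```

For example, `top_bot_agents ./newmedicineonline.slickboston.com-ssl_log 5` would list the five busiest bot user agents for the site with the highest count above. Well-behaved crawlers identify themselves honestly in this field; abusive ones may not, so treat the result as a first look only.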


#2

Bots tend to do this at times. You can exclude all known bots from crawling your site with a Firewall Rule:


#3

Or block the IP with an IP Access Rule if it’s one bad actor or IP range.
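To see whether the traffic really comes from one bad actor, it can help to count bot requests per client IP in the access log first. This is a rough sketch, assuming the Apache "combined" log format (client IP first, user agent as the last quoted field); the helper name `top_bot_ips` is hypothetical.

```shell
#!/bin/sh
# Sketch: count bot requests per client IP in one access log.
# ASSUMPTION: Apache "combined" log format, i.e.
#   IP - - [date] "request" status bytes "referer" "user agent"

# top_bot_ips FILE [N]   (hypothetical helper name)
# Prints the N client IPs with the most requests whose user agent
# contains "bot" (any case), busiest first, with request counts.
top_bot_ips() {
    file=$1
    limit=${2:-10}
    # Field 6 (split on double quotes) is the user agent; the client IP
    # is the first whitespace-separated token of field 1.
    awk -F'"' 'tolower($6) ~ /bot/ { split($1, parts, " "); print parts[1] }' "$file" |
        sort | uniq -c | sort -rn | head -n "$limit"
}
```

If one address or a handful from the same range dominates the output, an IP Access Rule is a reasonable fit. If the requests are spread across many addresses, blocking by IP is unlikely to help much.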


#4

Sometimes bots just scan the IPv4 address space and find your host. In addition to what was already suggested, I also recommend using Authenticated Origin Pulls (but only if you expect all HTTP traffic to come from Cloudflare): https://support.cloudflare.com/hc/en-us/articles/204899617-Authenticated-Origin-Pulls. Actually, never mind: you probably don’t have access to the necessary configuration files if you are on a shared hosting provider.

I read it as Access Rifle, I guess it can also be used to defend against bots… :wink:


#5

If you can afford to pay, the only long-term solution I know of is to enable rate limiting.


#6

I think I fixed it, though I am not sure it was a bot attack. Anyway, I set a challenge for traffic from some countries (Cambodia, China, Russia, Ukraine, etc.) for all my accounts. That seems to work, and it is easy to do.
However, in WHM I noticed that the apache_php_fpm service was the one using the most memory, so I downgraded those domains from PHP 7.2 to 7.1 …
I also noticed that for some domains the Google console plugin was disconnected, or the domain was listed as http in Google Webmaster while I had it as https on my server, so it was getting redirected when Google tried to crawl it …
My server has now had very good performance for almost a day.
I am not sure what did the job … but I am happy now :)))
Thanks to all of you who tried to help me.

