How NOT to cache requests denied with .htaccess

I’ve got about 50 bots peppering a website with HTTP GET requests. As I discovered the bot IPs, I listed them in the firewall rules section of Cloudflare, but they were still hitting the server. So I also listed them in the server’s .htaccess file to deny the bots access, but now legitimate requests sporadically show a cached denial page.

I believe Cloudflare is caching the denial response. Is there a way to create a rule that says not to cache an HTTP response containing certain text, or is there another way to stop this from happening?

Thank you,
Will

My first suggestion is to not block with .htaccess, as those block messages are being cached by Cloudflare. Cloudflare isn’t going to distinguish between a request it treats as a regular visit and one your server decides to deny.

The bigger question is why Cloudflare isn’t blocking those requests. Just to check: does your server restore the original visitor IP addresses for requests that come through Cloudflare?


Thank you for your reply. I see I was misunderstanding something. I thought Cloudflare wasn’t blocking the bots listed in my rules because they were going directly to my origin server’s IP, but if they were doing that, Cloudflare wouldn’t be caching the replies.

Then I thought, how am I getting the bot IPs? Well, the bots make requests that are logged on the origin server, which is behind Cloudflare, so I have likely been using .htaccess to block Cloudflare IPs ‘close’ to the bots!

Sure enough, some of the IPs I have been blocking fall within the ranges listed at https://www.cloudflare.com/ips/
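For anyone wanting to run the same check on their own blocklist, here is a minimal sketch using Python's standard ipaddress module. The ranges below are only an illustrative sample of what that page lists, and the function names are made up for this example, so treat it as a starting point rather than a complete tool.

```python
import ipaddress

# Illustrative sample of the IPv4 ranges published at
# https://www.cloudflare.com/ips/ . The list changes over time, so load the
# current one instead of trusting this copy.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "173.245.48.0/20",
        "103.21.244.0/22",
        "141.101.64.0/18",
        "108.162.192.0/18",
        "162.158.0.0/15",
        "104.16.0.0/13",
        "172.64.0.0/13",
    )
]


def is_cloudflare_ip(address: str) -> bool:
    """Return True if the address falls inside any of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in CLOUDFLARE_RANGES)


def split_blocklist(addresses):
    """Separate blocklist entries into Cloudflare edge IPs and everything else."""
    cloudflare, other = [], []
    for address in addresses:
        (cloudflare if is_cloudflare_ip(address) else other).append(address)
    return cloudflare, other
```

Anything that lands in the first list is a Cloudflare edge address rather than a bot, so it probably should come out of the .htaccess deny rules.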

I changed the DNS, of course, but I have not done anything on my origin server to make it work with Cloudflare in particular. I see that the article you linked provides a module I can build into my cPanel EasyApache setup to log the original visitor IPs properly. Then I can list the bad IPs coming through Cloudflare on Cloudflare, and the bad IPs coming in directly in my .htaccess file.
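As I understand it, the idea behind restoring visitor IPs is roughly the following. This is only a Python sketch of the logic, not the Apache module itself. It assumes Cloudflare passes the original address in the CF-Connecting-IP request header and that the header is only trusted when the connection really comes from a Cloudflare range. The function name, the lower-cased header dict, and the two sample ranges are my own assumptions for illustration.

```python
import ipaddress

# Hypothetical trusted-proxy list: a small sample of the ranges from
# https://www.cloudflare.com/ips/ ; a real setup would load the full list.
TRUSTED_PROXIES = [
    ipaddress.ip_network(cidr) for cidr in ("162.158.0.0/15", "172.64.0.0/13")
]


def visitor_ip(peer_address: str, headers: dict) -> str:
    """Return the address that should be logged or blocked for a request.

    peer_address is the TCP peer the origin server sees. headers holds the
    request headers with lower-cased names (an assumption of this sketch).
    """
    peer = ipaddress.ip_address(peer_address)
    forwarded = headers.get("cf-connecting-ip")
    # Only believe the header when the connection comes from a trusted proxy;
    # otherwise a bot hitting the origin directly could spoof its address.
    if forwarded and any(peer in network for network in TRUSTED_PROXIES):
        try:
            return str(ipaddress.ip_address(forwarded.strip()))
        except ValueError:
            pass  # malformed header value: fall back to the peer address
    return peer_address
```

With something like this in place, the address that ends up in the log is the bot's own IP when it comes through Cloudflare, and the peer IP when it connects directly.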

It occurred to me I could put something in the header of my web server’s denial response page to bypass the cache, but I think if I do the above correctly I should be set.

Thanks again.
