Edit: See sdayman’s reply before reading this. It’s not Cloudflare’s crawler. Please ignore this post.
I noticed in my server monitoring software that nginx was returning a lot of 404s. Looking into the logs, I saw that the requests were coming from Cloudflare’s IPs (like 18.104.22.168 and 22.214.171.124, for example).
I had some old commented-out script tags on my template page (that’s used to render almost all pages on the website) like this:
<!--
<script src="js/thing1.js"></script>
<script src="js/thing2.js"></script>
<script src="js/thing3.js"></script>
<script src="js/thing4.js"></script>
-->
So the first bug here is that Cloudflare is attempting to crawl files that are only referenced inside an HTML comment and don’t exist. That alone isn’t a huge deal, because it would just be 4 requests.
These files are predictably stored in
But the crawler was making tens of thousands of requests like this:
https://example.com/page-on-my-website/js/thing1.js
https://example.com/another-page-on-my-website/js/thing1.js
https://example.com/different-page-on-my-website/js/thing1.js
...
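One way to see where URLs like these can come from is to resolve the relative src against a page URL with and without a trailing slash. Here's a minimal sketch using Python's standard urllib.parse.urljoin, which follows the usual relative-URL resolution rules. The helper name resolve_script_url is made up for illustration, and this only shows standard resolution, not how any particular crawler actually behaves:

```python
from urllib.parse import urljoin


def resolve_script_url(page_url: str, src: str) -> str:
    """Resolve a script tag's src against the URL of the page containing it."""
    return urljoin(page_url, src)


# Page URL without a trailing slash: the last path segment is replaced.
no_slash = resolve_script_url("https://example.com/page-on-my-website", "js/thing1.js")

# Page URL with a trailing slash: the relative path is appended to the page path.
with_slash = resolve_script_url("https://example.com/page-on-my-website/", "js/thing1.js")

# Root-relative src (leading slash): the page path no longer matters.
rooted = resolve_script_url("https://example.com/page-on-my-website", "/js/thing1.js")

print(no_slash)    # https://example.com/js/thing1.js
print(with_slash)  # https://example.com/page-on-my-website/js/thing1.js
print(rooted)      # https://example.com/js/thing1.js
```

The second result has the same shape as the requests in my logs, which is what you'd expect if the client treated each page URL as though it ended in a slash.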
If I remember correctly, the interpretation of something like src="js/thing4.js" depends on whether there’s a trailing slash on the URL of the page being loaded. My site doesn’t have trailing slashes on page URLs, so I think it should be treated as
I’m not sure though - just quickly reporting this in case there are bugs here. It’s an easy fix for me - I just removed the commented-out script tags and added a leading slash to the src of the remaining ones, like this: