Large number of requests from CDNs

Hi

I work for an e-commerce company that sells IT equipment, specialising in gaming PC components. We are constantly fighting crawlers that are either doing competitor analysis or trying to buy high-resale items, such as the latest-generation video cards, to resell privately (the RTX series is a particular problem). The CF Bot Score machine learning is practically useless for us, but that’s a different story…

Our product images are hosted on a separate sub-domain from our content website (it’s proxied to an S3 bucket). Every day there are around 4,000,000 requests to this sub-domain, and nearly a quarter of them come from the three biggest CDNs. All of these CDN requests have the User-Agent “Mozilla/5.0” and no HTTP referrer:

  • Cloudflare ~ 670k / day
  • Akamai ~ 150k / day (we do not use Akamai)
  • Fastly ~ 50k / day (we do not use Fastly)

I am suspicious of this traffic because there is a substantial spike every day at 6pm local time (0700 UTC). This domain only contains images; there is no HTML or other text content to crawl.

At first I thought that the Cloudflare traffic, at least, might be related to some kind of cache clearing, but that wouldn’t explain the Akamai or Fastly traffic. Also, within a single 24-hour period some images get a very large number of requests, with dozens of URLs each receiving more than 60k requests. We never overwrite or delete images, so they should never be invalidated in the cache. Any help understanding what is going on would be greatly appreciated.
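
For anyone who wants to check the same pattern on their own logs, here is a minimal sketch of how requests could be bucketed by CDN, by UTC hour and by URL. It is an illustration only, not necessarily how the numbers above were produced. The names `CDN_RANGES`, `classify` and `summarise` are hypothetical, the record layout (ISO timestamp, client IP, path) is an assumption, and the IP ranges are RFC 5737 documentation placeholders that would need to be replaced with each provider's published ranges.

```python
from collections import Counter
from datetime import datetime, timezone
from ipaddress import ip_address, ip_network

# PLACEHOLDER ranges (RFC 5737 documentation blocks), not real CDN ranges.
# Replace them with the lists each provider publishes.
CDN_RANGES = {
    "cloudflare": [ip_network("192.0.2.0/24")],
    "akamai": [ip_network("198.51.100.0/24")],
    "fastly": [ip_network("203.0.113.0/24")],
}


def classify(ip: str) -> str:
    """Return the CDN whose range contains the IP, or "other"."""
    addr = ip_address(ip)
    for name, networks in CDN_RANGES.items():
        if any(addr in net for net in networks):
            return name
    return "other"


def summarise(records):
    """Count requests per CDN, per (CDN, UTC hour) and per URL.

    `records` is an iterable of (iso_timestamp, client_ip, url_path)
    tuples; this layout is assumed and depends on your log format.
    Hourly and per-URL counts only include requests from CDN ranges.
    """
    per_cdn = Counter()
    per_hour = Counter()
    per_url = Counter()
    for ts, ip, path in records:
        source = classify(ip)
        per_cdn[source] += 1
        if source != "other":
            # Normalise to UTC so a daily spike lines up across time zones.
            hour = datetime.fromisoformat(ts).astimezone(timezone.utc).hour
            per_hour[(source, hour)] += 1
            per_url[path] += 1
    return per_cdn, per_hour, per_url
```

Calling `per_url.most_common(20)` on the result would list the most requested images, and the `per_hour` keys show whether the spike is limited to one CDN or shared by all three.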

Thanks in advance!