Is this Cloudflare caching behavior expected?

I am using Cloudflare to cache images from S3, because I want to reduce data transfer out from S3.

In S3, I set the Cache-Control metadata on all objects to max-age=2592000 (30 days). In Cloudflare, I set Browser Cache TTL to 1 month.
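
For reference, this is roughly how I applied the metadata (a sketch using boto3; the bucket name and key prefix are placeholders I've made up here):

```python
import boto3

s3 = boto3.client("s3")

BUCKET = "my-image-bucket"   # placeholder bucket name
PREFIX = "images/"           # placeholder key prefix

# S3 object metadata can only be changed by copying the object onto itself
# with MetadataDirective="REPLACE", so walk the keys and re-copy each one.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        head = s3.head_object(Bucket=BUCKET, Key=key)
        s3.copy_object(
            Bucket=BUCKET,
            Key=key,
            CopySource={"Bucket": BUCKET, "Key": key},
            CacheControl="max-age=2592000",          # 30 days
            ContentType=head.get("ContentType", "binary/octet-stream"),
            Metadata=head.get("Metadata", {}),       # keep existing user metadata
            MetadataDirective="REPLACE",
        )
```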

I then did some testing like this:

  • Load image A from computer (home WiFi)
  • Load image A from computer (home WiFi) again
  • Load image A from phone (data plan, NOT home WiFi)
  • Load image B from computer (home WiFi)
  • Load image B from computer (home WiFi) again
  • Load image B from phone (data plan, NOT home WiFi)
  • Load image C from computer (home WiFi)
  • Load image C from computer (home WiFi) again
  • Load image C from phone (data plan, NOT home WiFi)

I assumed that requests from my phone's data plan and my home WiFi would originate from geographically close locations, so I hoped/expected that Cloudflare would share the cached image files between them. I thought this testing would result in 3 GET requests to S3 (the first load of each image) and 6 cache hits.

Instead there were 6 GET requests to S3 and only 3 cache hits. Only the repeat requests for the same image from the same network (home WiFi) were served from cache. Each image was retrieved from S3 twice: once for my home WiFi and once for my phone's data plan.
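
Besides watching the S3 request counts, I also checked the cf-cache-status response header on repeat fetches (MISS on the first request to a given cache, HIT once it holds the object). A rough sketch of that check, with a placeholder URL:

```python
import requests  # third-party; pip install requests

# Placeholder URL for one of the test images.
url = "https://example.com/images/image-a.jpg"

for attempt in range(2):
    resp = requests.get(url)
    # cf-cache-status and age are set by Cloudflare on proxied responses.
    print(
        attempt + 1,
        resp.status_code,
        resp.headers.get("cf-cache-status"),
        resp.headers.get("age"),
    )
```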

I think at least one of the following must be true:

  • My assumption that Cloudflare shares cached objects between requests from different networks in nearby locations is incorrect.
  • My phone data plan network and home WiFi network are not actually in close locations.
  • I have not properly configured S3 cache metadata and/or Cloudflare caching.

Can someone clarify this, or at least point me to steps I can take to diagnose this further? If I can’t get Cloudflare to cache images for requests originating from many different networks in nearby locations, then it won’t work for what I want.

Or option 4) There is more than one caching server in each data center.

Thanks, that explains it. Is there a way to figure out how many caching servers and data centers there are in total? I am trying to understand how many times a given file might need to be transferred out of S3 during the max-age period with my setup, because this will help me project costs.
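
For context, this is the kind of back-of-envelope estimate I had in mind. It is only a sketch: the data-center count, caches per data center, object size, and S3 egress price below are all numbers I made up for illustration, and it assumes each cache fetches the object exactly once per max-age window (evictions would push it higher):

```python
# Rough estimate of S3 egress per object per max-age window.
# All of these numbers are illustrative assumptions, not real figures.
data_centers = 300       # assumed number of Cloudflare data centers serving my users
caches_per_dc = 4        # assumed cache servers per data center
object_size_gb = 0.002   # assumed 2 MB image
s3_egress_per_gb = 0.09  # assumed S3 data-transfer-out price in USD/GB

# Each cache misses once per window and fetches the object from S3.
fetches_per_window = data_centers * caches_per_dc
egress_gb = fetches_per_window * object_size_gb

print(f"fetches per object per window: {fetches_per_window}")
print(f"egress per object per window:  {egress_gb:.2f} GB "
      f"(~${egress_gb * s3_egress_per_gb:.2f})")
```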

There’s no way to forecast that accurately. There are multiple caches in multiple data centers, with requests arriving from all over the world, and a file will most likely be evicted from the cache before it expires.

With no way to forecast that, I think a botnet could potentially flood my service with requests and result in an uncapped amount of data transfer out from S3 and thus uncapped costs to me. If this is correct, is there some way I can mitigate that within Cloudflare?

You can try Rate Limiting.
