Hello! I have one small file, under 1 MB in size, that needs to be hosted online and updated frequently. Specifics:
- The file is a GTFS-Realtime binary .pb file that we would update from our end at 20 to 60 second intervals, 24/7/365.
- It would be downloaded by anywhere from 10 to 1,000 concurrent users at the same frequency. The consumers are programs written by app developers and the like, and they also need to fetch the latest data every 20 to 60 seconds. If we served the consumers directly from our own website, that would add up to roughly 1.3 to 130 TB of data transfer per month, assuming a 1 MB file fetched every 20 seconds (see the quick calculation after this list). We cannot afford that.
- The source website where we publish this file will not be able to handle the load from the consumers. So we need a mirroring service that can copy our file and make it available on their end every 20 to 60 seconds. Some latency is OK, but not too much: the timestamp in the feed cannot be older than 65 seconds or the feed is deemed invalid.
- The catch is that this is realtime data. We do not want older versions of the file to be served from cache. We don't have huge amounts of data to distribute, just one small file, albeit one that updates frequently. The mirroring service needs to copy our latest file and serve it from a CDN.
- For our source website, only Cloudflare should be copying/downloading the realtime file, no one else; we shouldn't unexpectedly run out of available bandwidth on the second day of the month.
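In case my arithmetic helps, here is the back-of-envelope calculation behind the 1.3 to 130 TB figure (my assumptions: 1 MB per download, one fetch every 20 seconds per consumer, a 30-day month):

```python
# Back-of-envelope transfer estimate. Assumptions: 1 MB per fetch,
# one fetch every 20 s per consumer, 30-day month.
FILE_SIZE_MB = 1
FETCH_INTERVAL_S = 20
SECONDS_PER_MONTH = 30 * 24 * 3600

fetches_per_consumer = SECONDS_PER_MONTH / FETCH_INTERVAL_S   # ~129,600 fetches
per_consumer_gb = fetches_per_consumer * FILE_SIZE_MB / 1000  # ~130 GB/month each

for consumers in (10, 1000):
    total_tb = per_consumer_gb * consumers / 1000
    print(f"{consumers} consumers -> ~{total_tb:.1f} TB/month")  # ~1.3 and ~130 TB
```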
Is Cloudflare good for this? I was looking at AWS over the last week, and someone advised me that Cloudflare is better for this use case. But I'm concerned about caching: the docs I'm seeing about turning caching off suggest that consumers would then download the file straight from our original site, which would kill us on bandwidth.
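To make my intent concrete, here is a rough sketch of how I imagine our origin serving the file, with headers asking the CDN to keep a copy for about 20 seconds while consumers always fetch from the edge. My assumption (please correct me if it's wrong) is that Cloudflare would honor `s-maxage` for the edge TTL and pull from us roughly once per TTL rather than once per consumer, probably with a cache/page rule to make a .pb file cacheable in the first place; the file name and port here are just placeholders:

```python
# Sketch of our origin serving the GTFS-RT feed with a short shared-cache lifetime.
# Assumption: the CDN caches the response for up to 20 s (s-maxage) and refetches
# from us once per TTL, so consumers never hit our origin directly.
from http.server import BaseHTTPRequestHandler, HTTPServer

FEED_PATH = "gtfs-rt.pb"  # hypothetical local file name on our end

class FeedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        with open(FEED_PATH, "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        # Edge may serve cached copies for up to 20 s; consumers themselves don't cache.
        self.send_header("Cache-Control", "public, max-age=0, s-maxage=20")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), FeedHandler).serve_forever()
```

If that mental model of how the edge cache refreshes is wrong, I'd really appreciate a pointer to the right way to set this up.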
Thank you for taking the time to read this!
PS: Kindly advise if I should change the category of this post. I'm new here, so I'm filing it under "Getting Started".