Type
New feature
Description
Provide fallback to Worker for R2 Public Bucket requests
Benefit
Public Buckets, and those accessed via pre-signed URLs, allow for nice, simple architectures for serving static or infrequently updated data, where Objects can be created or updated programmatically in response to changes in the source data. However, things become significantly more complicated when the requested data may not yet be available as an Object in the Bucket. The application consuming the data then has to handle the 404 responses and make a separate request to a different endpoint (probably a Worker) to trigger some action, such as generating the missing Object.
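To illustrate the extra work this pushes onto the consumer, here is a minimal sketch of the client-side handling described above. The function name, the `?key=` query parameter, and both URLs are assumptions made up for illustration, not an existing API.

```typescript
// Hypothetical fetch-like signature so the sketch works with any HTTP client.
type Fetcher = (url: string) => Promise<Response>;

// Try the public Bucket first; on a 404, make a second request to a separate
// endpoint (assumed here to be a Worker that generates the missing Object).
async function fetchWithManualFallback(
  key: string,
  bucketBaseUrl: string, // assumed public Bucket URL, e.g. a custom domain
  generatorUrl: string, // assumed Worker endpoint that builds the Object
  fetcher: Fetcher = (url) => fetch(url),
): Promise<Response> {
  const primary = await fetcher(`${bucketBaseUrl}/${key}`);
  if (primary.status !== 404) {
    // Object exists (or some other error occurred): hand the response back as-is.
    return primary;
  }
  // Object is missing: the client has to know about, and call, a second endpoint.
  return fetcher(`${generatorUrl}?key=${encodeURIComponent(key)}`);
}
```

Every consumer of the Bucket needs this logic, and every miss costs two round trips, which is what a Bucket-level fallback would avoid.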
For example, we might be migrating from a legacy system but only want to migrate specific data when it is first requested. The ability to trigger a Worker script which can pull this data, return it, and create the R2 Object for future requests would be incredibly useful.
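A rough sketch of what such a fallback Worker handler might look like is below. It assumes the fallback receives the original request URL and that the Object key is the URL path; `ObjectStore` is a minimal stand-in for the R2 binding's `put` method, and `fetchLegacy` is a hypothetical helper for reading from the legacy system.

```typescript
// Minimal stand-in for the part of the R2 Bucket binding this sketch uses.
interface ObjectStore {
  put(key: string, value: ArrayBuffer): Promise<unknown>;
}

// Hypothetical helper: fetches the data for `key` from the legacy system.
type LegacyFetcher = (key: string) => Promise<Response>;

// Invoked only when the Bucket has no Object for the requested key.
async function handleFallback(
  request: Request,
  bucket: ObjectStore,
  fetchLegacy: LegacyFetcher,
): Promise<Response> {
  // Assumption: the Object key is the request path without the leading slash.
  const key = new URL(request.url).pathname.replace(/^\/+/, "");

  const legacy = await fetchLegacy(key);
  if (!legacy.ok) {
    // The legacy system does not have the data either: a genuine 404.
    return new Response("Not Found", { status: 404 });
  }

  const body = await legacy.arrayBuffer();
  // Store the Object so future requests are served directly from the Bucket.
  await bucket.put(key, body);

  const contentType = legacy.headers.get("content-type") ?? "application/octet-stream";
  return new Response(body, { status: 200, headers: { "content-type": contentType } });
}
```

With this in place the Worker would run once per missing Object, rather than once per request.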
The commonly proposed approach of routing all requests through a Worker, which then pulls the data from the R2 Bucket, greatly increases operational costs: we pay not only for the R2 storage and GetObject operation, but also for a Worker invocation and CPU time on every request. For systems with a high volume of requests this could be cost-prohibitive.
A fallback feature like this is available on the Object Storage offerings from other Public Cloud providers.
The Static Assets feature of Workers has something similar to this capability, in that requests with no matching Static Asset are routed to the Worker. However, it appears to be implemented in a very different manner, using an Asset Manifest to identify the assets to be served rather than a simple fallback approach. Once deployed, these static assets can’t be updated programmatically from a Worker through the ASSETS binding, and that binding isn’t available to other Workers anyway. Whilst it would be possible to use Direct Uploads to add or update assets, this is rather long-winded, and dynamically created Objects uploaded in this manner wouldn’t survive a redeployment of the Worker unless we also added them to source control, or another way to persist them were introduced. This is messy, completely impractical for large numbers of Objects, and I suspect would perform very poorly in such cases.
There’s no way to implement this using the standard Page Rules. Whilst I appreciate that Custom Error Rules would allow this, and I’m certainly not averse to using them, they’re only available on the general (not Workers) paid plans. This really does feel like a very basic feature which belongs within R2, at the Bucket level, rather than something which must be configured at the domain/site level.
From a Cloudflare user perspective, I see this feature being implemented as a simple URL, defined on the Bucket, to which the request would be forwarded instead of simply returning a 404; or, failing that, a URL returned in the Location header of a 302 response.
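To make the two options concrete, here is a minimal sketch of how a Bucket might decide what to do on a miss. Everything in it is hypothetical: R2 has no such setting today, and the `BucketFallbackConfig` shape, the `mode` field, and passing the missing key as a `key` query parameter are all my own assumptions.

```typescript
// Hypothetical per-Bucket setting; nothing like this exists in R2 today.
interface BucketFallbackConfig {
  fallbackUrl: string; // the URL defined on the Bucket
  mode: "forward" | "redirect"; // forward internally, or answer with a 302
}

// What the Bucket would do instead of returning a plain 404.
type MissAction =
  | { kind: "forward"; target: string }
  | { kind: "redirect"; status: 302; location: string };

function resolveMiss(config: BucketFallbackConfig, requestUrl: string): MissAction {
  const original = new URL(requestUrl);
  const target = new URL(config.fallbackUrl);
  // Assumption: tell the fallback which Object was requested via a query parameter.
  target.searchParams.set("key", original.pathname.replace(/^\/+/, ""));

  if (config.mode === "forward") {
    // Preferred option: the request is forwarded and the client sees one response.
    return { kind: "forward", target: target.toString() };
  }
  // Fallback option: the client is told where to go via the Location header.
  return { kind: "redirect", status: 302, location: target.toString() };
}

// Example: a miss on /reports/2024.json with redirect mode.
const action = resolveMiss(
  { fallbackUrl: "https://worker.example/generate", mode: "redirect" },
  "https://bucket.example/reports/2024.json",
);
console.log(action);
```

Forwarding keeps the fallback invisible to the client, while the 302 variant needs no proxying on the R2 side but costs the client an extra round trip.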