I am currently considering Workers instead of a classic server, but I'm not sure whether the scenario I have would be a good fit.
I'd appreciate your feedback on whether it makes sense to use a Worker for this kind of work.
I have a lot of JSON files stored on S3 which I'd like to expose via an API, but the JSON in those files may optionally need to be filtered before being returned to the client. The flow would be:
JWT validation -> fetch file from origin or edge cache -> stream JSON array & filter it by filters defined in request -> return filtered JSON array
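Roughly, I imagine the Worker looking something like the sketch below. This is only a sketch assuming the service-worker style Workers API; verifyJWT, buildPredicate, filterNDJSON and ORIGIN_URL are hypothetical placeholders for the pieces I describe further down, not real APIs, and the FetchEvent typing assumes @cloudflare/workers-types.

```ts
// Hypothetical helpers, sketched further down in this post.
declare function verifyJWT(token: string): Promise<boolean>;
declare function buildPredicate(params: URLSearchParams): (obj: unknown) => boolean;
declare function filterNDJSON(
  body: ReadableStream<Uint8Array>,
  keep: (obj: unknown) => boolean
): ReadableStream<Uint8Array>;
declare const ORIGIN_URL: string; // placeholder for my S3 origin

addEventListener("fetch", (event: FetchEvent) => {
  event.respondWith(handle(event.request));
});

async function handle(request: Request): Promise<Response> {
  // 1. JWT validation (token taken from the Authorization header).
  const token = request.headers.get("Authorization")?.replace("Bearer ", "") ?? "";
  if (!(await verifyJWT(token))) {
    return new Response("Unauthorized", { status: 401 });
  }

  // 2. Fetch the NDJSON file from the origin (S3); this fetch could be
  //    served from Cloudflare's edge cache when the file is already there.
  const url = new URL(request.url);
  const origin = await fetch(`${ORIGIN_URL}${url.pathname}`);
  if (!origin.ok || !origin.body) {
    return new Response("Not found", { status: 404 });
  }

  // 3. Stream the body through a filter built from the request's query params.
  const keep = buildPredicate(url.searchParams);
  const filtered = filterNDJSON(origin.body, keep);

  // 4. Return the filtered stream without buffering the whole file in memory.
  return new Response(filtered, {
    headers: { "Content-Type": "application/x-ndjson" },
  });
}
```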
Things I'd like to handle in the Worker:
- Authentication - that would be done via JWT, and I've seen examples of that already, so I assume it won't be a problem.
- Filtering JSON arrays (I could switch to newline-delimited JSON for ease of stream filtering and use something like https://canjs.com/doc/can-ndjson-stream.html; a sketch of the kind of streamed filter I have in mind is after this list).
- Does the Worker support automatic decompression of origin files when they are compressed with brotli, or is only gzip handled this way? I prefer brotli and it's what I use currently, but I could potentially give it up and use gzip.
- Is decompression counted towards CPU time? My JSON files can be relatively large (25-35 MB decompressed) but highly compressible (down to roughly 1/20th of that with brotli), and I wonder how CPU time would be calculated in that situation. I guess another bottleneck would be calling JSON.parse on each JSON line, but perhaps I could figure out how to do that differently.
- If the response I return from the Worker has 'Content-Encoding': 'br', will Cloudflare use brotli compression before returning it to the client?
- Do you plan to expose the Cache API so I could store the filtered response in the cache and avoid redoing the filtering in the future? I'd assume it could work like this: first validate the JWT, then check the cache; if a response for the cache key is already there, return it, otherwise run the streamed filtering (I've sketched that pattern at the end of this post as well).
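For the filtering bullet above, this is the kind of streamed NDJSON filter I have in mind. It's only a sketch, assumes one JSON object per line, and assumes the standard TransformStream transformer callbacks are available in Workers; a ready-made package like can-ndjson-stream could replace it. It is the hypothetical filterNDJSON used in the earlier sketch.

```ts
// Streamed NDJSON filter: decode chunks, split on newlines, keep only the
// lines whose parsed object passes the predicate, re-encode and pass them on.
function filterNDJSON(
  body: ReadableStream<Uint8Array>,
  keep: (obj: unknown) => boolean
): ReadableStream<Uint8Array> {
  const decoder = new TextDecoder();
  const encoder = new TextEncoder();
  let buffered = "";

  const transform = new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      buffered += decoder.decode(chunk, { stream: true });
      const lines = buffered.split("\n");
      buffered = lines.pop() ?? ""; // keep a possibly partial last line
      for (const line of lines) {
        if (line.trim() === "") continue;
        if (keep(JSON.parse(line))) {
          controller.enqueue(encoder.encode(line + "\n"));
        }
      }
    },
    flush(controller) {
      // Emit the final line if the file doesn't end with a newline.
      const line = buffered.trim();
      if (line !== "" && keep(JSON.parse(line))) {
        controller.enqueue(encoder.encode(line + "\n"));
      }
    },
  });

  return body.pipeThrough(transform);
}
```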
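And if the Cache API does get exposed with the standard Service Worker shape (caches.default, cache.match, cache.put), I'd imagine the caching step looking roughly like this. Again only a sketch: verifyJWT is the same hypothetical helper as above, and filterAndBuildResponse is a hypothetical stand-in for the fetch + filter logic.

```ts
declare function verifyJWT(token: string): Promise<boolean>;
declare function filterAndBuildResponse(request: Request): Promise<Response>;

async function handleWithCache(event: FetchEvent): Promise<Response> {
  const request = event.request;

  // Validate the JWT first so cached results are never served to
  // unauthenticated callers.
  const token = request.headers.get("Authorization")?.replace("Bearer ", "") ?? "";
  if (!(await verifyJWT(token))) {
    return new Response("Unauthorized", { status: 401 });
  }

  // The request URL (path + filter query params) doubles as the cache key.
  const cache = caches.default;
  const cached = await cache.match(request);
  if (cached) {
    return cached;
  }

  // Cache miss: run the streamed filtering, store a copy, and return it.
  // waitUntil keeps the Worker alive until the cache write completes.
  const response = await filterAndBuildResponse(request);
  event.waitUntil(cache.put(request, response.clone()));
  return response;
}
```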