Concurrent requests & unique data?

I’m building a worker that will take an incoming request, get some third-party data, and mutate that data before returning it to the user.

So far my code works, but I think I’ve made an architectural mistake, and I’d like to confirm whether I’m right that it’s a big one.

My question is this:

Can a singleton contained in a closure in one instance persist during the execution of more than one request handled on that instance?

Background

The third-party requests are built by combining the user’s cookies with the specifics of the incoming request. We then fetch DOMs from the third parties, stitch them together, and mutate the result. To minimize the number of logic actions, request accesses, and DOM accesses, and to avoid duplicating potentially large DOMs, I’m storing the request and the responses in a singleton.
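
For reference, the shape of what I have now is roughly this (a simplified sketch using the service-worker addEventListener syntax; the names are placeholders, not my real code):

```ts
// Simplified sketch of my current approach: a module-scope singleton
// that holds per-request data (this is the part I now suspect is wrong).
interface RequestContext {
  cookies: string;
  thirdPartyDoms: string[];
}

// Singleton shared by every request this Worker instance handles.
let context: RequestContext | null = null;

addEventListener('fetch', (event: FetchEvent) => {
  event.respondWith(handle(event.request));
});

async function handle(request: Request): Promise<Response> {
  // Build up the singleton from this request's cookies.
  context = {
    cookies: request.headers.get('Cookie') ?? '',
    thirdPartyDoms: [],
  };

  // Fetch third-party documents and stash them in the singleton.
  const upstream = await fetch('https://third-party.example.com/page', {
    headers: { Cookie: context.cookies },
  });
  context.thirdPartyDoms.push(await upstream.text());

  // Stitch and mutate the DOMs, then respond (details omitted).
  return new Response(context.thirdPartyDoms.join('\n'));
}
```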

The Mistake

Looking at the “How Workers works” page in the Cloudflare Workers docs, I see that:

“Like all other JavaScript platforms, a single Workers instance may handle multiple requests including concurrent requests in a single-threaded event loop. There’s no guarantee whatsoever whether any two requests will land in the same instance; therefore it is inadvisable to set or mutate global state within the event handler.”

This sounds to me like we can’t guarantee that two requests will end up on the same instance in the same event loop; but we also can’t guarantee they won’t.

So, to restate my question:

Can a singleton contained in a closure in one instance persist during the execution of more than one request handled on that instance?

Sounds like the answer is “possibly,” and I’d like to validate that, because if so, I need to change my architecture!

Yes, definitely. A single worker instance can handle an arbitrary number of concurrent requests.

Without seeing example code, I couldn’t say precisely whether or not your current code is unsafe, but it does sound like it. Generally the only things that make sense to store in the global scope are immutable configuration data that is generic to all requests, or logging/telemetry data that is being batched before being sent out to some logging server.
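
For example, something along these lines is fine at module scope, while per-request data stays inside the handler (a rough sketch, not a prescription):

```ts
// Immutable, request-agnostic configuration: safe at module scope.
const CONFIG = Object.freeze({
  upstream: 'https://third-party.example.com',
});

// Telemetry batched before being flushed to a logging server: also reasonable.
const logBuffer: string[] = [];

addEventListener('fetch', (event: FetchEvent) => {
  event.respondWith(handle(event.request));
});

async function handle(request: Request): Promise<Response> {
  // Per-request state lives in the handler's own scope, so concurrent
  // requests on this instance can never see each other's data.
  const cookies = request.headers.get('Cookie') ?? '';
  const upstream = await fetch(CONFIG.upstream, { headers: { Cookie: cookies } });
  const dom = await upstream.text();

  logBuffer.push(`handled ${request.url}`); // flushed in batches elsewhere
  return new Response(dom);
}
```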

Even if you were to distinguish between different requests’ data, e.g. with a global map, the worker’s memory usage would become a function of the number of requests it has seen, ultimately leading to eviction, which would increase the likelihood of a cold start later.

If you need to cache content specific to a request / session / user, consider storing it in Workers KV or the Cache API (depending on your durability needs) under a key unique to that request / session / user.
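
Roughly like this, for instance (a sketch only; `MY_KV` is a placeholder for whatever KV namespace binding you configure, and the key derivation is up to you):

```ts
// Assumes a KV namespace bound to this Worker as MY_KV
// (types from @cloudflare/workers-types).
declare const MY_KV: KVNamespace;

async function handle(request: Request): Promise<Response> {
  // Derive a key unique to this session/user; the Cookie header is just
  // one possible source (see the discussion below).
  const sessionId = request.headers.get('Cookie') ?? crypto.randomUUID();
  const cacheKey = `dom:${sessionId}`;

  // Reuse previously built content if we already have it.
  const cached = await MY_KV.get(cacheKey);
  if (cached !== null) {
    return new Response(cached);
  }

  // Otherwise build it and cache it with a TTL.
  const upstream = await fetch('https://third-party.example.com/page');
  const dom = await upstream.text();
  await MY_KV.put(cacheKey, dom, { expirationTtl: 300 });
  return new Response(dom);
}
```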


That’s what I was thinking, @harris, thank you!

It looks like the event delivered to the fetch event listener doesn’t inherently provide anything that would serve as a unique key for a request, nor is there a unique key type within the Workers KV API. Is there a best practice for developing a unique key other than “generate a UUID using one of the many ways available”?

@warmstrong, if you’re able to read a unique session cookie from the request, that would probably do it. It’s a bit outside of my realm of expertise though, so I’m hesitant to endorse a particular strategy. Hopefully others might have some ideas.
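
Something like this, perhaps (a sketch; `session_id` is a placeholder for whatever cookie your application actually sets):

```ts
// Pull a session identifier out of the incoming request's Cookie header.
function getSessionId(request: Request): string | null {
  const cookieHeader = request.headers.get('Cookie');
  if (!cookieHeader) return null;

  for (const pair of cookieHeader.split(';')) {
    const [name, ...rest] = pair.trim().split('=');
    if (name === 'session_id') return rest.join('=');
  }
  return null;
}

// Usage: key KV entries by session, falling back to a fresh UUID.
// const key = `dom:${getSessionId(request) ?? crypto.randomUUID()}`;
```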


Thanks, @harris, just making sure I hadn’t missed something in the docs!

I came here with a similar question. The problem I see with storing things in Workers KV is that I expect multiple simultaneous requests at the same datacenter, which means request 2 can come in before request 1 has had a chance to store anything under a key. Simultaneous requests therefore have no idea that request 1 is already doing the work.

It is a classic concurrency problem that calls for a semaphore, but there are no concurrency protections such as semaphores in Workers, are there? Is there any solution to this?
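
To make the race concrete, this is roughly the check-then-set pattern I’m worried about (`MY_KV`, the key, and `doTheExpensiveWork` are placeholders):

```ts
declare const MY_KV: KVNamespace;
declare function doTheExpensiveWork(): Promise<void>;

async function handle(request: Request): Promise<Response> {
  // Requests 1 and 2 can both read null here before either has written...
  const marker = await MY_KV.get('work:started');
  if (marker === null) {
    // ...so both write the marker and both end up doing the work.
    await MY_KV.put('work:started', '1');
    await doTheExpensiveWork();
  }
  return new Response('ok');
}
```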