Batching in Workers not working so well

In my worker, I’m doing what e.g. Scott Helme described in his blog post: when processing an incoming request, add something to a batch and use event.waitUntil() and a sleep function to process the batch every 10 seconds or so. My use case is to track API usage for heavy users who send many requests/sec to the API from a single location. While waiting for Durable Objects to become generally available, I’m implementing the batching in Workers (write batch contents to KV in unique object every 10s).

This works just fine when testing with wrangler dev and wrangler dev --env=staging (sending 10 requests to the worker in a short time => batch has 10 items => one write to KV).

However, the batching hardly ever works on my staging env (workers.dev) or in production: when sending e.g. 10 requests within a few seconds, 9 or even all 10 are written to KV as separate objects. Debugging shows the batch variable (declared outside the handleRequest function, of course) is nearly always empty when a request comes in, while it should be empty only once per 10s window, for the first request; subsequent requests should simply be added to the existing non-empty batch.

I know multiple instances of a Worker can run in a single datacenter, but the behaviour I’m observing is surprising: it seems many instances are spun up, even though all requests are for the same URL path and come from the same client IP (not sure that is relevant).

My questions:

  1. is it common for Cloudflare to spin up many instances of a Worker in a single dc?
  2. do worker instances have an id or something that can be accessed from inside the worker while executing?
  3. besides ‘many worker instances’, is there another potential cause of the batching working poorly?

Thanks!

Aaron

It took us a bit of fiddling to get batching right but you can take a look at what we’re doing here: cloudflare-app/worker.js at master · Logflare/cloudflare-app · GitHub


Thanks!

On first look, what you’re doing in the scheduleBatch and postBatch functions, and using a global-scope variable (logEventsBatch), is very similar to what I’m doing.
Will look into this more tomorrow.

Off-topic:
I believe I read that Date.now() does not advance during worker execution, but I guess that is not true given you’re using it for originTimeMs. Does that work OK?

Yep, it works (notice the p99 of origin_time is varying here):
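For anyone following along, the timing pattern being discussed looks roughly like this. This is a sketch (the `timedFetch` helper name is made up, not the actual Logflare code): in the Workers runtime, Date.now() is frozen during pure computation but does advance across awaited I/O such as a fetch to the origin, which is why it can be used to measure origin time.

```javascript
// Sketch: timing an origin fetch inside a Worker. Date.now() advances
// across awaited I/O (the fetch), so the delta reflects origin time.
async function timedFetch(request) {
  const start = Date.now();
  const response = await fetch(request); // clock advances across this await
  const originTimeMs = Date.now() - start;
  return { response, originTimeMs };
}
```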

  1. Yes, depending on load, Workers can spawn on several machines within a colo
  2. Not that I am aware of, but the code below will do
// Generate a random id of the given length (the charset omits "O" to
// avoid confusion with "0")
const makeid = (length) => {
  let text = "";
  const possible = "ABCDEFGHIJKLMNPQRSTUVWXYZ0123456789";
  for (let i = 0; i < length; i++)
    text += possible.charAt(Math.floor(Math.random() * possible.length));
  return text;
};

let workerInception, workerId;

const handleBatch = async (event) => {
  // Lazily initialise per-instance state on the first request this
  // instance handles
  if (!workerInception) workerInception = Date.now();
  if (!workerId) workerId = makeid(6);
...
}