Worker exceeded CPU

This is quite worrying. It seems to me that critical projects using Workers need to be load-tested on each deployment in case they run into resource issues…

It’s easy to reason about parsing JSON and XML, but now we have to take into account the CPU time of the garbage collector too.

The worker we have is running regexes-replaces on documents with a maximum size of about 2 MB. It typically completes in a few ms, but as it’s not streaming I guess it could create a few copies of the entire document in memory before finishing. We’re expecting a pretty high load on this service when it’s in production so it’s important to know that it will scale and if we’re close to hitting any boundaries.
Is there anything we can do to get an idea of how much memory we’re using? Could we, for instance, get some data through the worker preview?

Or maybe a way to clear memory more often?

Is it possible to gc manually? Would it be better to dereference variables or wouldn’t it make much difference?

Did this ever get resolved? It seems like the memory limit is a deal breaker with any level of traffic, no?

So, if you were to use 1 MB of memory for each request, you could only handle 128 simultaneous requests? That doesn’t seem very practical.

I’m working with CF on my issue, hopefully they will provide more insight as I’m seeing the CPU issue with very minimal traffic (<100 requests/min).

Anybody been able to get more clarity on this?

I got it confirmed that the CPU resources are per request, but I think the memory is shared.

After we started streaming the requests we didn’t have any issues as it’s only keeping one chunk in memory. The CPU is however a limiting resource so if you for instance parse large json documents on the edge you might bump in to the limits.
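The streaming approach described above could be sketched roughly like this (names and the regex are illustrative, not the actual worker). Each decoded chunk is rewritten and re-encoded as it passes through, so only one chunk is held in memory at a time. One caveat: a naive per-chunk regex can miss matches that span a chunk boundary.

```javascript
// Sketch of a streaming regex-replace over a response body.
// Only one chunk lives in memory at any moment, instead of
// buffering the whole (up to ~2 MB) document.
function streamReplace(body, pattern, replacement) {
  const rewriter = new TransformStream({
    transform(chunk, controller) {
      // chunk is a decoded string here; replace and pass it along
      controller.enqueue(chunk.replace(pattern, replacement));
    },
  });
  return body
    .pipeThrough(new TextDecoderStream()) // bytes -> strings
    .pipeThrough(rewriter)                // per-chunk replace
    .pipeThrough(new TextEncoderStream()); // strings -> bytes
}

// In a Worker this would be used roughly like:
//   const upstream = await fetch(request);
//   return new Response(streamReplace(upstream.body, /foo/g, 'bar'), upstream);
```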

What kind of processing are you doing in the worker?

By resources you mean both CPU and memory?

What were you streaming? HTML? JSON? Do you happen to have some pseudo code showing how you transformed it to streaming, before and after?

Thanks @markus. We store a small JSON object in the KV store, decrypt it to get API keys, then return a fetch request to a 3rd-party API. My understanding is that calling fetch() should already be doing a streaming request, so we wouldn’t need to implement that ourselves, since we don’t bring the result into memory or make any changes to it; we just return it.

So, it’s possible our decryption is occasionally taking too long, but I don’t believe that’s the case, since it works fine 99% of the time. We only hit our CPU limit when we get a spike in traffic, which leads me to believe it has to do with the memory limit, not the CPU limit.

So, if we are calling fetch against a 3rd-party API, will whatever gets returned from that count towards our memory limit? Let’s say each request returns 1 MB of data; does that mean we can only have ~128 simultaneous users? I’m mostly trying to understand what counts as memory, how it factors in, and how long it takes to reset. It seems like we continue to hit the limit for quite a while until the traffic goes down.

Hi @dev40,

If you are seeing hard errors, it’s most likely from the CPU limit, not the memory limit. We have made adjustments so that when a worker hits the memory limit, it is still allowed to finish in-flight requests; only if it goes over the memory limit by 2x do we actually cancel requests with errors. So errors would only happen if your memory spikes suddenly.

Note also that the 128MB limit applies per worker instance, not globally. Each instance of your worker around the world is allowed to allocate up to 128MB. We not only run separate instances in each datacenter, but we often run many separate instances across different machines in a single datacenter. So if each request allocates 1MB (which, BTW, is a lot!), then you’d be able to support 128 simultaneous requests per instance, which probably translates into 10k-100k simultaneous requests worldwide.

And yes, if you are passing through a Response object without reading its body explicitly, then it will automatically stream; the body won’t count against your memory limit at all.
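A minimal sketch of the difference, assuming standard fetch-spec `Response` behavior (the upstream values here are illustrative): re-wrapping the body stream hands it off without reading it, while calling `.text()` pulls the whole body into the Worker’s memory.

```javascript
// Streaming: re-wrap the body stream without reading it.
// Nothing is buffered in the Worker, so it doesn't count
// against the memory limit.
function passThrough(upstream) {
  return new Response(upstream.body, upstream);
}

// Buffering: the entire body is read into memory first.
// Avoid this for large bodies.
async function buffered(upstream) {
  const text = await upstream.text();
  return new Response(text, upstream);
}
```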

How is your “decryption” implemented? Is it symmetric or asymmetric encryption? Is it implemented in pure JS or do you use WebCrypto? I’d strongly recommend using WebCrypto for all encryption/decryption as it will be much faster and avoid timing side channels. If you’re using asymmetric encryption, note that decrypting even a small value can be quite slow; symmetric is much faster.

Thanks @KentonVarda - that will be a good change on memory usage.

So if it really is the CPU limit, then the only thing that could be causing it, I would think, is the decryption. I guess I don’t understand why CPU usage would be variable; I am always decrypting the same value, so it should take about the same time every time. That still seems confusing to me.

Edit: Kenton answered this here: Cloudflare Workers - Still exceeding cpu

I’m using a simple wrapper library (npm cryptr) which just wraps around the node crypto library and uses the following:

    const stringValue = String(value);
    const iv = Buffer.from(stringValue.slice(0, 32), 'hex');
    const encrypted = stringValue.slice(32);
    let legacyValue = false;
    let decipher;

    try {
        decipher = crypto.createDecipheriv(algorithm, key, iv);
    } catch (exception) {
        if (exception.message === 'Invalid IV length') {
            legacyValue = true;
        } else {
            throw exception;
        }
    }

    if (!legacyValue) {
        return decipher.update(encrypted, 'hex', 'utf8') + decipher.final('utf8');
    }

    // Fall back to the legacy format with a 16-character IV
    const legacyIv = stringValue.slice(0, 16);
    const legacyEncrypted = stringValue.slice(16);
    decipher = crypto.createDecipheriv(algorithm, key, legacyIv);
    return decipher.update(legacyEncrypted, 'hex', 'utf8') + decipher.final('utf8');

I couldn’t tell you if that is asymmetric or not. :)

But I will look at converting mine to WebCrypto

Hi @dev40,

TBH, I don’t know how cryptr could work in Workers. Workers is not Node-based and does not provide the Node crypto library; it provides WebCrypto instead. It looks like cryptr requires node crypto. Are you using some sort of polyfill for that?

FWIW, it looks like cryptr uses AES-CTR which is a symmetric algorithm. With WebCrypto I would recommend using “AES-GCM”. These are not exactly the same, so your data would need to be reencrypted, unfortunately. However, AES-CTR does not provide authentication (i.e., an attacker who can modify your data could tamper with it without you knowing, even though it is encrypted); AES-GCM does.

Hmm, interesting. Good catch. It does appear that crypto-browserify is being packaged into the final output, and it may be coming from the serverless-Cloudflare-workers lib?

Regardless, I will switch to WebCrypto and appreciate the advice on algorithms. I’ll do that.

Hi @KentonVarda - I’ve made the changes to WebCrypto and I’m still hitting CPU exceeded. I’ve been hitting my head against this for hours and could really use some help; I’m at a complete loss. I’ve made a video describing what I’m doing and seeing, and I can also share code with you if you want to take a closer look. Here is the video:

This is a showstopper for me using Workers and is holding up our product launch, so I would greatly appreciate your help/insights.

Are you parsing the form HTML? Even parsing HTML can consume too much CPU time.

And which CF Plan are you on?

Nope, no parsing. Just fetching and returning.

I’m on the Pro plan.

That’s very weird, can you try removing parts until you no longer see the issue?

Edit: Sorry, I didn’t want to sound condescending, It’s just that as workers are now - trial and error is basically the only method.

Edit2: Considering that you use quite a lot of data, it might be memory that’s exceeded and not CPU time? I believe they are reported as the same.

Hi @dev40,

Sorry, but unfortunately I just don’t have the bandwidth to help debug large code projects.

If there’s a specific problem that you think is a bug in workers, and you can reproduce it in a small self-contained test script, I’m happy to look at that.

Hi @KentonVarda - I totally understand that you can’t review code.

What about sharing any data on your end that could assist? Surely there is a way to see how long my worker’s CPU is taking on your end. There has to be a way to share that information, as we are flying blind with regard to that constraint, with no way to see or test for ourselves.


If it’s only happening rarely then can’t you use a retry mechanism? Host the heavy call at a separate subdomain/route so that it runs in a separate function invocation and call it until it works?
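The retry idea above could be sketched roughly like this (function name, attempt count, and backoff values are all assumptions): call the sub-request until it succeeds, with a small exponential backoff between attempts.

```javascript
// Retry a fetch a few times before giving up.
// Assumed defaults: 3 attempts, 100 ms base backoff.
async function fetchWithRetry(url, init = {}, attempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const res = await fetch(url, init);
      if (res.ok) return res; // success: hand the response back
      lastError = new Error(`upstream returned status ${res.status}`);
    } catch (err) {
      lastError = err; // network-level failure
    }
    if (attempt < attempts) {
      // simple exponential backoff: 100 ms, 200 ms, 400 ms, ...
      await new Promise((resolve) =>
        setTimeout(resolve, 100 * 2 ** (attempt - 1)));
    }
  }
  throw lastError;
}
```

Note that in a Worker, time spent awaiting the sub-requests doesn’t burn CPU time, but each retry is still a billed sub-request.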

Hmm, interesting idea. But that would essentially double our usage, since it would require two calls for each query. I’ll have to think about the consequences of that, but I appreciate the suggestion.