Highest throughput observed on Workers?

This is more of a question thread, but I was wondering: what is the highest throughput the Cloudflare Workers team has observed? I know Workers can handle hundreds of thousands of requests per second… but I'm just curious what the highest number was :slight_smile:

Could Workers scale to, say, 10,000,000 rps? How would scaling work in that case… or would the backend infrastructure crumble under that load?

I bet only Support could answer that. In my case, my wallet would crumble first at $5/second for Workers requests.
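For the curious, the $5/second figure is straightforward arithmetic: at the metered rate of roughly $0.50 per million Workers requests (an assumption based on the pricing at the time of this thread), the 10,000,000 rps from the question costs $5 every second. A quick sanity check:

```python
# Back-of-the-envelope cost check. The per-million price is an
# assumption based on Workers pricing at the time of this thread.
PRICE_PER_MILLION = 0.50  # USD per 1M Workers requests

def cost_per_second(rps: float) -> float:
    """Dollars spent per second at a given request rate."""
    return rps / 1_000_000 * PRICE_PER_MILLION

print(cost_per_second(10_000_000))  # 10M rps -> 5.0 dollars/second
```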

This quote from the billing page also gives me pause:

First 10M requests are included without surge limits

Yeah, I'm not focused on the pricing aspect; I'm just curious what they have observed and what they are able to handle :wink:

Workers is just the consumer version of Cloudflare's edge software, so throughput is whatever their edge nodes can handle for all web traffic.

I’ve done distributed load tests at 1,666 req/s without any slowdown in response time.

And yeah, it’s a good idea to enable rate limiting ($5 per 1 million good requests) in any case; you don’t know if malicious actors are going to DDoS the endpoint for whatever reason.
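For anyone wanting to reproduce a test like this, here's a minimal single-machine sketch (the URL, request count, and concurrency are placeholders, and a real distributed test would run this from many hosts). It fires concurrent GETs and reports an approximate p95 latency, which is the number to watch for slowdown:

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def timed_get(url: str) -> float:
    """Fetch url once and return the elapsed wall time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return time.perf_counter() - start

def p95(latencies: list[float]) -> float:
    """Approximate 95th-percentile latency (nearest-rank style)."""
    ranked = sorted(latencies)
    index = max(0, (len(ranked) * 95) // 100 - 1)
    return ranked[index]

if __name__ == "__main__":
    # Placeholder endpoint and volume -- substitute your own Worker URL.
    url, total, concurrency = "https://example.com/", 1000, 50
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_get, [url] * total))
    print(f"p95 latency: {p95(results):.3f}s")
```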

Yeah, I definitely have rate limiting enabled, @thomas4!

@thomas4 what about a non-distributed client, i.e. a single IP address? Considering Limits · Cloudflare Workers docs:

Cloudflare’s abuse protection methods do not affect well-intentioned traffic. However, if you send many thousands of requests per second from a small number of client IP addresses, you can inadvertently trigger Cloudflare’s abuse protection. If you expect to receive 1015 errors in response to traffic or expect your application to incur these errors, contact Cloudflare to increase your limit.

I tested from a single-IP client against my CF Worker’s caching and, without any rate-limit rules set by me, CF rate limited the single-IP client.
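When abuse protection does trigger, error 1015 is typically surfaced to the client as an HTTP 429 response, so a well-behaved client can back off and retry. A minimal sketch (the helper names and backoff parameters are illustrative, not from Cloudflare's docs):

```python
import random
import time
from typing import Callable

RATE_LIMITED = 429  # HTTP status typically carrying Cloudflare error 1015

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base*2^attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def fetch_with_retry(do_request: Callable[[], int], max_attempts: int = 5) -> int:
    """Call do_request (which returns an HTTP status) until it is not rate limited."""
    for attempt in range(max_attempts):
        status = do_request()
        if status != RATE_LIMITED:
            return status
        time.sleep(backoff_delay(attempt))
    return RATE_LIMITED
```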

This has been mentioned a few times lately; see:

https://community.cloudflare.com/t/massive-rate-limiting-issues-with-worker-in-production

for a very long discussion about these issues.

Worth mentioning is that a DDoS from a very large number of unique IPs, such as IoT devices (fridges, cameras, alarms, etc.), would not be rate-limited - so an attack like that can become expensive, fast.

Thanks for linking to that thread - lots of info discussed there.

@harris it would be nice to have more detailed Worker analytics that report request rate for Pro and Biz plan users too. I think Enterprise already has such analytics?

That thread @thomas4 seems to be about an issue they were having with subrequests? I’m literally just asking… if I have, say, 500,000 thousand (well-intentioned) users all navigate to my site at the same time… will some of the requests be throttled/blocked? Since I’m on the Pro plan… what types of limits am I going to face?

This question seems to be very hush-hush around the Workers community. I’ve even asked staff on the Workers team, and they either don’t respond or I can’t get a straight answer. Sure, the infrastructure is certainly complex and can’t be explained in one line… but there must be something they can share about what happens when you send x requests :stuck_out_tongue: I want to ensure that if my platform gets big, really fast, I’m not going to be running into all these request limits.

500 billion users, now wouldn’t that be quite the feat! :astonished: But those would be on different IPs, not the same IP, so less likely to be rate limited?

@eva2000 I made a typo. I meant to say 500,000 users (not million). Sorry! That is definitely still a drastic number, so something like 20,000-50,000 would be more reasonable.

And something else to add: if I get 500,000 users on my platform, I’m well aware I’m paying for all of the millions of requests that are made (obviously requests would be cached too). So if I’m paying for what is used, I don’t understand the surge limits. It sounds like the surge limits are in place from a hardware point of view? Is it a technical issue that limits how many requests one node can take? If so, why (as a paying customer) am I experiencing surge limits because of the bandwidth/traffic allocated to one node/region (if that is the case)?

I’m assuming that if there’s an influx of traffic on one node/region, users would be sent to a different Workers node/region, and not just have their requests cancelled?

As you can see from the linked thread, it’s probably a good idea to ask CF tech support via a ticket, as non-CF staff here like myself can mostly only infer their policies from our interpretation of the publicly available documentation.

Thanks @eva2000. I guess it all comes down to how requests under load are shifted to other regions (which I’m guessing happens… maybe?). Even if I didn’t have caching in place and had 25,000 users come to the site at once, if the load on a single node gets too high, do requests get routed to a different node/region?

Would love to know when the “surge limits” come into play. If they’re in place to stop ill-intentioned traffic, I have rate-limiting rules for that; I can protect myself as a customer just fine!

I’ve cc’d @KentonVarda on Twitter to this thread. So hopefully he can enlighten us a bit!

Each CF datacenter can handle a lot of traffic, and CF serves visitors from a distributed set of servers across all of its datacenters. There’s more info on how CF works at https://www.cloudflare.com/en-au/learning/

I’ve had some users of my Centmin Mod LEMP stack report that, behind Cloudflare on the CF Pro plan, they are pushing >1 million visitors per day without issue (though without CF they were still pushing 1+ million visitors per day too). But this is without CF Workers.

:wave: @dmitry,

Cloudflare’s network processes over 14M web requests a second on average, so I imagine the peaks are significantly higher than that. A reasonable percentage of those requests likely use Workers today, and one can reasonably assume there is capacity beyond the current peak utilization.

The question is probably a bit of ‘how long is a piece of string’, though. A customer planning 10M rps would likely be having discussions with Cloudflare regarding not just RPS but also throughput, traffic patterns, and a host of other topics.

— OG

True, and with Cloudflare’s move to more performant Gen X servers with AMD EPYC 7642 CPUs, I am sure their capacity will be increasing too :smiley:
