Why can the same Worker script perform differently in different Cloudflare accounts?

(Latest comment from the thread creator (last updated: 10 Feb 2020): This thread was closed by a moderator without an actual solution, and the “solution” picked by the moderator does not really solve the problem I raised.

Since Cloudflare does not have an official solution for this issue, I abandoned my original account and registered a new one to get a new Worker subdomain, and that worked around the issue. I have to stress that this is only a workaround, and it was only possible because I had not yet launched any service in production. The Cloudflare team should consider how to solve this properly, so that performance does not differ between the worker subdomains of different accounts.)

Hi there,
I have been discussing the performance difference between our worker accounts with a friend; both of us are on Cloudflare’s unlimited plan.

We found that, given the same worker script, the performance can be very different between 2 different accounts.

We spent a lot of time looking for config differences between our accounts, but could not find any.

In the end we used a very simple script to check whether our workers would perform differently, and it does show a difference.

Our simple testing Worker script:
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})
const handleRequest = async req => {
  return await fetch('https://www.google.com');
}

We then found that one of the worker domains always loads slower than the other:
Slower one: https://cfstatic.xxx.workers.dev/
(always takes 200ms or longer to load one resource)

Faster one: https://cfstatic.yyy.workers.dev/
(can always load one resource within 100ms)
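
A rough way to make this kind of comparison repeatable is to time several requests per URL and compare the medians instead of eyeballing single loads. The sketch below is only illustrative: the `measureLatency` name, the sample count, and the injectable `fetchFn` parameter are assumptions, not part of the original test.

```javascript
// Minimal sketch: time several sequential requests to one URL.
// `fetchFn` is injectable (an assumption for testability); it defaults
// to the global fetch available in browsers, Workers and recent Node.
async function measureLatency(url, samples = 5, fetchFn = fetch) {
  const timings = [];
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    const res = await fetchFn(url);
    await res.arrayBuffer(); // wait for the full body, not just the headers
    timings.push(performance.now() - start);
  }
  timings.sort((a, b) => a - b);
  // Middle element (upper middle when the sample count is even).
  return timings[Math.floor(timings.length / 2)];
}
```

Calling `measureLatency` once for each of the two worker URLs from the same machine gives two numbers in milliseconds that can be compared directly.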

Can any of you help explain why there is such a difference between Cloudflare accounts when the Worker script is the same?

And are they tested from the same location?

Yes, they are tested from the same location and the same device.
To rule out a problem with my own computer, I also asked another user to test it, and we still see the performance difference. Could you please check what the problem is?

And this difference has a big impact on us: the loading time difference can reach 4-6 seconds when the response file is larger. This makes us lose confidence in releasing our worker for real use while this uncertainty remains.

I’m assuming that you’ve already created a ticket for this?

@KentonVarda You usually have a logical explanation for these things :wink:

Thanks for the reminder, I just created a support ticket as well. But this is really affecting my work now; I don’t dare tell others that my script is ready under my worker account because of this…

I understand, I’m also ready to launch my service based entirely on workers soon.

However: Workers embedded in apps - still supported?

I doubt it is the script that is taking longer; more likely it is the connection to the edge.

Can you post the output of these two URLs?

  • https://cfstatic.donaldchan.workers.dev/cdn-cgi/trace
  • https://cfstatic.appi.workers.dev/cdn-cgi/trace

Sure thing,

For the slower one (https://cfstatic.xxx.workers.dev/cdn-cgi/trace):
fl=20f155
h=cfstatic.donaldchan.workers.dev
ip=112.XXX.XXX.71
ts=1581241694.98
visit_scheme=https
uag=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36
colo=AMS
http=http/2
loc=HK
tls=TLSv1.3
sni=plaintext
warp=off

For the faster one (https://cfstatic.yyy.workers.dev/cdn-cgi/trace):
fl=35f343
h=cfstatic.appi.workers.dev
ip=112.XXX.XXX.71
ts=1581241692.851
visit_scheme=https
uag=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36
colo=SIN
http=http/2
loc=HK
tls=TLSv1.3
sni=plaintext
warp=off

In one case you are routed through the Netherlands, in the other through Singapore. That explains the difference. It is not worker related.
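
For anyone who wants to check this programmatically, the `/cdn-cgi/trace` output is plain `key=value` text, so the `colo` field can be pulled out with a few lines. This is a minimal sketch; `parseTrace`, `getColo`, and the `fetchFn` parameter are names made up for illustration.

```javascript
// Parse the key=value text returned by /cdn-cgi/trace into an object.
function parseTrace(text) {
  const result = {};
  for (const line of text.split('\n')) {
    const idx = line.indexOf('=');
    if (idx === -1) continue; // skip blank or malformed lines
    result[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return result;
}

// Fetch the trace for a hostname and return its data center (colo) code.
// `fetchFn` is injectable for testing; it defaults to the global fetch.
async function getColo(hostname, fetchFn = fetch) {
  const res = await fetchFn(`https://${hostname}/cdn-cgi/trace`);
  return parseTrace(await res.text()).colo;
}
```

Running `getColo` against both worker hostnames from the same network shows whether they land on the same data center, which is the `AMS` versus `SIN` difference visible in the two traces above.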


But we cannot control this; how can we solve the difference?

You cannot solve it. Please use the search, that topic has been discussed a gazillion times.

Does that mean there is no solution for this? Will my worker account always be routed through the Netherlands regardless of any condition?

Again -> the search :wink:

I tried to search around already:

But this does not help explain why the routing differs between worker accounts


This is one of the topics. Requests to these two servers are basically routed differently by your ISP. You’d need to contact your ISP about that if you want the same routing.
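
If it helps with debugging, the Worker itself can report which data center handled a request, since the incoming request carries a `cf.colo` field. The sketch below is one possible variant of the test script from this thread; the `X-Served-By-Colo` header name and the `fetchFn` parameter are invented for illustration.

```javascript
// Proxy the upstream response and add the serving data center as a header.
// `X-Served-By-Colo` is a made-up header name; `fetchFn` is injectable
// for testing and defaults to the global fetch.
async function handleRequest(request, fetchFn = fetch) {
  const upstream = await fetchFn('https://www.google.com');
  // Copy the response so its headers become mutable.
  const response = new Response(upstream.body, upstream);
  // request.cf is only populated when running on Cloudflare's network.
  const colo = request.cf && request.cf.colo ? request.cf.colo : 'unknown';
  response.headers.set('X-Served-By-Colo', colo);
  return response;
}

// In a Worker this would be wired up as:
// addEventListener('fetch', event => {
//   event.respondWith(handleRequest(event.request))
// })
```

This does not change the routing; it only makes the difference visible in the response headers of each account's worker.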


Thanks for the explanation, it makes sense in general terms. But again, it sounds to me like “no solution” is the final answer: we still cannot tell why an ISP would choose a different datacenter to route to, or how we can solve it. It is not really practical for me to contact every ISP around the world to ask for the best routing.

That means I can only accept that my worker account will always be slower than my friend’s when loaded in the same region (with the same ISP)…

#Tutorials has more on that

Peering - Why don’t I reach the closest datacenter to me?

We did a lot of distributed load-testing on Workers about half a year ago, up to 50 000 simultaneous users logging in.

See:

What each request is doing:

  1. Validate user in KV & check rate-limits.
  2. Validate eMail & IP blocking lists
  3. Create password based on PBKDF2.
  4. Write pass to KV & Generate session.
  5. Write to external log system.

So this is the performance you can expect when doing maximum worker CPU-time, reading 5 and writing 2 KV values.

Even with this enormous load, the average request response is ~348ms - which is extremely good imo.

I should note that the majority of the delays are due to KV writes.


Thank you both so much for all the explanations, it totally makes sense.

It is just really disappointing to me that “we cannot solve this problem, contact ISP for better routing if needed” is the only final answer… When loading multiple files, this difference can actually stack up to “seconds” instead of “milliseconds” on the front end, which is why we were cautious about it during our tests.

To me this is not a good solution from a business point of view.

Again, it makes sense in technical terms.