I’m a huge fan of Workers and really love the possibilities they open up. I have been tinkering with Workers KV and noticed that the first read can be pretty slow. After a new key/value is pushed, the first read takes about 250-400ms. After that, future reads of the same key are really fast (<15ms). The data I’m fetching generally won’t be read multiple times, so a fast first lookup is important. In my case it’s faster to go back to the origin and fetch the value than to use KV. Is this something I’m doing wrong, or is it intentional?
It’s intentional (by design): only frequently accessed keys are stored on the edge; infrequent and new keys are stored only in a central location.
I missed that part in the documentation as well and was wondering how it would be possible to maintain replicas in every Cloudflare location (it doesn’t make sense financially for Cloudflare).
Read this thread
I think Cloudflare should make this clearer in the documentation. From the blog posts when Workers KV was first introduced, I got the idea that all keys were replicated to every CF location within 30 seconds.
It still seems pretty slow even when fetching from the central location, with read times sometimes up to 400ms. It would be really nice if there were a proactive propagation method with a much higher rate limit, say once per 30 minutes.
To guarantee 5ms-20ms responses I use the Cache API (beta) and Workers KV together.
The first read is slower, but it also depends on how far from the core you are.
Where are you located?
I’m located in the Seattle region, and the origin is in Dallas. I get about 100-120ms of latency directly to Dallas.
Doesn’t the Cache API only speed up the second and later requests within each region? (Is the cache per geographic region, or even more granular?)
Also, the cache isn’t guaranteed to persist until its expiry date, if I remember right; entries can be removed at any time for any reason.
Yes: it speeds up after the first request, and the cache is per datacenter. Note that Cloudflare can send the same user to different datacenters for various reasons (traffic, downtime, peering…).
This is just one way to reduce KV requests and have the Workers run better.
At scale, this is the best approach for me. The quicker your Worker runs and the less memory it takes, the more you can run.
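The "Cache API + Workers KV" combination discussed above boils down to a read-through cache: check the fast, local (per-datacenter) layer first, and only fall back to the slower central KV store on a miss. Here is a minimal sketch of that pattern. In a real Worker the local layer would be `caches.default` keyed by a `Request`, and `env.MY_KV` would be a KV namespace binding; the `KVStore` interface and stub below are hypothetical stand-ins so the sketch is runnable anywhere.

```typescript
// Hypothetical stand-in mirroring the shape of a Workers KV binding's get().
interface KVStore {
  get(key: string): Promise<string | null>;
}

// Read-through cache: try the fast local cache first, fall back to the
// (slower, central) KV store, and populate the cache on a miss so later
// reads in this location are fast.
class ReadThroughCache {
  private cache = new Map<string, string>();

  constructor(private kv: KVStore) {}

  async get(key: string): Promise<{ value: string | null; fromCache: boolean }> {
    const hit = this.cache.get(key);
    if (hit !== undefined) {
      return { value: hit, fromCache: true }; // fast path: local cache
    }
    const value = await this.kv.get(key); // slow path: central store
    if (value !== null) {
      this.cache.set(key, value); // warm the cache for subsequent reads
    }
    return { value, fromCache: false };
  }
}
```

Note that this doesn't help the first read in each datacenter (that request still pays the full KV latency), which matches the per-datacenter behavior described above; it only guarantees that repeat reads in the same location stay fast.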