Listing items returns inconsistent state

I want to render my KV content in a table. Currently I have

const {keys} = await STORE.list();
const list = await Promise.all(keys.map(({name}) => STORE.get(name, 'json')));

The code works: when I return the list, I can render it on the client side. But there are two problems:

  1. If I create a new item and refresh the page, the list is fetched again but the old data is returned, i.e. I must wait some time before the created item is included in the list.

  2. When I delete a key, the item is removed only after some time, similarly to creation. But sometimes I also receive null values in the list, because the keys have not been removed yet while the values have. I don’t want to iterate over all the results and filter out nulls… that’s ugly.
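
If filtering does turn out to be necessary, it can at least live in one small helper. A minimal sketch, assuming `store` is any object with the same `list()` and `get()` methods as the `STORE` binding above; the name `listItems` is hypothetical:

```javascript
// Hypothetical helper: returns [{ name, value }] only for keys whose value still exists.
async function listItems(store) {
  const { keys } = await store.list();
  const entries = await Promise.all(
    keys.map(async ({ name }) => ({ name, value: await store.get(name, "json") }))
  );
  // A key can still be listed after its value is gone; get() then resolves to null.
  return entries.filter(({ value }) => value !== null);
}
```

Keeping the key name next to each value also means the table can show which key a row belongs to.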

I would like to achieve the same performance for listing my KV storage as the Cloudflare KV Dashboard has. There, the delete operation just works: after a page refresh, I don’t have the old items in the table… similarly for creation.

I understand that my issue might be due to slower write/delete compared to get operation, but how is it possible that the KV dashboard is so performant and my solution is not? Am I missing something?

Writing an item and then reading it immediately is possible, but you are using the list command to get the changed item. Unfortunately the list feature uses the API and has a delay, just as you say. So what I do is write, then immediately read the item back, and return that so it can be included in the list.

Yes, I can return the item after create and update my table in the UI, but when I refresh the app I must get the list again, because I don’t know my keys. I don’t want to cache the keys on the client. In this case I receive an outdated list again…

@thomas4 what do you mean by the list feature using the API? Like the REST endpoints? Is that different from how put/get/delete are implemented in the worker?

I see no other way around that, you’ll need to cache the changes.

What I’d do is cache only the changed items and merge them with the actual list. Once you see the same IDs in the list that you have in the cache, you can remove them from the cache.
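
A minimal sketch of that merge, assuming the pending changes are held in a Map from key name to "put" or "delete". The names `mergePending`, `listedNames` and `pending` are hypothetical, and where the cache actually lives is left open:

```javascript
// listedNames: key names returned by list(); pending: Map of name -> "put" | "delete".
// Returns the names to show, and drops pending entries the list has caught up with.
function mergePending(listedNames, pending) {
  const listed = new Set(listedNames);
  const merged = new Set(listedNames);
  for (const [name, op] of pending) {
    if (op === "put") {
      if (listed.has(name)) pending.delete(name); // list caught up, forget the change
      else merged.add(name);                      // not listed yet, add it locally
    } else if (op === "delete") {
      if (listed.has(name)) merged.delete(name);  // still listed, hide it locally
      else pending.delete(name);                  // list caught up, forget the change
    }
  }
  return [...merged];
}
```

For example, with `["a", "old"]` coming back from list() and pending changes `new -> "put"` and `old -> "delete"`, the merged result is `["a", "new"]` and both entries stay pending until the list reflects them.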

The worker implementation is likely just an API wrapper: it exposes the API as a function.

Only Cloudflare can answer this properly, I’m sure they are aware of the Write -> List delays.

Re API: I too imagine GLOBAL.put() is just a convenience method that saves you from doing the fetch requests yourself…

Re delays: As I mentioned, the KV dashboard works flawlessly from a UX perspective, so I would like to know how it’s implemented. Maybe the data in the KV store is accessed in a different way, not through a worker. I would like comments on this from somebody on the Workers/KV team. I asked a business contact about this, who offered email help after I upgraded to unlimited workers. He said he would pass it on to the KV team, but I have had no reply yet and thought somebody here might know how to solve this.

The KV dashboard reads directly from the central KV database, that’s why you see changes there immediately.

So KV syncs like this:

Worker write -> Central KV -> Worker.

So you’re reading it before it syncs back to the Worker where you wrote the value, i.e. it is not yet available in the listing.

So with .get() it’s immediately available, but not on .list(), at least not yet.

Well, then I can use a single key, e.g. ‘list’, where I’ll keep an array of all my keys (within the limit of the max. stored value size). When I set a new key or delete one, I’ll update the ‘list’ key too.

That way, the list will work like a regular item, without the delay penalty. I don’t see why the list operation is different from get…
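
A minimal sketch of that index idea, assuming `store` exposes `get`, `put` and `delete` like the `STORE` binding. The helper names are hypothetical, and the read-modify-write on the index is not atomic, so concurrent writers could overwrite each other's index updates:

```javascript
const INDEX_KEY = "list"; // single key holding an array of all other key names

async function readIndex(store) {
  return (await store.get(INDEX_KEY, "json")) || []; // no index yet means no keys
}

async function putWithIndex(store, name, value) {
  await store.put(name, JSON.stringify(value));
  const index = await readIndex(store);
  if (!index.includes(name)) {
    index.push(name);
    await store.put(INDEX_KEY, JSON.stringify(index));
  }
}

async function deleteWithIndex(store, name) {
  await store.delete(name);
  const index = await readIndex(store);
  await store.put(INDEX_KEY, JSON.stringify(index.filter((n) => n !== name)));
}
```

Listing then becomes a single `readIndex(store)` call followed by one `get` per name.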

Is there any low-level material on how the KV store is implemented? I saw one talk on Workers about how they use isolates etc., but I don’t have anything on the KV store.

I had the same idea. However, you are only allowed to write to a single key once per second.

Which doesn’t really work well for an index :frowning:

Indeed, but I was testing that and I was able to write at a higher frequency.

Take for example this increment function:

async function increment(request) {
  // The first time, get() returns null and Number(null) is 0; afterwards it
  // returns a string and Number(string) is a number, so value is always a number.
  const value = Number(await STORE.get("x"));
  const next = value + 1;
  await STORE.put("x", `${next}`); // store incremented value as text/string
  // ... send response here with `next`
}

Now if you call the endpoint multiple times (curl, JS, Postman, whatever), all the calls are recorded and the value is properly updated:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
<script async>

    (async () => {
        const a = await (await fetch("https://storage.domain.workers.dev/api/increment")).json();
        const b = await (await fetch("https://storage.domain.workers.dev/api/increment")).json();
        const c = await (await fetch("https://storage.domain.workers.dev/api/increment")).json();
        const d = await (await fetch("https://storage.domain.workers.dev/api/increment")).json();
        const e = await (await fetch("https://storage.domain.workers.dev/api/increment")).json();

        console.warn(a, b, c, d, e);
        // logs 1, 2, 3, 4, 5 the first time the page is loaded
    })();
</script>
</body>
</html>

If I’m not mistaken, it caches the write; values that are actually written simultaneously should produce errors in the console.

Also, you’re awaiting each request in turn in the test; you need to use Promise.all to send them in parallel.
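
A minimal sketch of the parallel variant. `runParallel` is a hypothetical helper, and `callIncrement` stands in for something like `() => fetch(url).then((r) => r.json())` so the pattern can be tried without a deployed worker:

```javascript
// Starts `times` calls without awaiting in between, then waits for all of them.
async function runParallel(callIncrement, times) {
  const calls = Array.from({ length: times }, () => callIncrement());
  return Promise.all(calls); // every request is already in flight at this point
}
```

In the page above this could be used as `runParallel(() => fetch("https://storage.domain.workers.dev/api/increment").then((r) => r.json()), 5).then((values) => console.warn(...values))`.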

Good point, I should try parallel calls.

But I was testing sequential calls just to get a feel for how it works. The delay between requests is still less than 1 second, so what do you mean by it caching the write? Where, and who caches it? The worker?

From the worker we can have 6 connections open simultaneously, and I guess the writes to KV won’t be atomic, so I’ll get errors as you suggest, or some of the updates will just be ‘lost’ because they work with outdated data, i.e. races.
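
The lost-update case can be reproduced locally without KV. A minimal simulation, assuming an in-memory stand-in whose `get` and `put` each take a few milliseconds; `makeSlowStore` and `demo` are hypothetical names, and this does not model any KV rate limit or error behaviour:

```javascript
// In-memory stand-in for a KV namespace; each call takes `delayMs` to complete.
function makeSlowStore(delayMs) {
  const data = new Map();
  const wait = () => new Promise((resolve) => setTimeout(resolve, delayMs));
  return {
    async get(key) { await wait(); return data.has(key) ? data.get(key) : null; },
    async put(key, value) { await wait(); data.set(key, value); },
  };
}

// Same read-modify-write as the increment handler, against the stand-in store.
async function increment(store) {
  const value = Number(await store.get("x"));
  const next = value + 1;
  await store.put("x", `${next}`);
  return next;
}

// Five increments started together: each one reads before any write has landed.
async function demo() {
  const store = makeSlowStore(5);
  const results = await Promise.all([1, 2, 3, 4, 5].map(() => increment(store)));
  return { results, final: await store.get("x") };
}
```

In this simulation all five calls read the initial null, so each returns 1 and the stored value ends up as "1" rather than "5".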

Still, KV is not a silver bullet, and for my use case I should probably use different storage. Listing is very important to me, and I would even welcome reducing over the values… or simple aggregation like counting items with the same key prefix… but that would require some computation in KV, so I probably won’t get that easily.

I’ve looked at different distributed KV stores and all have the same race issues; it seems to me that caching the write locally is the safest way to deal with it. Of course, multiple users writing at the same time will always leave the list out of sync…

Hey there, sorry about this. I’m the PM for KV. Feel free to cc me on threads like this in the future, and I’ll try to respond as soon as I can.

There’s a big difference between the two: the list operation needs to know every possible key, whereas get only needs to know a specific key. Because different people can be writing to different keys simultaneously, we can’t be as aggressive with list operations as we can with gets. In some sense, a write to a single key can only modify local state, that is, a single value, and so we can return that new value immediately on a get. But a list operation needs to take into account global state, the whole namespace, and so lists have to go back to the central store (as @thomas4 noted).

If you write from a worker, it will cache the value of that write, on that specific colo. So you’ll see these consistent results because you’re in the same colo at the same time. If you were to run this script from multiple places around the world simultaneously, you may see very different values.
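
The difference between get and list described here can be pictured with a toy model. This is only an illustration of the behaviour discussed in this thread, not how Workers KV is implemented: the classes `CentralStore` and `Colo`, the explicit `flush()` step standing in for propagation delay, and the absence of cache expiry are all assumptions.

```javascript
// Toy model only: names and mechanics are illustrative assumptions.
class CentralStore {
  constructor() { this.data = new Map(); }
}

class Colo {
  constructor(central) {
    this.central = central;
    this.cache = new Map(); // values written or read at this location
    this.outbox = [];       // writes not yet visible in the central store
  }
  put(key, value) {
    this.cache.set(key, value);     // visible to local gets right away
    this.outbox.push([key, value]); // reaches the central store on flush()
  }
  get(key) {
    if (this.cache.has(key)) return this.cache.get(key);
    const value = this.central.data.has(key) ? this.central.data.get(key) : null;
    if (value !== null) this.cache.set(key, value); // pulled on a miss, never pushed
    return value;
  }
  list() {
    return [...this.central.data.keys()]; // always answered from the central store
  }
  flush() {
    for (const [key, value] of this.outbox) this.central.data.set(key, value);
    this.outbox = [];
  }
}
```

With two `Colo` instances sharing one `CentralStore`, a `put` on the first is returned by its own `get` immediately, while its `list()` and the second colo's `get` only see the key after `flush()`.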

@sklabnik thanks.

Can I get details on the central store? I know that Workers KV is eventually consistent, but I don’t know how the workers in different locations sync with the central store. I guess that updates are not pushed from the central store to each of the workers. Maybe the worker pulls from the central store every time the key is not found locally?

Maybe the details are not so important, in the end it seems like writing to the same key is not happening so often in my use case, but still I’m curious.

Still, the list operation is troubling me. I’m still tempted to use a single key for storing the list. I mean not a list of every item, but a limited set in some context. For example, with Workers Sites, if you have an index.html pointing to dozens of resources which are in KV, then in a sense the index.html is a list operation. I can mimic this for my use case.

Correct, there’s no push, only pull. Push is something some people have asked for, and we may add it in the future as a premium feature, but I’m not sure yet.