KV Conflict Resolution


The docs for KV say it is “eventually consistent”. I am looking for more detail:

  1. What if a key is updated to different values at the exact same millisecond in two different Cloudflare cache nodes? Is the value that wins arbitrary?
  2. Does Cloudflare guarantee no clock skew between cache nodes?
  3. If there are offline replicas of the data which come online and push to Cloudflare with different values for the same key, I will have to implement my own conflict resolution, correct?


Great questions!

  1. Writes in KV happen centrally, so it would be defined by which write hit that central system first.
  2. Nope
  3. There shouldn’t be a possibility for conflicts as it’s not multi-master.


@zack Thanks

If two people update the same key at the same time via two different cache nodes with two different values, then there is intrinsically an ephemeral conflict prior to commit. So, based on the answers to questions 1 and 2, network latency will drive which one wins. In this case, the last write to reach the central server will be the value one could expect to retrieve some time later. Correct?



@zack There is also no guarantee the data is not multi-master. In fact, we are using the Cloudflare KV store as a highly available replica of peer-to-peer datastores similar to GunDB. Ultimately, the consuming peers will resolve conflicts, although the KV store may introduce a little thrashing.



@syblackwell I’ve interpreted it as meaning that reads are “eventually consistent” while writes are not committed at the node but centrally. Though, I’m going to test this very soon to see how it actually performs.

What you’d want to do, if you write from a Worker and then want to read the value directly, is make sure that you compare the read value to the value you’ve written so you know what to expect, and if it doesn’t match, retry in a few seconds (CF claims max 30 sec global consistency, usually <10 sec).

And you’d want to do this validation in the same request where you write the value…

If you validate it on a second request, you don’t have any state to hold on to (no global variable, for example) because you might hit a completely different node on the next request (which cannot be avoided). And neither KV nor Cache is immediately consistent.

Which means that the only way to validate the written data on the next request is by caching the value on the browser/client.
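The write-then-verify pattern described above can be sketched like this. This is a minimal illustration, not Cloudflare code: `makeMockKV`, `putAndVerify`, and the delay-based mock are all hypothetical names, and the in-memory store only simulates eventual consistency with a visibility delay (a real Worker would call the KV binding’s `put`/`get` instead).

```javascript
// Hypothetical in-memory stand-in for KV: writes become visible to reads
// only after a delay, loosely mimicking eventual consistency.
function makeMockKV(visibilityDelayMs) {
  const committed = new Map(); // what reads currently see
  const pending = new Map();   // writes not yet propagated
  return {
    async put(key, value) {
      pending.set(key, value);
      setTimeout(() => {
        committed.set(key, pending.get(key));
      }, visibilityDelayMs);
    },
    async get(key) {
      return committed.get(key) ?? null;
    },
  };
}

// Write, then poll until the read matches what we wrote (or give up).
async function putAndVerify(kv, key, value, { retries = 10, waitMs = 50 } = {}) {
  await kv.put(key, value);
  for (let i = 0; i < retries; i++) {
    if ((await kv.get(key)) === value) return true;
    await new Promise((r) => setTimeout(r, waitMs));
  }
  return false;
}
```

The retry interval and count are arbitrary; in practice you would size them against the claimed worst-case propagation window.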



Which is what we are planning to do, plus add an algorithm similar to HAM: https://github.com/amark/gun/wiki/Conflict-Resolution-with-Guns.
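For anyone unfamiliar with the idea, here is a much-simplified, hypothetical merge rule in the spirit of HAM-style conflict resolution (this is not Gun’s actual algorithm): prefer the update with the newer timestamp, and break exact ties deterministically, so every peer converges on the same winner regardless of arrival order.

```javascript
// Each update carries a timestamp (ts) and a value. The deterministic
// tiebreak on the value means resolve(a, b) === resolve(b, a), which is
// what lets independent peers converge without coordination.
function resolve(a, b) {
  if (a.ts !== b.ts) return a.ts > b.ts ? a : b;
  return a.value > b.value ? a : b; // deterministic tiebreak on equal ts
}
```

Real HAM also handles updates whose timestamps lie in the future or deep past; this sketch ignores that entirely.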

I would like to know your results after you test your hypothesis.



I don’t see a reason to worry about updating data. You can make the same argument against RDS or NoSQL databases. Lower latency always wins when it comes to updating records.

The real issue is INSERT (writing data for the first time). What will happen if two users try to register the same username at the same time and the KV key is their username? There’s a chance that both of them will get a notice that their registration was successful, but in reality only one of them is registered successfully. KV storage has specific use cases, and if there’s a chance that a key might get duplicated at some point, you have to avoid using KV storage.
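The registration race can be shown with a tiny last-write-wins store standing in for KV (`makeLWWStore` and `register` are hypothetical names for illustration): both callers are told their write succeeded, but only one value survives, and there is no compare-and-swap primitive in KV to prevent it.

```javascript
// Minimal last-write-wins store playing the role of KV.
function makeLWWStore() {
  const data = new Map();
  return {
    async put(key, value) { data.set(key, value); return true; },
    async get(key) { return data.get(key) ?? null; },
  };
}

// Naive registration: no read-before-write check can make this safe,
// because another writer can land between a get() and a put().
async function register(store, username, details) {
  await store.put(`user:${username}`, JSON.stringify(details));
  return true; // both racers get a "success"
}
```

Running two concurrent `register` calls for the same username returns `true` twice, while the store keeps only one of the two detail records.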



Users are not really an issue if you verify what’s written to the KV.

Only one of the multiple users that register at the same time will have consistent details.

You will need to verify more than only the username, though: username + email at least, preferably all of the details, or some detail might slip through if you write to multiple different keys.
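One way to follow that advice, sketched below with a hypothetical `writeAndVerifyRecord` helper: store all of a user’s details as one JSON value under a single key and verify the entire record after writing, rather than spreading fields across multiple keys where one could slip through unverified.

```javascript
// Write a whole record under one key, then read it back and compare the
// full serialized form; a partial or clobbered write fails verification.
async function writeAndVerifyRecord(kv, key, record) {
  const serialized = JSON.stringify(record);
  await kv.put(key, serialized);
  const readBack = await kv.get(key);
  return readBack === serialized; // true only if our whole write survived
}
```

This still only *detects* a lost race after the fact; it does not prevent one.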



Yeah, it means that you have to read back the user details after registration and compare the additional details as well. The fact is that it’s not as easy to do as in PostgreSQL, for example.



Agreed, Gun is unique in “solving” the distributed DB problem. It is not perfect though and private data and logins are still an issue I believe.



Not trying to argue one way or another here. Just trying to thoroughly understand KV behavior.



~~I don’t think that’s the case, given writes are centralized.~~

Edit: I was definitely not correct; if you are writing to the same key, you absolutely have a race.



I’ve done a few tests now; it seems to take ~100–200 ms after a write before the value can be read when doing it async.

However, if you use await for the write, it’s always immediately available in the Worker from which you wrote the value. This is really good, because we can then reliably display the data to the user directly.

But it’s still globally distributed, so other nodes will probably not have the data yet. So it’s maybe not suitable for a real-time chat app, but we can render webpage content or similar.

EDIT: I’ll do a test with nodes far apart later to see how it behaves.
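The behavior observed above can be simulated with two hypothetical “nodes” sharing a central store with a made-up propagation delay (`makeNode` and the delay are illustrative inventions, not how Cloudflare’s internals actually work): the writing node reads its own write immediately from a local cache, while another node sees nothing until the value has propagated.

```javascript
// central is a shared Map; each node keeps a local cache so it can read
// its own writes immediately, while other nodes only see the value after
// the (simulated) propagation delay has elapsed.
function makeNode(central, propagationMs) {
  const cache = new Map();
  return {
    async put(key, value) {
      cache.set(key, value); // read-your-writes on this node
      central.set(key, { value, visibleAt: Date.now() + propagationMs });
    },
    async get(key) {
      if (cache.has(key)) return cache.get(key);
      const entry = central.get(key);
      if (!entry || Date.now() < entry.visibleAt) return null; // not propagated yet
      return entry.value;
    },
  };
}
```

With this model, the writer’s `get` returns the value right away, a second node gets `null` at first, and the same read succeeds once the delay has passed.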



@thomas4 Thanks!
