Searching or querying using KV store as backing storage

Hi everyone - this might be a bit forward-looking, but it may be necessary for a use case I want to solve: “querying” to find out whether a given criterion is met or not, and then deciding whether to execute a method.

Is there any way to use the KV store as backing storage to run actual queries (not SQL, per se) and/or searches using a DSL of some sort (something like ElasticSearch)?

Maybe @KentonVarda knows this off the top of his head.

Most higher-level databases are built on top of some sort of KV store, but there is usually quite a lot of higher-level logic on top.

Additionally, at present, KV is limited by the fact that it provides only eventual consistency with last-write-wins semantics. This means there’s no way to perform transactions on it; concurrent modifications by multiple clients may clobber each other and there’s no way to prevent that. We’re working on better primitives for storage and coordination that will solve this in the future.
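To make the clobbering concrete, here is a minimal sketch of a lost update under last-write-wins. It does not use the real KV API: `LastWriteWinsStore` is a hypothetical in-memory stand-in with synchronous `get`/`put` (real KV calls are asynchronous and replicated), and the key `tags` and the two "clients" are made up for illustration.

```typescript
// Hypothetical in-memory stand-in for a last-write-wins KV store.
// This is NOT the real KV API; it only mimics "the latest put wins".
class LastWriteWinsStore {
  private data: Record<string, string> = {};

  get(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key)
      ? this.data[key]
      : null;
  }

  put(key: string, value: string): void {
    this.data[key] = value; // no version check, no compare-and-swap
  }
}

// Read a JSON-encoded string list, treating a missing key as empty.
function readList(store: LastWriteWinsStore, key: string): string[] {
  const raw = store.get(key);
  return raw === null ? [] : JSON.parse(raw);
}

// Two clients each do read-modify-write on the same key.
function demoLostUpdate(): string[] {
  const store = new LastWriteWinsStore();
  store.put("tags", JSON.stringify(["a"]));

  // Both clients read the same snapshot before either one writes.
  const seenByA = readList(store, "tags");
  const seenByB = readList(store, "tags");

  // Client A appends "b" and writes back.
  store.put("tags", JSON.stringify(seenByA.concat("b")));
  // Client B appends "c" to its stale copy and writes back, replacing A's write.
  store.put("tags", JSON.stringify(seenByB.concat("c")));

  return readList(store, "tags");
}
```

Calling `demoLostUpdate()` returns `["a", "c"]`: client A's `"b"` is silently gone, and nothing in a plain get/put interface lets either client detect that.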

However, if your data is read-only, or is only written by one central source (but read globally), then it would certainly be possible to build any kind of indexing on top of KV. But depending on your needs it could be a big project.
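As a rough sketch of what "indexing on top of KV" could look like in the single-writer case: one central job writes an inverted index into the store, and readers answer exact-match queries by intersecting the ID lists. Everything here is an assumption for illustration, not an existing API: `MemoryKV` is a synchronous in-memory stand-in for the store, and the `doc:` / `idx:` key layout and the `buildIndex` and `query` helpers are hypothetical names.

```typescript
// Hypothetical document shape: an ID plus flat string fields.
type Doc = { id: string; fields: Record<string, string> };

// In-memory stand-in for the KV store (real KV calls are asynchronous).
class MemoryKV {
  private data: Record<string, string> = {};
  get(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key)
      ? this.data[key]
      : null;
  }
  put(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Assumed key layout: one index entry per (field, value) pair.
function indexKey(field: string, value: string): string {
  return "idx:" + field + ":" + value;
}

// Run by the single central writer: store each document and,
// for every (field, value) pair, the list of matching document IDs.
function buildIndex(kv: MemoryKV, docs: Doc[]): void {
  const postings: Record<string, string[]> = {};
  for (const doc of docs) {
    kv.put("doc:" + doc.id, JSON.stringify(doc));
    for (const field of Object.keys(doc.fields)) {
      const key = indexKey(field, doc.fields[field]);
      if (postings[key] === undefined) postings[key] = [];
      postings[key].push(doc.id);
    }
  }
  for (const key of Object.keys(postings)) {
    kv.put(key, JSON.stringify(postings[key]));
  }
}

// Run by readers: return IDs of documents matching ALL criteria.
function query(kv: MemoryKV, criteria: Record<string, string>): string[] {
  let result: string[] | null = null;
  for (const field of Object.keys(criteria)) {
    const raw = kv.get(indexKey(field, criteria[field]));
    const ids: string[] = raw === null ? [] : JSON.parse(raw);
    result = result === null ? ids : result.filter((id) => ids.indexOf(id) !== -1);
  }
  return result === null ? [] : result;
}
```

For example, after indexing documents `1` (red, S), `2` (blue, L) and `3` (red, L), `query(kv, { color: "red", size: "L" })` returns `["3"]`. This only covers exact-match lookups; range queries, ranking, or full-text search would need considerably more machinery, which is where the "big project" part comes in.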


Thanks @KentonVarda @sdayman

Instead of reinventing the wheel here, do you think a data storage layer/adapter could be written for something open source and mature like ElasticSearch, so that your KV store serves as the storage layer while ES provides all the other functionality on top?

The bigger goal would be such a data layer for something like Postgres, but that’s a different proposition. I want the data access layer to be distributed and global, while keeping all the functionality that already exists in database and search platforms, which are decades old and well tested.

The big issue is, of course, eventual consistency, which makes race conditions difficult to resolve unless all writes were queued and Cloudflare could confirm global availability of each write.

Google’s Cloud Spanner would be a great goal to aspire to, since primitive KV is highly limited.

Note - it’s great to have global low-latency access, but most real apps need much more than that: databases and search in particular.
