Is there a difference in performance between calling Workers KV JS API functions and KV REST API fetch calls?

Since there are no native functions for bulk/batch operations (write and delete) in the Worker environment, we’d need to implement them ourselves with a fetch call to the REST API, like it’s done in this blog post:

// Bulk-write key-value pairs via the Workers KV REST API.
// The :cf_* placeholders must be replaced with your own account values.
async function bulkWrite(keyValuePairs) {
  return fetch(
    "https://api.cloudflare.com/client/v4/accounts" +
    "/:cf_account_id/storage/kv/namespaces/:cf_namespace_id/bulk",
    {
      method: "PUT",
      headers: {
        "Content-Type": "application/json",
        "X-Auth-Key": ":cf_auth_key",
        "X-Auth-Email": ":cf_email"
      },
      body: JSON.stringify(keyValuePairs)
    }
  );
}
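
For reference, the bulk endpoint takes a JSON array of { key, value } objects, so a call would look something like this (the values here are illustrative):

await bulkWrite([
  { key: "user:1", value: "alice" },
  { key: "user:2", value: "bob" }
]);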
  • Is it not recommended to hard-code auth credentials in a worker’s code?
  • Do the currently supplied Workers KV JS API functions work and perform exactly like the above function behind the scenes?
  • In the future, will there be bulk/batch operation functions in the Workers KV JS API?

That isn’t an issue since the code is not user readable.

The difference, as far as I know, is latency. The REST API hits the core cluster, not the edge like the Worker functions do. Functionality is the same; latency may not be.

I presume so, but @sklabnik or @signalnerve could confirm or deny. I don’t know that exactly myself.

Is it not recommended to hard-code auth credentials in a worker’s code?

That isn’t an issue since the code is not user readable.

I strongly disagree here. While you are correct that the worker’s code isn’t readable, there’s more to it than that. The V8 engine provides isolation and some security guarantees, but it doesn’t protect you from generic protocol abuse. For example, say I route https://api.example.com/v1/fetch_val/* to my script. If my script takes the last component of the URL, treats it as a variable name, and returns the value held in that variable, you’ve just given up the keys to controlling all aspects of your CF account. :frowning:

The example above isn’t meant to be realistic; rather, it’s just a naive illustration of how your API keys could be compromised without exploiting any code vulnerabilities. The more complicated the protocol, the more possible this becomes.
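
To make that concrete, here is a minimal sketch of the naive pattern described above; the route, the variable name, and the handler are all hypothetical:

// DO NOT deploy this: it illustrates the leak, not a real handler.
// A hard-coded credential, declared with var so it becomes a
// property of the Worker's global object.
var CF_AUTH_KEY = "secret-api-key";

addEventListener("fetch", event => {
  const url = new URL(event.request.url);
  // Naively treat the last path component as a variable name...
  const name = url.pathname.split("/").pop();
  // ...so requesting /v1/fetch_val/CF_AUTH_KEY returns the credential.
  event.respondWith(new Response(String(self[name])));
});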

Ultimately, it’s a risk assessment by the developers / management / customer, etc. KVSTORE.list() went a long way towards removing the most frequent need for such workarounds, but as you rightly mention, bulk reads and writes are still REST API-only, which is unfortunate.

The potential failure modes are too severe for my taste, so I always decide against it. CF has been very responsive about rolling out new KV features (like .list), so hopefully we won’t have to wait too long to get bulk read/write.

Thanks,

Jason.

Well, for one, you should use a specific API key that can only do what you want it to do (there is a current public beta enabling exactly that, as far as I know), and second, you should always validate what you allow as input to the code…
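
As a rough sketch of those two points (the MY_KV binding and the key whitelist are assumptions, not an official pattern):

// MY_KV is an assumed KV namespace binding; any API token the worker
// needs should arrive as a secret binding scoped to just this
// namespace, never hard-coded in the script body.
addEventListener("fetch", event => {
  event.respondWith(handle(event.request));
});

async function handle(request) {
  const key = new URL(request.url).pathname.split("/").pop();
  // Validate incoming input before it touches anything sensitive.
  if (!/^[A-Za-z0-9_-]{1,64}$/.test(key)) {
    return new Response("invalid key", { status: 400 });
  }
  const value = await MY_KV.get(key);
  return new Response(value === null ? "not found" : value);
}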

Also, this was a discussion about the specific issue of API keys in the code; additional security measures are a different discussion altogether.

Not currently, although you can get the same behavior without needing to include any credentials in your script or wait on a network round trip to the central API by running something like:

// Assuming keyValuePairs is an array of [key, value] pairs and
// kv is your KV namespace binding.
const writePromises = [];
keyValuePairs.forEach(([key, val]) => {
  writePromises.push(kv.put(key, val));
});
await Promise.all(writePromises);
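
The same fan-out pattern works for bulk deletes (a sketch, with kv again standing in for your namespace binding):

await Promise.all(keysToDelete.map(key => kv.delete(key)));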

In general, we want to keep the REST API and the JS API at parity, but the JS API does lag behind. (See here for more on this.)

We don’t have immediate plans to roll out bulk operations in the JS API, but if people want them and have good use cases, we’re not opposed to adding them; we just have higher-priority things to work on at the moment.

Would awaiting 10,000 promises, for 10,000 keys to delete or key-value pairs to write, not exceed the 50 ms CPU time limit? Does an awaited promise not count towards the CPU time limit at all, including const sleep = m => new Promise(r => setTimeout(r, m)); await sleep(10000) // 10s ?

That reply was very helpful; things are making more sense now.

I need LevelDB-like APIs, such as list and bulk operations, at the edge in the JS API to implement PouchDB on top of a LevelUP backend for Workers KV at the edge. It could be called “workers-kv-down”; LevelDB has a very similar batch/bulk API.
CouchDB and PouchDB encourage a db-per-user model, and that’s what I’m planning to have as the sole backend for an app I’m developing. A <60 ms PouchDB sync latency is what I’m dreaming about at night.
Give us HTTP API and edge/JS API parity and
takeMyMoney


I’m not sure whether you’d run into the CPU limit or not. The CPU time involved in sending the requests and processing the responses does count, so 10k might be pushing it. I’d also be unsure, without trying it, whether you could serialize 10k reasonably sized values into JSON for the request to the bulk API.
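
(Tangentially, a minimal sketch of chunking writes to stay under a per-request cap, reusing the bulkWrite helper from earlier in the thread; the batch size and helper name are illustrative, and the 50-subrequest limit mentioned below still applies if this runs inside a Worker:)

const BULK_LIMIT = 10000; // assumed max pairs per bulk REST request

async function bulkWriteChunked(keyValuePairs) {
  // Send the pairs in sequential batches so no single request
  // exceeds the bulk endpoint's per-request limit.
  for (let i = 0; i < keyValuePairs.length; i += BULK_LIMIT) {
    await bulkWrite(keyValuePairs.slice(i, i + BULK_LIMIT));
  }
}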

I asked specifically about 10k KV pair deletes/writes because that’s the limit with the HTTP API.

I definitely could have written it more clearly, but I asked a second question in there: are there limiting factors on how long a worker can sleep (other than the CPU time limit)? I’m asking because I’m planning on using sleeping to implement the syncing mechanism with server-sent events.

There is a limit of 50 subrequests per execution; the API requests are included in this.

Not that I know of, but the limits above state a total of 30 s idle. I believe that limit was dropped, but the docs weren’t updated. Any info, @cloonan, @signalnerve, or @KentonVarda (who, I believe, told me this somewhere…)?


This is now GA; see here:

https://blog.cloudflare.com/api-tokens-general-availability/