Global, decentralized leader selection with Workers KV, was: Fetch() with a client certificate?


The typical solution in distributed computing is leader election. Basically, at exp - (40 * 60), i.e. 40 minutes before the token expires, every Worker that detects the pending expiry registers a unique ID in the KV store. (This requires listing a subset of keys, which the Workers KV API can’t do yet.) E.g.:

/* NOTE: Horrible pseudo-code :) */
value = /* microsecond-level current timestamp */
hash = HASH(rayid + value)
key = "election/" + token_exp_ts + "/" + hash
KV_STORE.put(key, value, /* expire in 1 minute */)

/* wait for KV to propagate */

/* grab the list of registered candidates */
var candidates = KV_STORE.keys("election/" + token_exp_ts + "/*")

/* sort the list on their recorded timestamps  (KV_STORE values) */

/* if there's a tie for first, break the tie by alphabetical sorting of hash (KV_STORE key) */

/* If I'm the winner, go renew the token */

Each worker sorts the list on timestamps, breaking a tie for first by selecting the first alphabetical guid (hash of rayid + timestamp). The winner is the “chosen one”. That worker then proceeds to renew the JWT.
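
To make the selection rule concrete, here is a minimal sketch of just the sort-and-tie-break step. It assumes the candidates have already been read back from KV into an array of { key, timestamp } objects; the names pickWinner and isWinner are made up for illustration and are not part of any Workers API.

```javascript
// Hypothetical helper: pick the election winner from the registered candidates.
// Each candidate is { key, timestamp }, where `key` is the full KV key
// ("election/<token_exp_ts>/<hash>") and `timestamp` is the numeric value
// stored under that key.
function pickWinner(candidates) {
  if (candidates.length === 0) return null;
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...candidates].sort((a, b) => {
    // Earliest recorded timestamp wins.
    if (a.timestamp !== b.timestamp) return a.timestamp < b.timestamp ? -1 : 1;
    // Tie: fall back to plain string ordering of the key (and so of the hash).
    if (a.key === b.key) return 0;
    return a.key < b.key ? -1 : 1;
  });
  return sorted[0];
}

// True if the candidate registered under `myKey` is the one that should renew the JWT.
function isWinner(myKey, candidates) {
  const winner = pickWinner(candidates);
  return winner !== null && winner.key === myKey;
}
```

Every Worker that sees the same candidate list computes the same winner, so no further coordination is needed once KV has propagated. If two Workers see different lists (propagation not finished), both may believe they won, which is why dependency 1a below matters.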


  1. This depends on CF:
    a. guaranteeing global sync of KV within 10 seconds
    b. providing a KV API call to list keys or a subset thereof
    c. permitting Worker scripts to live for at least 12 seconds wall clock
    d. maintaining fairly accurate clock sync across its network (NTP-level accuracy)
  2. There is no security guarantee: a malicious Worker could write a fake timestamp to almost guarantee its own selection.
  3. There is a remote possibility that HASH(rayid + timestamp) might collide. This could be mitigated by including the x-real-ip in the content of the hash…

[edit]: s/25/40/g, and removed the note about Workers presuming non-candidacy after a certain window. That note presumes a large amount of traffic. This needs to work even with one connection per week (or, at least, with connections spaced further apart than the JWT validity window).


That works already, 15s as far as I remember, but was supposed to be increased.


Yes, and I can probably (ick) call the KV web API to list the keys until such time as CF adds that call to the Workers KV API.


Also from within the Worker itself, that should work…
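
A rough sketch of that workaround, for what it's worth. It assumes the Cloudflare v4 REST endpoint for listing a namespace's keys accepts prefix and cursor query parameters and returns the key names under result, with a paging cursor under result_info. It also assumes Bearer-token auth; swap in whatever credentials your account uses. The function names and the accountId / namespaceId / apiToken parameters are placeholders of mine, not anything from the thread.

```javascript
// Build the URL for the (assumed) REST endpoint that lists KV keys by prefix.
function listKeysUrl(accountId, namespaceId, prefix, cursor) {
  const url = new URL(
    "https://api.cloudflare.com/client/v4/accounts/" + accountId +
      "/storage/kv/namespaces/" + namespaceId + "/keys"
  );
  url.searchParams.set("prefix", prefix);
  if (cursor) url.searchParams.set("cursor", cursor);
  return url.toString();
}

// Collect the names of all candidate keys for one election round.
// `fetchImpl` defaults to the global fetch available inside a Worker.
async function listCandidateKeys(accountId, namespaceId, apiToken, tokenExpTs, fetchImpl = fetch) {
  const prefix = "election/" + tokenExpTs + "/";
  const names = [];
  let cursor = "";
  do {
    const res = await fetchImpl(listKeysUrl(accountId, namespaceId, prefix, cursor), {
      headers: { Authorization: "Bearer " + apiToken }, // assumed auth scheme
    });
    if (!res.ok) throw new Error("KV list request failed: " + res.status);
    const body = await res.json();
    for (const entry of body.result) names.push(entry.name);
    // An empty or missing cursor is taken to mean there are no more pages.
    cursor = (body.result_info && body.result_info.cursor) || "";
  } while (cursor);
  return names;
}
```

This only returns key names, so each candidate's timestamp still has to be fetched with a KV get per key before sorting.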