Using Durable Objects for Assets - Fast Deployment - Good or bad idea?

Using Durable Objects for Assets

To enable Fast Deployment and a fast development experience.

Create
Durable Objects can be used to store "files", along with their headers: strongly consistent asset storage at the edge. When the Worker is called to put or get a file in Durable Objects, it gets a file id using env.files.idFromName("a8ef-menu.js"). Once the file is created, the Worker also puts it in the edge cache, where it is subject to eviction at any time. If the file is a small PWA asset, the Worker additionally keeps it in its memory (128 MB), using an LRU algorithm.
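A minimal sketch of this write path, assuming a Durable Object namespace bound as env.files and module syntax (the route and names are illustrative, not an actual implementation):

// Write path: persist the file in its Durable Object, then warm the edge cache
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    if (request.method !== 'PUT') return new Response('Method not allowed', { status: 405 });
    // One Durable Object per file, addressed by a deterministic name
    const id = env.files.idFromName(url.pathname); // e.g. "a8ef-menu.js"
    const stub = env.files.get(id);
    // The Durable Object stores the body + headers and echoes the file back
    const response = await stub.fetch(request);
    // Warm the edge cache; it may evict the entry at any time
    ctx.waitUntil(caches.default.put(url.toString(), response.clone()));
    return response;
  }
};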

Get
Read from the edge cache, or from the Worker's memory, or from Durable Objects.
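Roughly, the read path could look like this (same assumptions as above; the in-memory layer is a plain Map standing in for a real LRU):

// Read path: Worker memory, then edge cache, then Durable Object
const memory = new Map(); // per-isolate cache: pathname -> { body, headers }

export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);

    // 1. Worker memory (only warm on this isolate)
    const hit = memory.get(url.pathname);
    if (hit) return new Response(hit.body, { headers: hit.headers });

    // 2. Edge cache (subject to eviction at any time)
    const cached = await caches.default.match(url.toString());
    if (cached) return cached;

    // 3. Durable Object: the strongly consistent source of truth
    const stub = env.files.get(env.files.idFromName(url.pathname));
    const fromDO = await stub.fetch(request);
    const body = await fromDO.arrayBuffer();
    const headers = new Headers(fromDO.headers);

    // Re-populate the edge cache, and Worker memory for small PWA assets
    ctx.waitUntil(caches.default.put(url.toString(), new Response(body, { headers })));
    if (body.byteLength < 100_000) memory.set(url.pathname, { body, headers });
    return new Response(body, { headers });
  }
};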

Delete
Feature request: files can be purged automatically or manually. Example: a user leaves the service before the GDPR retention period ends. Currently, one way for a Durable Object to emulate a delete is to storage.put() a null value, then respond 404 to future requests. But sadly the Durable Object itself still exists. Ideally, the storage attached to a Durable Object would accept a TTL and a storage.delete() method, and the namespace would accept a global delete method: env.files.delete(env.files.idFromName(name));
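For reference, a sketch of that "tombstone" workaround inside a hypothetical Durable Object class (names are illustrative):

// Emulated delete: null out the content and answer 404 afterwards,
// since the Durable Object itself cannot currently be destroyed
export class Files {
  constructor(state) {
    this.storage = state.storage;
  }
  async fetch(request) {
    if (request.method === 'DELETE') {
      await this.storage.put({ content: null, deleted: true });
      return new Response(null, { status: 204 });
    }
    if (await this.storage.get('deleted')) {
      return new Response('Not found', { status: 404 });
    }
    return new Response(await this.storage.get('content'));
  }
}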

Why assets deserve Strong Consistency
Your new index.html is uploaded. It loads all of its new dependencies (JS modules, CSS, images…). These dependencies must exist. They must be immediately available. Not in 60 seconds. Now! Your whole PWA is released as a blue/green deployment. Everything or nothing. A generic import map tracks version changes of your files.
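For illustration, such a generic import map could look like this (the hashed filenames are hypothetical):

<!-- index.html imports logical names; only the import map changes per release -->
<script type="importmap">
{
  "imports": {
    "menu": "/assets/a8ef-menu.js",
    "app": "/assets/2c41-app.js"
  }
}
</script>
<script type="module">import "app";</script>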

Fast Deployment
You upload the new static files. Optional steps: you upload the new Worker; you purge the cache for static pathnames (/, sw.js…) if they must deliver a new version. Your new PWA is available in less than 5 seconds. You can test your PWA online as fast as you code, while still having control over your custom build process.
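The optional purge step can be a single call to the cache purge API, something like this (the zone ID, token, and hostname are placeholders):

// Purge the edge cache for the static pathnames that must serve a new version
const ZONE_ID = '<zone-id>';
const API_TOKEN = '<api-token-with-cache-purge-permission>';
await fetch(`https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache`, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_TOKEN}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ files: ['https://example.com/', 'https://example.com/sw.js'] })
});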

2 Likes

Personally I haven't yet had the chance to get my hands on Durable Objects to see how exactly they work, so I'm not too sure if Durable Objects would be best for storing assets.
I have noticed recently, though, that the CEO of Cloudflare has said that Workers will receive some new database features in the future, which should be more similar to a real distributed database, without the possible 60-second delay.

Unfortunately the original tweet/question about this got deleted, but you should still be able to see part of it in my previous post below when you view it in its original thread by clicking on it:

3 Likes

DO for assets is, for me, "over-engineering", and possibly very expensive (pricing still unknown).
I use my own KV solution that is sub-second and strongly read-after-write consistent; it is not that fancy, but it works.

DO will be a better fit for inventory-dependent transactions and messaging.

1 Like

Then please explain your solution :slight_smile:

You can use the origin + worker to do that:

  • Check if the KV/global cache exists; if not…
  • Fetch file from origin in the worker
  • Write/Read the file to/from a global variable (for immediate cache)
  • Write the file to KV for “real” cache

The file will then be immediately available (from the global variable) until the KV write becomes active globally; see the sketch at the end of this post.

This will of course result in a few un-cached requests to the origin store, but it doesn’t need to be hard to scale, you can just put the file on S3.
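A minimal sketch of that flow, assuming a KV namespace bound as FILES and an origin bucket URL (both hypothetical):

// In-isolate cache for immediate availability right after a write,
// while the KV write is still propagating globally
const globals = new Map();

export default {
  async fetch(request, env) {
    const key = new URL(request.url).pathname;

    // 1. Global variable (instant, but only on this isolate)
    if (globals.has(key)) return new Response(globals.get(key));

    // 2. KV: the "real" cache (eventually consistent)
    const fromKV = await env.FILES.get(key, 'arrayBuffer');
    if (fromKV) return new Response(fromKV);

    // 3. Origin (e.g. S3) for the few uncached requests
    const origin = await fetch(`https://my-bucket.example.com${key}`);
    const body = await origin.arrayBuffer();

    // Write-through: global variable now, KV for later
    globals.set(key, body);
    await env.FILES.put(key, body);
    return new Response(body, { status: origin.status });
  }
};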

Using KV for cache would not bring much since cached data resulting from assets is by nature volatile and public. The private data (user data) will just be end-to-end encrypted.

The edge cache can cache these files cheaper and faster than KV.

Using an external origin (S3, GCS) certainly ensures strong consistency.

I was looking for a solution internal to Cloudflare, in order to:

  • reduce complexity (writes require OAuth2 on GCS)
  • reduce bandwidth cost (read/write)
  • improve performance (read/write)

These goals are achievable only if both the worker and the strongly consistent storage reside on the same network.

I think Durable Objects might replace S3 & GCS if you create a Durable Object class, call it "File" (or whatever), then give it the ability to take in and serve out { content + headers }. These are, for example, the basics of GCS. The only missing part is the back-office GUI where you can list and search the files, but let's say it's a first step.
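A hypothetical first cut of such a "File" class, ignoring the per-value size limit discussed further down:

// Hypothetical "File" Durable Object: takes in and serves out { content + headers }
export class File {
  constructor(state) {
    this.storage = state.storage;
  }
  async fetch(request) {
    if (request.method === 'PUT') {
      await this.storage.put({
        content: await request.arrayBuffer(),
        headers: [...request.headers] // e.g. Content-Type, Cache-Control
      });
      return new Response(null, { status: 201 });
    }
    const content = await this.storage.get('content');
    if (content === undefined) return new Response('Not found', { status: 404 });
    return new Response(content, { headers: await this.storage.get('headers') });
  }
}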

1 Like

Hi denis.

What you want to achieve is something a lot of people want to achieve too, I think, including myself.

My understanding of Durable Objects is that they are a solution mainly for people who want atomic writes and reads, a.k.a. atomic transactions, which is not possible with KV at the moment.

Durable Objects solve the problem of Workers being able to do atomic writes and reads, for sure, but not the problem of providing low-latency, fast-sync dynamic data delivery.

For low-latency, fast-sync dynamic data distribution, I think Durable Objects may not be the best solution at the moment.

One problem is that the read latency from location 2, where the Durable Object is not located, to location 1, where it is located, will be significantly higher than serving this data from the origin server using Argo Smart Routing to the opposite end of the globe.

The only way to reduce this latency gap with Durable Objects would be a master/slave sync architecture where you write to the master location for atomic writes, but you are able to read from multiple Durable Object locations that are synced from the master.

I don't think this master/slave architecture for Durable Objects exists at the moment.

If it existed, it would be a huge improvement, and I think a lot of people would use it.

So in the end it all depends on how fast reads from Workers can be done, from PoPs at the opposite end of the globe to the Durable Object location.

I don't have exact numbers yet on how fast these reads are.
If they are 200 ms or more, I would consider them slow.
50 ms would be great.

We will see.

2 Likes

That’s something most desire, but may not be valuable enough for Cloudflare to pursue.

Afaik, Durable Objects are geared to work only inside of Workers, so the Worker request cost will still exist.

Yes, every write or uncached read implies Worker Request + Durable Object Request.

CEO of Cloudflare answers : “we have a very fast network” :sweat_smile:

Their network is for sure fast and amazing, but these Durable Objects are not located where the Workers are, and you can choose only one location.

I have spent the last few days studying Durable Objects more, and my understanding so far is that Durable Objects currently exist in Europe as a location, and maybe the western USA.

You will have to choose one of these locations and see how fast the Worker requests from the opposite end of the globe are.

Personally, I think it may not be as fast as you think!

When a Worker tries to make a request to a Durable Object, it will need to do a DNS lookup, do SSL work, and at least some database lookup work.

I have seen numbers around 150 ms here in the forum, but I'm not sure.

For atomic writes and reads it's great for sure, but not for low-latency dynamic content delivery.

I have yet to see official numbers on how fast these requests are from around the world.

Yes, with or without Jurisdictional Restrictions on Durable Objects, the question of latency in the worst-case scenario remains. We need real-user metrics (read & write).

Object has no size limit

put(entriesObject) Promise

  • Takes an Object and stores each of its keys and values to storage. Each value can be any type supported by the structured clone algorithm, which is true of most types. Supports up to 128 key-value pairs at a time. Each key is limited to a max size of 2048 bytes and each value is limited to 32 KiB (32768 bytes).

The object has no size limit, but you need to split your file into chunks of at most 32,768 bytes in order to store it entirely.

You’ll have to convert the file to a Blob, then use blob.slice().

Something like this (not tested):

// Slice a file into chunks (stored as ArrayBuffers so the structured
// clone serializer accepts them)
const slice = async (v, type, chunkSize) => {
  const blob = new Blob([v], { type });
  const a = [];
  for (let i = 0; i < blob.size; i += chunkSize) {
    a.push(await blob.slice(i, Math.min(i + chunkSize, blob.size)).arrayBuffer());
  }
  return a;
}
// Put a file into storage
const put = async (k, v, type, storage) => {
  // Not sure of indexes so I minus 1
  const chunkSize = 32768 - 1;
  const a = await slice(v, type, chunkSize);
  // Index in `${k}-0` the number of chunks & metadata (type, etc)
  let b = { [`${k}-0`]: JSON.stringify({ length: a.length, type }) };
  const p = [];
  for (let i = 0; i < a.length; ++i) {
    // Chunk i goes under key `${k}-${i + 1}`
    b[`${k}-${i + 1}`] = a[i];
    // Once we reach 128 key-value pairs, we push them into storage
    if (Object.keys(b).length === 128) {
      p.push(storage.put(b));
      b = {};
    }
  }
  if (Object.keys(b).length) p.push(storage.put(b));
  await Promise.all(p);
}
// Put file into storage
await put('myKey', 'myValue', 'application/javascript', storage);
// Get file from storage
const get = async (k, storage) => {
  // `${k}-0` holds the number of chunks & metadata (type, etc)
  const { length, type } = JSON.parse(await storage.get(`${k}-0`));
  const p = [];
  let b = [];
  for (let i = 1; i <= length; ++i) {
    b.push(`${k}-${i}`);
    // Once we reach 128 keys, we get their values from storage
    if (b.length === 128) {
      p.push(storage.get(b));
      b = [];
    }
  }
  if (b.length) p.push(storage.get(b));
  // Each batched get() resolves to a Map; reassemble the chunks in key order
  const maps = await Promise.all(p);
  const a = [];
  let c = 1;
  for (const m of maps) {
    for (let j = 0; j < m.size; ++j) a.push(m.get(`${k}-${c++}`));
  }
  return { blob: new Blob(a, { type }), type };
}
// Get file from storage
const { blob, type } = await get('myKey', storage);
// Send back the blob & type to the calling worker
// Output it and it will be served under the given URL

You can also add metadata tags in `${k}-0`, like the Cache-Control directive, etc.

Reads, writes & pricing

Tensorflow.js (tf.min.js) is 880 kB, which means 880 / 32 ≈ 28 writes/reads. Yeah, 28 operations for a single file. Depending on the pricing model, it could cost a lot, like 28x the cost of KV if storage reads and writes are billed. If only the requests are billed, then the price would be on par with KV. Maybe the 28 writes/reads can be mitigated by passing an array (128 keys max), so it would count as a single read/write?

I'm really concerned by this.
The COCO machine learning model weighs 12 MB, for example.

TTL is on roadmap

Okay, so TTL is on the roadmap.
I hope the delete operations mentioned in the first post are also on the roadmap.

3 Likes

Have Durable Objects ever been meant to be used for storing files? I interpret them rather as a tool to enable write -> immediate read while storing somewhere else, even in KV, so you can tell the downloading client to wait for a file you know will be available soon.

Yes, or you could imagine a system where your worker has an object {date1: importMap1, date2: importMap2} and, given the date and the 60 s delay, loads the importMap1 or importMap2 files. The worker is intelligent and knows which version of the app to load.
I'd say in some cases it would be OK for production, if 60 s is a delay you can afford.
It depends on the business whether real time is necessary.
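A rough sketch of that idea, with hypothetical deploy dates and import maps hard-coded in the worker:

// Serve the previous import map while the latest deploy is younger than
// KV's ~60 s propagation window
const importMap1 = { imports: { app: '/assets/app-v1.js' } };
const importMap2 = { imports: { app: '/assets/app-v2.js' } };
const DEPLOYS = {
  '2020-10-01T10:00:00Z': importMap1,
  '2020-10-01T12:00:00Z': importMap2
};

function currentImportMap(now = Date.now()) {
  const dates = Object.keys(DEPLOYS).sort();
  const latest = dates[dates.length - 1];
  if (dates.length > 1 && now - Date.parse(latest) < 60_000) {
    // Too fresh: the new assets may not have propagated everywhere yet
    return DEPLOYS[dates[dates.length - 2]];
  }
  return DEPLOYS[latest];
}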

These mechanics do not solve immediate availability.
You still have the 60 s delay.
If you are in development, a 60 s delay between two tests is huge.
It's like waiting for a C++ compiler.
But you just need to preview your CSS / JS change.
It's kind of unacceptable.

It was just an example; you could store the files in any of the myriad object storage systems available.

I'd rather Cloudflare add object storage that we can combine with Durable Objects than shoehorn in a "solution" with something that isn't designed for it.

2 Likes

Agree.

1 Like

While there’s nothing stopping you from using DO for asset storage/deployment, it is indeed going to be slower from many parts of the world than just using KV, and probably more expensive.

KV’s consistency can be worked around – Workers Sites has been using KV for this purpose to good effect for quite a while now.

While you could concoct situations where it’d make sense, in general I’d suggest making sure the benefits outweigh the costs before choosing DO over KV for a project like this.

DO slower than KV (for assets)?

Correct me if I'm missing something:

If I follow the strategy behind Workers Sites, from this technical blog post:

This now allows us to, after first read per location, cache the static assets in the Cloudflare cache so that the assets can be stored on the edge indefinitely. This reduces reads to KV to almost nothing

Edge KVs are never called

Given the details, Workers Sites appears to be a Worker doing a one-time call to a central KV instance to get the asset, then putting the asset in the edge cache. The edge cache handles subsequent loads.

In this configuration, the edge KVs are never called, and we just make a single, one-time call to the central KV.
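In code, that strategy is roughly this (the KV binding name is illustrative):

// One KV read per asset per location; the edge cache serves everything after that
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);

    // Served from the edge cache on every request after the first
    let response = await caches.default.match(url.toString());
    if (response) return response;

    // First request in this location: read from the (central) KV store
    const asset = await env.STATIC_CONTENT.get(url.pathname, 'arrayBuffer');
    if (asset === null) return new Response('Not found', { status: 404 });

    response = new Response(asset, {
      headers: { 'Cache-Control': 'public, max-age=31536000, immutable' }
    });
    ctx.waitUntil(caches.default.put(url.toString(), response.clone()));
    return response;
  }
};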

central KV instance VS DO edge instance (other side of the world)

As a rough order of magnitude, calling a central instance is at least as slow as calling an edge instance on the other side of the world. I'd say there is no benefit to one solution or the other in terms of latency.

Amazon guy says KV is slow

In addition, some random people from Amazon (yeah, I know, never trust Amazon) say that getting values from the central KV instance is quite slow:
https://mpasierbski.com/post/2019-11-17-cloudflare-workers-kv-caveats/

taking as long as ~500ms at times from Zurich edge location. It was not that big of a problem on North America, so we reached out to support asking about the slow response times: “Depending on where in the world the request is coming from, the request time for a cold-start is on average about 100-300ms - the storage is held in the central US”. This adds extra 300ms to TTFB (Time to first byte) which is pretty bad.

If I may add to the testimony: Cloud Storage shows similar response times for both reads and writes internally using the APIs, but when you fetch the value for a read over the LB/CDN, even the first time and without cache, you get the expected new version of the value, and it is blazing fast, like 5-20 ms. I suppose the Cloud Storage APIs are slow and the LB/CDN is on steroids.

The only drawback of hitting the LB/CDN is that values are publicly accessible. If you store private values, just encrypt them.

So until we get blazing-fast reads from a Cloudflare storage solution, GCS is for now quite the solution: consistent writes at 400 ms and blazing-fast reads over the CDN at 5-20 ms. The load balancer, running 100% of the time, comes at a cost, 20 EUR per month, plus the bandwidth of the CDN, which is affordable but not free (vs. Cloudflare's free bandwidth).

That's why I wanted to try something with Durable Objects, given the claim of a "very fast network". And maybe it could remove the cost of using two cloud providers, which is cumbersome for such basic cloud operations.

KV's consistency?

I don't see how, unless we can ensure in the code that we are using the central instance (which I suppose is consistent) rather than an edge instance (which is eventually consistent and might not have the latest assets).

Like:

// Get value from central instance ; bypass edge KV instance
const central = true;
NAMESPACE.get(key, central);

If we can ensure we hit the central instance, somehow via a boolean, I'm interested in the solution. It won't solve the latency, but at least it would open up scenarios & alternatives.

Cost?

Well, I guess we’ll be delighted by the pricing in a couple of months :slight_smile:

That’s true in the abstract, but not if that “central” location is multi-homed in different parts of the world, such that reads don’t all have to go to a single point but can rather be served from the closest home.

I suspect you’ll find that cold read response times from Europe have improved significantly compared to when that post was written last year.

Workers Sites achieves this via versioning and write ordering. To my understanding, each newly deployed version of a site first uploads all of its assets, then uploads the manifest that points to the assets, then starts using the new manifest (and thus all the new referenced assets).
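A rough sketch of that write ordering, as a deploy script against the KV REST API (account/namespace IDs, token, and file names are placeholders):

// Upload the new hashed assets first, then the manifest that points to them,
// so no reader ever sees a manifest referencing assets that don't exist yet
const ACCOUNT_ID = '<account-id>';
const NAMESPACE_ID = '<kv-namespace-id>';
const API_TOKEN = '<api-token>';

const kvPut = (key, value) =>
  fetch(`https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/storage/kv/namespaces/${NAMESPACE_ID}/values/${encodeURIComponent(key)}`, {
    method: 'PUT',
    headers: { 'Authorization': `Bearer ${API_TOKEN}` },
    body: value
  });

const newAssets = [{ hashedName: 'app.3f9c.js', content: 'console.log("v2")' }];
const manifest = { 'app.js': 'app.3f9c.js' };

// 1. All assets of the new version
await Promise.all(newAssets.map(a => kvPut(a.hashedName, a.content)));
// 2. Only then, the manifest (the "switch" to the new version)
await kvPut('manifest.json', JSON.stringify(manifest));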

1 Like

So it's a multi-regional central instance.
Is this central instance hosted on a single continent, or on multiple continents?

For comparison, GCS stores data on a single continent.

I suppose you are referencing this post:

I'll give it a try.

So Workers Sites knows when the assets are reliably available in KV.
And it hits the central instance only once, implying the central instance is expected to be strongly consistent (just like Cloud Storage). Am I right?

If that's the case, then there's no need to build asset storage on top of DO.

Do you know if there is source code somewhere, so I can check what's going on during the deployment and what assumptions Workers Sites makes?