Is it OK to store audio in the Cache API?

I know the TOS prohibits using the regular CDN to cache audio, but is it OK to use the Cache API to do so?

After all, the Workers TOS explicitly say it’s OK to store audio in KV.

LGTM for audio.

  1. Cloudflare Workers is a Service that permits developers to deploy and run encapsulated versions of their proprietary software source code (each a “Workers Script”) on Cloudflare’s edge servers. Cloudflare Pages is a JAMstack platform for frontend developers to collaborate and deploy websites. You may use Cloudflare Pages and Workers (whether in conjunction with a storage offering such as Cloudflare Workers KV and Durable Objects or not) to serve HTML content as well as non-HTML content (e.g., image files, audio files) other than video files.

Yeah but that refers to KV and Workers, not the Cache API per se.

In the Workers docs it’s mentioned you can put as much as 5GB per request on the Cache API:

So I’m guessing storing 20MB of audio should be fine?

It should be fine.

The Cache API can toss your data whenever it (CF) feels like it. It's also per-POP. Think of it as the free version of KV, except you can randomly lose your data. CF under-advertises the power of the Cache API vs KV/DO, since KV/DO is pay-per-hit while the Cache API is pay-per-month. Workers themselves are sort of pay-per-hit, though.


AFAIK by using the cache API you’re not avoiding the Workers reqs but the KV reqs.

Something I found out is that for every CORS req, Workers receives 2 reqs (OPTIONS + GET/POST/etc.). Then there's another extra req if the data comes from KV instead of the Cache API.

If you’re streaming audio (using fetch() and MediaSource) with range requests, every audio chunk costs 2-3 reqs. Depending on the chunk size and audio file length, you end up making dozens (even hundreds) of reqs per audio file. This really adds up once you start getting traffic.
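To put rough numbers on that, here is a back-of-the-envelope sketch. The 256 KB chunk size and the 20 MB file are assumptions for illustration only, and `estimateRequests` is a made-up helper, not anything from the Workers API.

```typescript
// Rough estimate of billable requests needed to stream one audio file.
// All numbers below are illustrative assumptions, not measurements.
function estimateRequests(
  fileBytes: number,
  chunkBytes: number,
  reqsPerChunk: number,
): number {
  const chunks = Math.ceil(fileBytes / chunkBytes);
  return chunks * reqsPerChunk;
}

const MB = 1024 * 1024;
const KB = 1024;

// 2 reqs per chunk: OPTIONS preflight + GET.
const low = estimateRequests(20 * MB, 256 * KB, 2);
// 3 reqs per chunk: OPTIONS preflight + GET + a KV read on a cache miss.
const high = estimateRequests(20 * MB, 256 * KB, 3);

console.log(`between ${low} and ${high} reqs for one 20 MB file`);
```

Under these assumptions a single listener playing one file end to end costs a few hundred requests, which is why dropping the preflight matters so much.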

I mostly agree: OPTIONS is the 1st Workers req hit, GET/POST is the 2nd. KV reads are a separate quota (another $5 a month) from Workers. You can always promote KV reads: put() the asset into the local Cache API, clone() it, and dispatch it to both the client and the local CF cache. Then on future reqs, always match() against the local Cache API before trying a KV read.
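A minimal sketch of that match() / KV read / put() flow, assuming nothing about your actual setup. `CacheLike` and `KVLike` are hypothetical stand-ins for `caches.default` and a KV namespace binding so the snippet stays self-contained. `getAudio`, the key names, the `audio/mpeg` content type and the one-day max-age are all made up for illustration.

```typescript
// Minimal shapes of the two stores. In a real Worker these would be
// caches.default and a KV namespace binding (assumption, not shown here).
interface CacheLike {
  match(key: string): Promise<Response | undefined>;
  put(key: string, response: Response): Promise<void>;
}
interface KVLike {
  get(key: string, type: "arrayBuffer"): Promise<ArrayBuffer | null>;
}

async function getAudio(
  url: string,
  kvKey: string,
  cache: CacheLike,
  kv: KVLike,
): Promise<Response> {
  // 1. Try the per-POP cache first; a hit costs no KV read.
  const hit = await cache.match(url);
  if (hit) return hit;

  // 2. Fall back to KV, which is billed as a separate read.
  const body = await kv.get(kvKey, "arrayBuffer");
  if (body === null) return new Response("Not found", { status: 404 });

  const response = new Response(body, {
    headers: {
      "Content-Type": "audio/mpeg",
      "Cache-Control": "public, max-age=86400",
    },
  });

  // 3. Promote into the cache. clone() is needed because a body can only
  //    be consumed once and the original still goes to the client.
  await cache.put(url, response.clone());
  return response;
}
```

In a real Worker you would probably hand the put() to `waitUntil` instead of awaiting it, so the client is not kept waiting on the cache write. Since the cache can evict at any time, KV stays the source of truth.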

Getting rid of the CORS preflights is your top priority. I’ve had to refactor APIs to remove most preflights to workers when creating workers.

You have to ignore the 21st-century design rule of no API keys/tokens in a GET URL's query args. It's 2021: there are no exploited, unmaintained, forgotten HTTP proxies left, and if there are, they can't decrypt SSL traffic. The other argument against tokens in URLs is that containers/origins get hacked all the time, and 99% of servers keep basic event logs on disk which, “in theory”, include full URLs (and tokens). But “in theory” those default event logs don't record full headers or POST bodies, so an Authorization header is “safer” against a criminal running out the front door of your datacenter with your origin server's SSD in his hand.

A no-tokens-in-GET-URLs rule is pointless if a hacker is running his own x86 daemon or has root shell access 24/7 and has LD_PRELOAD/pcap-tapped your nginx process.

Also, converting POST to GET is sometimes dangerous because GET is supposed to be an idempotent request. One API I've seen requires that state-changing GET requests include a client-selected GUID/one-time random hex string as a client transaction number. Calling GET again with the identical client transaction number is de-duplicated server side. This deals with TCP connections dropped mid-HTTP request, timeouts, and wifi vs cell handoffs. Traditionally you would use POST to prevent replay, but with enough code GETs can be made anti-replay too. But you are serving audio, not ecommerce transactions, so it's already idempotent.
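For what the de-duplication could look like, here is a toy sketch. `TransactionLog` is a hypothetical class, and the in-memory Map is an assumption made for brevity; a real server would need durable, shared storage and some expiry for old transaction numbers.

```typescript
// Hypothetical server-side de-duplication keyed by a client-chosen
// transaction number. In-memory only, for illustration.
class TransactionLog<T> {
  private readonly results = new Map<string, T>();

  // Runs `action` at most once per transaction number. A retry with the
  // same number gets the stored result instead of re-running the action.
  run(txId: string, action: () => T): T {
    if (this.results.has(txId)) {
      return this.results.get(txId) as T;
    }
    const result = action();
    this.results.set(txId, result);
    return result;
  }
}

// Example: a state-changing request that the client retries after a timeout.
let balance = 100;
const txLog = new TransactionLog<number>();
const debit = (): number => (balance -= 30);

const first = txLog.run("tx-4f9a", debit); // performs the debit
const retry = txLog.run("tx-4f9a", debit); // same number, no second debit
console.log(first, retry, balance);
```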

GET also requires that your JSON POST body be base64-encoded into a query arg, keeping in mind that staying under a 2 KB URL limit is good practice.
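A rough sketch of that encoding step, using the URL-safe base64 alphabet so the value needs no further escaping. The `q` parameter name, the helper names and the example URL are all assumptions. The 2048 limit is just the 2 KB guideline from above, not a hard standard.

```typescript
const MAX_URL_LENGTH = 2048; // the ~2 KB guideline, not a hard limit

// UTF-8 encode, then base64 with the URL-safe alphabet and no padding.
function toBase64Url(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Reverse of toBase64Url, e.g. for use inside the worker.
function fromBase64Url(encoded: string): string {
  const binary = atob(encoded.replace(/-/g, "+").replace(/_/g, "/"));
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Pack what used to be a JSON POST body into a single query arg.
function buildGetUrl(base: string, payload: unknown): string {
  const url = `${base}?q=${toBase64Url(JSON.stringify(payload))}`;
  if (url.length > MAX_URL_LENGTH) {
    throw new Error(`URL is ${url.length} chars, over ${MAX_URL_LENGTH}`);
  }
  return url;
}

const exampleUrl = buildGetUrl("https://api.example.com/audio", {
  file: "track.mp3",
  chunk: 3,
});
console.log(exampleUrl);
```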

If you have audio files, you need to rethink authentication to eliminate your OPTIONS requests. There are two routes. Either the audio file name and chunk name go in a POST body or an X-whatever HTTP header, with Cookie or Bearer token auth, so that 1 OPTIONS response with Access-Control-Max-Age per eyeball per week is all that's needed to GET unlimited audio files for the rest of the week; or you make your audio file GET requests from your eyeball JS code into CORS “simple request” compliant HTTP requests (see MDN's Cross-Origin Resource Sharing (CORS) page). The OPTIONS req will include which fields are triggering the preflight. You might need to put the Range header into a query string arg to make it a “simple request” and extract the Range back into a header in the worker. Or always include the audio file's name in a header, not in the URL, so all the GET requests have an identical URL (not browser-cache friendly, but if you are streaming or have a client service worker it doesn't matter). You might also need to either use the HLS format or round all range requests to a fixed chunk boundary (100 KB, 1 MB, or whatever) for client cache or CF cache friendliness.
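Here is one way the Range-in-query-string idea and the rounding could fit together, as a sketch only. The `range` parameter name, its `START-END` format, the 1 MiB chunk size and both helper names are assumptions. The worker would take the returned string and set it as the Range header on its own fetch to the origin or cache.

```typescript
const CHUNK = 1024 * 1024; // 1 MiB, an arbitrary choice

// Widen [start, end] (inclusive byte offsets) to whole-chunk boundaries so
// that different clients asking for nearby bytes share the same cache entry.
function alignRange(
  start: number,
  end: number,
  chunk: number = CHUNK,
): [number, number] {
  const alignedStart = Math.floor(start / chunk) * chunk;
  const alignedEnd = (Math.floor(end / chunk) + 1) * chunk - 1;
  return [alignedStart, alignedEnd];
}

// Read ?range=START-END from the URL and turn it back into a Range header
// value. Returns null when the arg is missing or malformed.
function rangeHeaderFromUrl(rawUrl: string): string | null {
  const param = new URL(rawUrl).searchParams.get("range");
  if (param === null) return null;

  const m = /^(\d+)-(\d+)$/.exec(param);
  if (m === null) return null;

  const start = Number(m[1]);
  const end = Number(m[2]);
  if (start > end) return null;

  const [alignedStart, alignedEnd] = alignRange(start, end);
  return `bytes=${alignedStart}-${alignedEnd}`;
}

console.log(
  rangeHeaderFromUrl("https://example.com/a.mp3?range=1500000-1600000"),
);
```

The aligned end can run past the end of the file. That is normally harmless for a byte range, but the client has to expect more bytes than it asked for and slice out the part it wanted.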

I forgot another part: have your front-end static HTML UI and your worker be on the same domain, or iframe your UI so it's a same-origin fetch. An iframe on the worker/CDN domain with postMessage/onmessage is another cross-origin trick with no CORS and no preflight. I don't have any experience with whether postMessage will crash a browser when sending massive binary loads through it.
