Range requests and the cache API

pier · April 23, 2021, 9:49pm

I’m trying to figure out how to use the cache API in workers with ranged requests.

The documentation is a bit confusing.

First it says that caching a response with a 206 status is considered invalid and will result in an error.

cache.put throws an error if… the response passed is a status of 206 Partial Content

And then it says:

Our implementation of the Cache API respects the following HTTP headers on the request passed to match() :

Range: Results in a 206 response if a matching response is found. Your Cloudflare cache always respects range requests, even if an Accept-Ranges header is on the response.

Isn’t this conflicting information?

mark-hatch · April 27, 2021, 3:35am

@pier I don’t believe this is conflicting. An error is thrown if the response from a fetch is a 206 and you try to cache it. But if it’s not a 206 (i.e. not a range response), you can cache it, and future requests that ask for a range of the cached content will succeed. I am curious, however, how Cloudflare recommends handling a range request for uncached content: asking the origin for the full content, caching it, and responding to that same request with just the requested range.
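One piece of that flow can be sketched without any Cloudflare specifics: before going to the origin, drop the headers that would make it answer with a partial response, so that what comes back is a full 200 that can be cached. This is only a minimal sketch on plain objects; `HeaderMap` and `withoutRangeHeaders` are hypothetical names, not part of the Workers API.

```typescript
// Hypothetical helper, not part of the Workers API.
type HeaderMap = Record<string, string>;

// Headers that could make the origin answer with a partial (206) response.
const RANGE_HEADERS = new Set(['range', 'if-range']);

function withoutRangeHeaders(headers: HeaderMap): HeaderMap {
  const copy: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    // Header names are case-insensitive, so compare in lower case
    if (!RANGE_HEADERS.has(name.toLowerCase())) {
      copy[name] = headers[name] as string;
    }
  }
  return copy;
}
```

In a Worker the equivalent would presumably be to copy the incoming request with `new Request(request)` and call `headers.delete('range')` on the copy before fetching, then cache the full response and slice the body for the client.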

pier · April 27, 2021, 3:52am

I’d like to know too.

AFAIK once you’ve cached something, you can only return it as-is after doing match(). At least that’s what all the Cache API examples do.

mark-hatch · April 27, 2021, 4:31pm

Currently we’re handling it like this, but I don’t think that’s the best approach. We’ve reached out to Cloudflare for assistance with this.

pier · September 14, 2021, 7:50pm

Hey Mark, are you still using that solution?

Were you able to get any assistance from Cloudflare on how to properly cache range requests?

pioty · July 24, 2022, 1:37pm

For anybody working on this, I was also very confused about how the cache could work. It turns out that the docs are right and accurate but it took me a while to get it.

In short:

The cache API can handle range requests and will return a valid 206 response if there is a match and the request has a Range header.
The cache API can’t store partial responses. In all cases, the full response should be stored (only one write); the Cache API then handles partial requests in the future.
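To make the second point concrete, here is a small guard that could run before `cache.put`. It is a sketch based on my reading of the Cache API docs, which (as far as I can tell) list three rejection cases: a non-GET request, a 206 response, and a `Vary: *` header. `cachePutRejection` is a made-up name, and the function only looks at plain values so it can be tried outside a Worker.

```typescript
// Hypothetical helper: returns the reason cache.put() would be expected to
// throw, or null if the request/response pair looks storable.
function cachePutRejection(
  method: string,
  status: number,
  vary: string | null
): string | null {
  if (method.toUpperCase() !== 'GET') {
    return 'request method is not GET';
  }
  if (status === 206) {
    // Partial responses cannot be stored; fetch and store the full body instead
    return 'response is 206 Partial Content';
  }
  if (vary !== null && vary.split(',').some((v) => v.trim() === '*')) {
    return 'response has Vary: *';
  }
  return null;
}
```

If the guard returns a reason for a 206, the fix is not to skip caching but to fetch the full object once and store that.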

This is how I’m doing it in a worker that handles requests for an R2 bucket:

const RANGE_CAP_BYTES = 2000000;

/**
 * Firefox and Chrome will first request bytes=0-
 * Safari will first request bytes=0-1
 *
 * According to MDN https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range
 * "The server can also ignore the Range header and return the whole document with a 200 status code."
 * Firefox and Chrome will respect that, and the header `bytes=0-` suggests that it's up to the server to either send whole document or a range
 * Safari, however, ignores the standard and will only accept a Ranged response of 2 bytes to check the full length first. It won't accept 200 with full file
 *
 * The part where Safari goes crazy is in the second request, as for small files it will ask from byte 0 again, even though 2 are sent
 * Firefox and Chrome respect bytes already sent, and continue paginating
 */
function parseRange(
  encoded: string | null
): undefined | { offset: number; length?: number } {
  if (encoded === null) {
    return;
  }

  if (!encoded.startsWith('bytes=')) {
    throw new Error('range units must be bytes');
  }

  if (encoded.includes(',')) {
    throw new Error('only single range supported');
  }

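  const parts = encoded.slice('bytes='.length).split('-');
  // Suffix ranges such as `bytes=-500` (empty start byte) are not handled here
  if (parts.length !== 2 || parts[0] === '') {
    throw new Error(
      'Not supported to skip specifying the beginning byte at this time'
    );
  }

  const offset = Number(parts[0]);
  if (!Number.isInteger(offset) || offset < 0) {
    throw new Error('invalid range start');
  }

  /**
   * Don't return length if browser didn't send an end byte
   * e.g.:
   * - first request from Firefox/Chrome with Range: bytes=0-
   * - subsequent requests from Firefox/Chrome with Range: bytes=100-
   *
   * Check the raw string rather than the number: `Number('') === 0`,
   * and an end byte of 0 is valid (`bytes=0-0` asks for the first byte)
   */
  if (parts[1] === '') {
    return { offset };
  }

  const end = Number(parts[1]);
  if (!Number.isInteger(end) || end < offset) {
    throw new Error('invalid range end');
  }

  // The end byte is inclusive, so the length is end - offset + 1
  return {
    offset,
    length: end - offset + 1
  };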
}

export async function handleRequest(event: FetchEvent): Promise<Response> {
  const { request } = event;
  const reqHeaders = request.headers;
  const resHeaders = new Headers();

  if (request.method !== 'GET') {
    return new Response('Method not allowed', { status: 405 });
  }

  const cache = caches.default;
  /**
   * Try cache
   * Cloudflare Cache will handle Range if any given in request and return 206
   */
  const cacheHit = await cache.match(request);
  if (cacheHit) {
    return cacheHit;
  }

  /**
   * Fetch object
   */
  const { pathname } = new URL(request.url);
  // drop the leading slash to get the object key
  const fileName = pathname.slice(1);
  const obj = await R2_BUCKET.get(fileName);
  if (obj === null) {
    return new Response('Not found.', { status: 404 });
  }

  // signal that Range requests are accepted
  resHeaders.set('accept-ranges', 'bytes');
  // pass any http metadata from r2 object to response headers
  obj.writeHttpMetadata(resHeaders);

  // handle response cache
  resHeaders.set('etag', obj.httpEtag);
  const cacheControl = resHeaders.get('cache-control');
  if (!cacheControl) {
    resHeaders.set('cache-control', 'max-age=2592000');
  }

  const range = parseRange(reqHeaders.get('range'));

  /**
   * RETURN FULL FILE
   */
  if (!range) {
    const response = new Response(obj.body, {
      headers: resHeaders
    });

    event.waitUntil(cache.put(request, response.clone()));

    return response;
  }

  /**
   *
   * RANGE REQUESTS
   *
   */
  const buffer = await obj.arrayBuffer();
  // cache the full file so that subsequent Range queries can be served
  event.waitUntil(
    cache.put(
      request,
      new Response(buffer.slice(0), {
        headers: resHeaders
      })
    )
  );
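  const { offset, length } = range;
  // A start byte at or past the end of the file cannot be satisfied
  if (offset >= obj.size) {
    return new Response('Range not satisfiable', {
      status: 416,
      headers: { 'content-range': `bytes */${obj.size}` }
    });
  }
  const isSmallFile = obj.size <= RANGE_CAP_BYTES;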

  /**
   * Handle first Range request from browsers like Firefox and Chrome where:
   * - a full 200 response is allowed
   * - the file is small
   */
  if (offset === 0 && isSmallFile && !length) {
    resHeaders.set(
      'content-range',
      `bytes ${offset}-${obj.size - 1}/${obj.size}`
    );
    return new Response(buffer, {
      headers: resHeaders,
      status: 206
    });
  }

  /**
   * Only return the bytes requested if full Range
   * e.g.: Safari for all requests
   */
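  if (length) {
    // Clamp to the end of the file: a view past the buffer end throws a RangeError
    const clamped = Math.min(length, obj.size - offset);
    resHeaders.set(
      'content-range',
      `bytes ${offset}-${offset + clamped - 1}/${obj.size}`
    );

    return new Response(new Uint8Array(buffer, offset, clamped), {
      headers: resHeaders,
      status: 206
    });
  }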

  /**
   * Return the rest of the file for subsequent requests where:
   * - a full 200 response is allowed
   * - the file is small
   *
   * This will be the case for non-Safari browsers that canceled a request, or connection broke
   */
  if (offset > 0 && isSmallFile) {
    // the last byte position is inclusive, hence size - 1
    resHeaders.set(
      'content-range',
      `bytes ${offset}-${obj.size - 1}/${obj.size}`
    );
    return new Response(new Uint8Array(buffer, offset), {
      headers: resHeaders,
      status: 206
    });
  }

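  /**
   * Return Partial Response for large files, capped at RANGE_CAP_BYTES
   */
  // Clamp so the view never runs past the end of the buffer
  const chunkLength = Math.min(RANGE_CAP_BYTES, obj.size - offset);
  resHeaders.set(
    'content-range',
    `bytes ${offset}-${offset + chunkLength - 1}/${obj.size}`
  );
  return new Response(new Uint8Array(buffer, offset, chunkLength), {
    headers: resHeaders,
    status: 206
  });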
}
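
The handler above does not support ranges without a start byte. For anyone who needs them, here is a rough sketch of resolving all three single-range forms (`bytes=a-b`, `bytes=a-`, `bytes=-n`) against a known file size, with an optional cap per response. `ResolvedRange` and `resolveRange` are names I made up for illustration. A `null` result means the header is malformed and should be ignored (serve the full file); `ok: false` means a 416 with the given `Content-Range`.

```typescript
// Hypothetical types and helper; names are illustrative only.
type ResolvedRange =
  | { ok: true; start: number; end: number; contentRange: string }
  | { ok: false; contentRange: string };

function resolveRange(
  header: string,
  size: number,
  cap: number = Infinity
): ResolvedRange | null {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null; // wrong unit, multiple ranges, or garbage
  const [, first = '', last = ''] = match;
  if (first === '' && last === '') return null;

  const unsatisfiable: ResolvedRange = { ok: false, contentRange: `bytes */${size}` };
  let start: number;
  let end: number;
  if (first === '') {
    // Suffix form `bytes=-n`: the last n bytes of the file
    const suffix = Number(last);
    if (suffix === 0) return unsatisfiable;
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = Number(first);
    if (last !== '' && Number(last) < start) return null; // invalid spec
    // Open-ended form `bytes=a-` runs to the last byte; explicit ends are clamped
    end = last === '' ? size - 1 : Math.min(Number(last), size - 1);
  }
  if (start >= size) return unsatisfiable;

  // Optional cap on how many bytes a single response carries
  end = Math.min(end, start + cap - 1);
  return { ok: true, start, end, contentRange: `bytes ${start}-${end}/${size}` };
}
```

With a resolved range, the body would be `buffer.slice(start, end + 1)`, since `end` is inclusive while `slice` takes an exclusive end index.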