Modifying HTML with Workers while Streaming

Hi, I am currently developing a Cloudflare Edge Worker script that evaluates a series of instructions to modify a page’s HTML body. I was reading about the Streams API that can be used in Workers, and the text following the example says: “This example just pumps the subrequest response body to the final response body; however, you can use more complicated logic, such as adding a prefix or a suffix to the body or to process it somehow.”

I was wondering: is it possible to modify the HTML body while using Streams? If the content is modified in chunks, what happens when the HTML element being modified is split between two chunks? Would the appropriate approach be to buffer the chunks to handle that case?

Any tips or examples would be appreciated. Thank you!

Hi,

It’s possible to modify the content, but you have to be careful with how large the pages are, as you might bump into the CPU limit. We use regex to modify the content of the chunks, and it works fine for pages up to a couple of MB.

Here’s an example of how to modify the content in the chunks:

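async function streamBody(readable, writable) {
  const reader = readable.getReader();
  const writer = writable.getWriter();

  const encoder = new TextEncoder();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();

    if (done) {
      break;
    }

    // stream: true keeps a multi-byte character that is split across
    // two chunks buffered until the rest of its bytes arrive.
    const text = decoder.decode(value, { stream: true });
    const result = doMagic(text);

    await writer.write(encoder.encode(result));
  }

  // Flush any bytes the decoder is still holding before closing.
  const tail = decoder.decode();
  if (tail) {
    await writer.write(encoder.encode(doMagic(tail)));
  }

  await writer.close();
}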

Just add an implementation of the doMagic function, which takes the decoded text of a chunk and returns the modified text.
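On the question of an element being split between two chunks: one possible approach is to hold back the last few characters of each chunk and prepend them to the next one, so a match that straddles the boundary is still found. The sketch below is only an illustration under some assumptions: createChunkReplacer is a hypothetical helper (not part of the Workers API), it handles a fixed search string rather than a regex, and it does no real HTML parsing.

```javascript
// Hypothetical helper: replaces every occurrence of the plain string `search`
// with `replacement`, even when an occurrence straddles two chunks.
function createChunkReplacer(search, replacement) {
  if (search.length === 0) {
    throw new Error('search must not be empty');
  }

  let carry = ''; // tail of the previous chunk that could begin a match

  return {
    // Call once per decoded chunk; returns the text that is safe to emit now.
    push(text) {
      const combined = carry + text;
      let out = '';
      let pos = 0;
      let idx;

      // Replace every complete match, scanning left to right.
      while ((idx = combined.indexOf(search, pos)) !== -1) {
        out += combined.slice(pos, idx) + replacement;
        pos = idx + search.length;
      }

      // Hold back up to search.length - 1 characters: only those could be
      // the start of a match that finishes in the next chunk.
      const rest = combined.slice(pos);
      const keep = Math.min(rest.length, search.length - 1);
      out += rest.slice(0, rest.length - keep);
      carry = rest.slice(rest.length - keep);
      return out;
    },

    // Call once after the last chunk; returns whatever was held back.
    flush() {
      const tail = carry;
      carry = '';
      return tail;
    },
  };
}

// Usage: the opening tag is split across the two chunks.
const replacer = createChunkReplacer('<h1>', '<h1 class="title">');
const chunks = ['<body><h', '1>Hello</h1></body>'];
let output = '';
for (const chunk of chunks) {
  output += replacer.push(chunk);
}
output += replacer.flush();
```

In streamBody, replacer.push(text) would take the place of doMagic(text), and the result of replacer.flush() would be written just before writer.close(). The trade-off is that at most search.length - 1 characters are delayed per chunk, so the response still streams instead of being buffered in full.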
