Currently I’m working with binary-encoded data stored in the KV store. As this is an initial phase, I’m trying to find the best possible way for my worker to process as much as possible in the shortest possible time. Due to the streaming nature of the data, I took the following approach:
- using `NAMESPACE.get(key, "stream")` to get access to the first byte as soon as possible
- using a BYOB reader via `stream.getReader({ mode: "byob" })`
- reusing the `ArrayBuffer` for the reads with the reader (rough setup sketch below).
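Roughly, that setup looks like this (the buffer size, `max`, and variable names here are placeholders, not my exact code):

```js
// Open the KV value as a stream and attach a BYOB reader.
const stream = await NAMESPACE.get(key, "stream");
const reader = stream.getReader({ mode: "byob" });

let buffer = new ArrayBuffer(16 * 1024); // reused across reads
let length = 0;
let reads = 0;
const max = 100000; // safety cap for the measurement loop
let result = { value: new Uint8Array(0) }; // so the loop below has something to check
```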
The size of the value I’m working with is ~10 MB. I tried the following approach to measure how big a single chunk is, and it looks like on average it’s ~1 KB (given the tight loop).
```js
while (reads < max && result.value) {
  result = await reader.read(new Uint8Array(buffer));
  if (result.value) {
    length += result.value.length;
    buffer = result.value.buffer; // read() detaches the buffer, so take it back for the next read
  }
  reads += 1;
}
```
I’m aware of the limitations mentioned in Common Issues of BYOB readers. Now my questions:
- Does this mean that, until a `minBytes` parameter is introduced to the `read` method, BYOB should not be used for bigger blobs of data because it won’t process them in time?
- Should `arrayBuffer` be used instead (see the snippet after these questions), even taking into consideration the memory constraint (128 MB AFAIK)?
- What are the scenarios that fit BYOB then?
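By `arrayBuffer` I mean reading the whole value in one go instead of streaming it, roughly:

```js
// Whole-value read: the entire ~10 MB value is buffered in memory at once.
const buf = await NAMESPACE.get(key, "arrayBuffer");
const bytes = new Uint8Array(buf);
```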
Not related, but while trying to upload multiple sizes of images (responsive versions), I tried to write my own form data parser, and reading bytes synchronously and sequentially in JS is quite slow. My custom form data parser alone took 40 ms, almost the whole CPU limit, so I called it a failure for my own needs. After that, I came up with another solution: I store an `arrayBuffer` that is a concatenation of the images I want to upload. In the content-type header, I pass the byte boundaries of each image and the width I expect for these images. I am now able to process all those images asynchronously, and the job is done in a reasonable amount of time. When a user asks for a particular size of the image, I do the reverse job: get the whole `arrayBuffer` from KV, plus metadata, then use the metadata I stored along with the value to retrieve the boundaries and the widths, slice the `arrayBuffer`, and get the proper version of the image.
So maybe your problem can be solved at store time, by storing the relevant information in metadata along with the value itself.
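Roughly like this (the key name, metadata shape, and boundary values are just illustrative, not my exact code):

```js
// Store time: concatenate the image versions and keep the byte boundaries in KV metadata.
await NAMESPACE.put("images:123", concatenatedBuffer, {
  metadata: { boundaries: [0, 152340, 298765], widths: [320, 768, 1920] },
});

// Read time: fetch the whole value plus its metadata, then slice out the requested version.
const { value, metadata } = await NAMESPACE.getWithMetadata("images:123", "arrayBuffer");
const i = metadata.widths.indexOf(requestedWidth);
const start = metadata.boundaries[i];
const end = metadata.boundaries[i + 1] ?? value.byteLength;
const image = value.slice(start, end);
```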
We really need more sample code for the streaming methods; it’s been sparse and confusing ever since I started with Workers (two years ago).
@denis.truffaut I haven’t thought about using metadata, for… storing metadata
At the same time, in my case the first bytes (let’s say kilobytes) of the payload already store that kind of information. Then I make consecutive jumps based on the data already read. By jumps, I mean reading and then discarding buffers, as there’s no method like `skip(length)` that returns a Promise.
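A skip can only be emulated by reading into a scratch buffer and throwing the bytes away until enough have been consumed; a rough sketch (the helper name and scratch size are mine, not an existing API):

```js
// Hypothetical helper: skip `n` bytes of a BYOB reader by reading and discarding.
async function skip(reader, n, scratchSize = 16 * 1024) {
  let remaining = n;
  let scratch = new ArrayBuffer(scratchSize);
  while (remaining > 0) {
    const view = new Uint8Array(scratch, 0, Math.min(remaining, scratchSize));
    const { done, value } = await reader.read(view);
    if (done || !value) break;
    remaining -= value.length;
    scratch = value.buffer; // reclaim the detached buffer for the next read
  }
}
```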
I’m still trying to make sense of it. If streaming with BYOB can’t do it, then I need to read the value as an `arrayBuffer`. But then again, when does BYOB with streaming make sense at all, if it saturates the limit for bigger blobs?
I’d love to see the same, @thomas4. More examples that deal with a slightly more complex case than just the identity transformation of piping to the output.