Storing R2 object throws an error for some readable streams

When trying to store an R2 object, the following code works when the URL points to binary content (like an image or PDF) but not when it points to text:

const response = await fetch(url)
await env.MY_BUCKET.put('somekey', response.body)

For text URLs (e.g. https://google.com), I get this error:
TypeError: Provided readable stream must have a known length (request/response body or readable half of FixedLengthStream)

Is that because the fetch response does not contain a Content-Length header? What should I do in this case: catch the exception and try something like
env.MY_BUCKET.put('somekey', await response.text())
?
Better ideas?
Thanks!

That’d be my understanding of it personally.

Assuming you’re trying to fetch and store HTML, I’d fall back to text() like you said. However, you could check whether response.headers.get('content-length') returns anything rather than catching an exception.

You’ll want to include the response headers you got in your put operation so that the content-type is brought along correctly; otherwise the object ends up stored without it.

Pass them to put through its optional options parameter, as the httpMetadata object, like so:

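// Streams passed to put() need a known length, so buffer the body as text
// when the response has no Content-Length header.
let body;
if (response.headers.get('content-length') === null) {
    body = await response.text();
} else {
    body = response.body;
}

// Forward the response headers so that content-type and the rest are stored too.
await env.BUCKET.put('test', body, {
    httpMetadata: response.headers
});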

Notably, httpMetadata consists of various content-* and cache-* headers:

interface R2HTTPMetadata {
  contentType?: string;
  contentLanguage?: string;
  contentDisposition?: string;
  contentEncoding?: string;
  cacheControl?: string;
  cacheExpiry?: Date;
}

If there are any of these that you’d rather not bring along, specify the ones you want manually instead of passing the entire response.headers object.
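
To make the manual route concrete, here is a minimal sketch of picking only some fields out of the response headers. The helper name pickHttpMetadata and the METADATA_HEADERS table are made up for illustration and are not part of the R2 API; cacheExpiry is left out because it is a Date rather than a plain header string.

```javascript
// Maps R2HTTPMetadata field names to the response headers they come from.
const METADATA_HEADERS = {
  contentType: 'content-type',
  contentLanguage: 'content-language',
  contentDisposition: 'content-disposition',
  contentEncoding: 'content-encoding',
  cacheControl: 'cache-control',
};

// Hypothetical helper: build an httpMetadata object containing only the
// requested fields, skipping any header the response did not send.
function pickHttpMetadata(headers, fields = ['contentType', 'cacheControl']) {
  const metadata = {};
  for (const field of fields) {
    const value = headers.get(METADATA_HEADERS[field]);
    if (value !== null) {
      metadata[field] = value;
    }
  }
  return metadata;
}
```

You would then pass httpMetadata: pickHttpMetadata(response.headers) in the put options in place of the whole response.headers object.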

Sorry for dredging up an old thread, but I’m running into this problem in my worker, too. The worker is fetching some data and trying to store it in R2. The endpoint it’s fetching from does not include a content-length header in the response, so I get the "Provided readable stream must have a known length" error mentioned here.

The data I’m fetching is binary, so I can’t use .text(), and using .blob() to read it as binary results in the "Network connection lost" error mentioned in this other thread.

How can I solve this without using .text() as mentioned by @KianNH?
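
One binary-safe variation on the fallback above is to buffer with arrayBuffer() instead of text(). This is only a sketch: the helper name toPutBody is made up for illustration, it assumes the body is small enough to hold in the Worker's memory, and it has not been tested against the endpoint described here, so it may run into the same error as .blob() if the cause is the size of the body.

```javascript
// Hypothetical helper: choose a body that put() can accept.
// If the response advertises a length, stream it through unchanged;
// otherwise read the whole body into an ArrayBuffer, which keeps binary
// data intact (unlike text()) and has a known length.
async function toPutBody(response) {
  if (response.headers.get('content-length') !== null) {
    return response.body;
  }
  return await response.arrayBuffer();
}
```

Usage would look like await env.MY_BUCKET.put('somekey', await toPutBody(response), { httpMetadata: response.headers }).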