R2: Multipart uploads not working with Minio client

mc is an S3-compatible client from Minio. However, when doing multipart uploads to R2 with it, R2 returns an error saying "The XML you provided was not well formed or did not validate against our published schema."

I tested the same file using the AWS CLI and it seems to work.

Here is some context:

I was testing a couple of S3-compatible storage providers, R2 being one of them. I have found mc, the Minio client, to be a more natural tool for interacting with any S3-compatible API.

I was testing uploads of larger files, which trigger a multipart upload. I tested against S3, Backblaze B2, and then finally R2. However, when I got to testing R2, I got the following error from mc:

Failed to copy `**REDACTED**/2022-01-01T05_00_00Z_PT1H.parquet`. The XML you provided was not well formed or did not validate against our published schema.

After digging a little into mc and running mc with debug logs, I found that the error message is returned by R2 along with a 400 Bad Request:

Request

POST ****/2022-01-01T05_00_00Z_PT1H.parquet?uploadId=ABrSSDpJ9qd09qLB27UAKz%2BMCaCGxKVHtOiVxCvlnNyz%2BAznC4TymYpV6WglQltBUwXMLEzHDCqwjz4ZZaW62gQ02eGm%2FtyB19A2qC%2FyaH2fL%2FEfSS0AMxG%2B60dHrUlaK2aopVS9zs6p62iJehgivRMWINhbX7OS12cXt%2BXlJqxFo8eodxn0fHCJrpaMtz7o%2BSAdAtmia5xWepMr6dXFNeZtpARYjydNvCMWmJWdDi%2BUDkbG2hFePsHLESZD7CXNgSCFGj30K33JrhwiImcfJivc4KomkS8XGigZqTf94Ityn%2FhuCIJKCaarT2HAsPgf4A%3D%3D HTTP/1.1
Host: ****.r2.cloudflarestorage.com
User-Agent: MinIO (linux; amd64) minio-go/v7.0.27 mc/RELEASE.2022-06-26T18-51-48Z
Content-Length: 5144
Authorization: AWS4-HMAC-SHA256 Credential=****/20220703/auto/s3/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
Content-Type: application/octet-stream
X-Amz-Content-Sha256: b0ff115848d1eade087bf77d66c70a30ff93af21b3dfa8ac8dc12d0af9dc74b1
X-Amz-Date: 20220703T101550Z
Accept-Encoding: gzip

Response

HTTP/1.1 400 Bad Request
Content-Length: 149
Cf-Ray: 724edcd00d693607-MAN
Connection: keep-alive
Content-Type: application/xml
Date: Sun, 03 Jul 2022 10:15:53 GMT
Expect-Ct: max-age=604800, report-uri="****"
Server: cloudflare

<Error><Code>MalformedXML</Code><Message>The XML you provided was not well formed or did not validate against our published schema.</Message></Error>
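
For anyone scripting a similar check, here is a minimal sketch of pulling the error code and message out of a response body like the one above. It uses only the Python standard library; the function name `parse_s3_error` is made up for illustration and is not part of mc or any SDK.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def parse_s3_error(body: str) -> dict:
    """Return the child elements of an S3-style <Error> document as a dict."""
    root = ET.fromstring(body)
    if local_name(root.tag) != "Error":
        raise ValueError("not an S3 error document")
    return {local_name(child.tag): (child.text or "") for child in root}


# The body R2 returned in the debug log above.
body = (
    "<Error><Code>MalformedXML</Code>"
    "<Message>The XML you provided was not well formed or did not "
    "validate against our published schema.</Message></Error>"
)
err = parse_s3_error(body)
print(err["Code"], "-", err["Message"])
```

The `MalformedXML` code is returned for schema problems as well as for XML that does not parse, so it does not say which element the server rejected.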

To check if the XML sent as part of the request was malformed or not, I modified mc to output the XML before sending it to R2. I found the XML that was triggering the 400 was a CompleteMultipartUpload message. After comparing with the AWS docs for this message, everything seemed to match up. Here is part of the XML:

<?xml version="1.0"?>
<CompleteMultipartUpload xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Part xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <PartNumber>1</PartNumber>
    <ETag>AC1GJQgYfVqoSXiBAbsceB7AGHhTrGQo5yxb/gKRPa5U8mFjEJ8mY5yXCSWXVCgnSC3htAhdtH4PPIBuk43sZRwyZ2NMI+jdBp277XB6iGWa7gokwzP+5D+qOT3F6yfvoTdlhMCAHlUuZGXLCGMLAaB9nMdUdM40xJenddNWb/PdGI5Ywn5IKKsRFrb4Xwnh85SMp90I7LZxpGuy6J3mqSdYSuqhPh4MndcnHDd4aS6EXmm9lfm2hnISFXgz4Pngog==</ETag>
  </Part>
  <Part xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <PartNumber>2</PartNumber>
    <ETag>APGSqyqEDKN2bG68x/4CykDaifv9wit9RVwXTvC10MDGu+6gXviP4ws1I/SOqb8unzH0WM3QaELjbKshCTw02HA5N6T7vjQ/bsHwAon1dfC9Q4t7S/fis/LTjlY55IIE5fzsSAZPR8BN3T7LdADDiSO7HomrlhTlX5sgajRq0CA1JmP+aqFQENouDpi/U2vvZsrUWxwTxIUMV3pjwdPdJ1xs+V/kU7qT0hCIpO9cItI69ZzrzLx0/M8IdYKlRDYwDw==</ETag>
  </Part>

...
</CompleteMultipartUpload>

At this point, I decided to try uploading the same file using the AWS CLI. This worked as expected. After inspecting the difference between the XML for CompleteMultipartUpload sent by mc and aws s3, I noticed that the text body of ETag was quoted by aws s3 and left unquoted by mc.

To make sure this was the problem, I modified mc to quote the body of ETag and tada, it worked! Here is the XML that worked:

<?xml version="1.0"?>
<CompleteMultipartUpload xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Part xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <PartNumber>1</PartNumber>
    <ETag>"AOCRehBvt+Esfw9HEvgxgFDspcj1uubiJbAqFcY5u6iv0qBikr57n0Qs7FvHR10LBaoIk7VO4y9GCw1GtbaOUgaT+aViejBuJf04w3vOBq0doF3H0SeZFYHkAqWhcX7QUGiwAc+k8R9nP+WVMXSiXGhzIh1jUX/oRNOfp5cQt951JFxuu9G/ft28dajy7HOb7DXoSBSiwqT6hscj3mqQ3N1XvnuHS71zEwAhnxSpAc392hGr6SH6U+41ssuUeM7bKw=="</ETag>
  </Part>
  <Part xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <PartNumber>2</PartNumber>
    <ETag>"AMomqON9e7lrEVVHugMeVOLX74XWpe/CIKabpT8Xbg34iYWfPXUNI/Dz2OfPiswj5E/ZrdCSmGg1nH1lrZKy1mrWD/LP03enTcKkb14wiV4I5UQ2J2HIYX427fYGsIM6yAWWPasz/Py1xWQmTA7ZLtaGnC6DaMX1dBnuFHk3FfQjxLU3evFmHA/bC3Ci06foa8rm7ygFetoTGq9CEbUBBnFjDVj2sOQSmKfGBuAsxKLPQrK6vrnthLPbbo9cHg1jaQ=="</ETag>
  </Part>

...
</CompleteMultipartUpload>
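
To illustrate the change, here is a rough sketch in Python (not the actual patch to mc, which is written in Go) of building a CompleteMultipartUpload body with every ETag wrapped in double quotes. The helper names `quote_etag` and `build_complete_body` are hypothetical, and unlike mc's output this sketch does not repeat the `xmlns` attribute on each `Part`.

```python
import xml.etree.ElementTree as ET

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def quote_etag(etag: str) -> str:
    """Wrap an ETag in double quotes unless it is already quoted."""
    etag = etag.strip()
    if len(etag) >= 2 and etag.startswith('"') and etag.endswith('"'):
        return etag
    return f'"{etag}"'


def build_complete_body(parts) -> str:
    """Build the XML body from (part_number, etag) pairs.

    Parts are emitted in ascending part-number order.
    """
    root = ET.Element("CompleteMultipartUpload", xmlns=S3_NS)
    for number, etag in sorted(parts):
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = quote_etag(etag)
    return ET.tostring(root, encoding="unicode")


# Placeholder ETags; real ones come from the UploadPart responses.
print(build_complete_body([(1, "abcdef"), (2, '"123456"')]))
```

Because `quote_etag` leaves already-quoted values alone, it is safe to apply whether or not the ETag returned for each uploaded part kept its quotes.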

I’m still scratching my head over at least one issue. I use Transmit on the Mac (now discontinued, I believe). Works great on all S3-compatible services I use…except R2. Same connection using rclone works fine.

The point being that R2 seems to have some subtle differences that don’t work for some clients.

Their schema might be the one in DevDocs:


Thanks! Now that I look into it, it was Transmit for iOS that’s discontinued. Shame. Desktop client is still awesome, though.

Correction:

Doesn’t work

<ETag>abcdef</ETag>

Works

<ETag>"abcdef"</ETag>
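
If you have a captured request body and want to know whether it would hit this problem, a small check like the following might help. This is only a sketch using the Python standard library; `unquoted_parts` is a name made up for this example, and it assumes the body is a CompleteMultipartUpload document like the ones shown earlier in the thread.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def is_quoted(etag: str) -> bool:
    """True if the value starts and ends with a double quote."""
    return len(etag) >= 2 and etag[0] == '"' and etag[-1] == '"'


def unquoted_parts(body: str) -> list:
    """Return the part numbers whose ETag is missing or not quoted."""
    root = ET.fromstring(body)
    bad = []
    for part in root.iter():
        if local_name(part.tag) != "Part":
            continue
        number, etag = None, None
        for child in part:
            name = local_name(child.tag)
            if name == "PartNumber":
                number = int(child.text)
            elif name == "ETag":
                etag = (child.text or "").strip()
        if etag is None or not is_quoted(etag):
            bad.append(number)
    return bad


sample = (
    '<CompleteMultipartUpload xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Part><PartNumber>1</PartNumber><ETag>abcdef</ETag></Part>"
    '<Part><PartNumber>2</PartNumber><ETag>"abcdef"</ETag></Part>'
    "</CompleteMultipartUpload>"
)
print(unquoted_parts(sample))
```

An empty list means every part carries a quoted ETag, which is the form R2 accepted in the tests described above.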

So here is the question: is this something that needs to be changed on the R2 side? It looks like other S3 implementations, including S3 itself, can handle an ETag body without quotes, and mc works fine with those providers. Alternatively, I could raise this issue with mc and maybe get the change for quoting the ETag body upstreamed, but I am skeptical about that route.

Technically, according to the S3 spec, the ETag is supposed to be quoted. I’ll get this patched up to support mc’s nonstandard usage that works with S3.

Note I am creating this post hoping that someone that works on R2 at Cloudflare finds this helpful.

In this post **** = redacted.
