Note: I am creating this post in the hope that someone who works on R2 at Cloudflare finds it helpful.
In this post **** = redacted.
Context
I was testing a couple of S3-compatible storage providers, R2 being one of them. I have found mc, the MinIO client, to be a more natural tool for interacting with any S3-compatible API.
I was testing uploads of larger files, which trigger a multipart upload. I tested against S3, Backblaze B2, and finally R2. However, when I got to R2, I got the following error from mc:
Failed to copy `****`. The XML you provided was not well formed or did not validate against our published schema.
Debugging
After digging a little into mc and running it with debug logs, I found that the error message is returned by R2 along with a 400 Bad Request:
Request
POST **** HTTP/1.1
Host: ****
User-Agent: MinIO (linux; amd64) minio-go/v7.0.27 mc/RELEASE.2022-06-26T18-51-48Z
Content-Length: 5144
Authorization: AWS4-HMAC-SHA256 Credential=****, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=****
Content-Type: application/octet-stream
X-Amz-Content-Sha256: b0ff115848d1eade087bf77d66c70a30ff93af21b3dfa8ac8dc12d0af9dc74b1
X-Amz-Date: 20220703T101550Z
Accept-Encoding: gzip
Response
HTTP/1.1 400 Bad Request
Content-Length: 149
Cf-Ray: 724edcd00d693607-MAN
Connection: keep-alive
Content-Type: application/xml
Date: Sun, 03 Jul 2022 10:15:53 GMT
Expect-Ct: max-age=604800, report-uri="****"
Server: cloudflare
<Error><Code>MalformedXML</Code><Message>The XML you provided was not well formed or did not validate against our published schema.</Message></Error>
To check whether the XML sent as part of the request was malformed, I modified mc to output the XML before sending it to R2. I found that the XML triggering the 400 was a CompleteMultipartUpload message. After comparing it with the AWS docs for this message, everything seemed to match up. Here is part of the XML:
<?xml version="1.0"?>
<CompleteMultipartUpload xmlns="">
<Part xmlns="">
<PartNumber>1</PartNumber>
<ETag>AC1GJQgYfVqoSXiBAbsceB7AGHhTrGQo5yxb/gKRPa5U8mFjEJ8mY5yXCSWXVCgnSC3htAhdtH4PPIBuk43sZRwyZ2NMI+jdBp277XB6iGWa7gokwzP+5D+qOT3F6yfvoTdlhMCAHlUuZGXLCGMLAaB9nMdUdM40xJenddNWb/PdGI5Ywn5IKKsRFrb4Xwnh85SMp90I7LZxpGuy6J3mqSdYSuqhPh4MndcnHDd4aS6EXmm9lfm2hnISFXgz4Pngog==</ETag>
</Part>
<Part xmlns="">
<PartNumber>2</PartNumber>
<ETag>APGSqyqEDKN2bG68x/4CykDaifv9wit9RVwXTvC10MDGu+6gXviP4ws1I/SOqb8unzH0WM3QaELjbKshCTw02HA5N6T7vjQ/bsHwAon1dfC9Q4t7S/fis/LTjlY55IIE5fzsSAZPR8BN3T7LdADDiSO7HomrlhTlX5sgajRq0CA1JmP+aqFQENouDpi/U2vvZsrUWxwTxIUMV3pjwdPdJ1xs+V/kU7qT0hCIpO9cItI69ZzrzLx0/M8IdYKlRDYwDw==</ETag>
</Part>
...
</CompleteMultipartUpload>
At this point, I decided to try uploading the same file using the AWS CLI. This worked as expected. After inspecting the difference between the CompleteMultipartUpload XML sent by mc and by aws s3, I noticed that the text body of ETag was quoted by aws s3 and left unquoted by mc.
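For anyone who wants to run the same comparison on their own captured request bodies, here is a minimal sketch in Python. The function name unquoted_etag_parts is hypothetical and not part of mc or the AWS CLI; it only assumes the body has the Part / PartNumber / ETag layout shown above, with or without an XML namespace.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Drop a '{namespace}' prefix so bodies with and without xmlns compare equally."""
    return tag.rsplit("}", 1)[-1]


def unquoted_etag_parts(body):
    """Return the part numbers whose ETag text is not wrapped in double quotes.

    Hypothetical helper for inspecting a captured CompleteMultipartUpload body.
    """
    root = ET.fromstring(body)
    unquoted = []
    for part in root.iter():
        if _local(part.tag) != "Part":
            continue
        # Collect the child elements of this <Part> by local tag name.
        fields = {_local(child.tag): (child.text or "").strip() for child in part}
        etag = fields.get("ETag", "")
        quoted = len(etag) >= 2 and etag[0] == '"' and etag[-1] == '"'
        if not quoted:
            unquoted.append(int(fields["PartNumber"]))
    return unquoted
```

Running this on the body sent by mc should list every part, while the body sent by aws s3 should give an empty list.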
To make sure this was the problem, I modified mc to quote the body of ETag and, ta-da, it worked! Here is the XML that worked:
<?xml version="1.0"?>
<CompleteMultipartUpload xmlns="">
<Part xmlns="">
<PartNumber>1</PartNumber>
<ETag>"AOCRehBvt+Esfw9HEvgxgFDspcj1uubiJbAqFcY5u6iv0qBikr57n0Qs7FvHR10LBaoIk7VO4y9GCw1GtbaOUgaT+aViejBuJf04w3vOBq0doF3H0SeZFYHkAqWhcX7QUGiwAc+k8R9nP+WVMXSiXGhzIh1jUX/oRNOfp5cQt951JFxuu9G/ft28dajy7HOb7DXoSBSiwqT6hscj3mqQ3N1XvnuHS71zEwAhnxSpAc392hGr6SH6U+41ssuUeM7bKw=="</ETag>
</Part>
<Part xmlns="">
<PartNumber>2</PartNumber>
<ETag>"AMomqON9e7lrEVVHugMeVOLX74XWpe/CIKabpT8Xbg34iYWfPXUNI/Dz2OfPiswj5E/ZrdCSmGg1nH1lrZKy1mrWD/LP03enTcKkb14wiV4I5UQ2J2HIYX427fYGsIM6yAWWPasz/Py1xWQmTA7ZLtaGnC6DaMX1dBnuFHk3FfQjxLU3evFmHA/bC3Ci06foa8rm7ygFetoTGq9CEbUBBnFjDVj2sOQSmKfGBuAsxKLPQrK6vrnthLPbbo9cHg1jaQ=="</ETag>
</Part>
...
</CompleteMultipartUpload>
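The workaround amounts to wrapping each ETag in double quotes before the body is serialized. The actual change was made inside mc (which is written in Go); the following is only a Python sketch of the same idea, and the names quote_etag and build_complete_multipart_upload are hypothetical. It omits the XML declaration and xmlns attributes for brevity.

```python
import xml.etree.ElementTree as ET


def quote_etag(etag):
    """Wrap an ETag in double quotes unless it is already quoted."""
    etag = etag.strip()
    if len(etag) >= 2 and etag.startswith('"') and etag.endswith('"'):
        return etag
    return '"' + etag + '"'


def build_complete_multipart_upload(parts):
    """Build a CompleteMultipartUpload body from (part_number, etag) pairs.

    Parts are emitted in ascending part-number order, each with a quoted ETag.
    """
    root = ET.Element("CompleteMultipartUpload")
    for number, etag in sorted(parts):
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = quote_etag(etag)
    return ET.tostring(root, encoding="unicode")
```

Because quote_etag leaves already-quoted values alone, the same helper can be applied whether the ETags come back from the server quoted or bare.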
So here is the question: is this something that needs to be changed on the R2 side? It looks like other S3 implementations, including S3 itself, can handle the ETag body without quotes, and mc works fine with these other providers. Alternatively, I could raise this issue with mc and maybe get the ETag quoting change upstreamed, but I am skeptical that this would be accepted.