Hi all,
I’m trying to read a dataset stored in Cloudflare R2 from Databricks, but I’m hitting an error (full log below). It looks like an incompatibility between R2 and the S3 API: the listing request asks for 5000 keys, and R2 rejects any MaxKeys value above 1000. Has anyone run into this and/or know of a solution? Thanks
AWSBadRequestException: listStatus on s3a://indexed-xyz/ethereum/decoded/logs/v1.2.0/partition_key=ff/dt=2023: com.amazonaws.services.s3.model.AmazonS3Exception: MaxKeys params must be positive integer <= 1000.; request: GET https://indexed-xyz.ed5d915e0259fcddb2ab1ce5592040c3.r2.cloudflarestorage.com {key=[ethereum/decoded/logs/v1.2.0/partition_key=ff/dt=2023/], key=[false], key=[5000], key=[2], key=[/]} Hadoop 3.3.4, aws-sdk-java/1.12.189 Linux/5.10.147+ OpenJDK_64-Bit_Server_VM/25.345-b01 java/1.8.0_345 scala/2.12.14 vendor/Azul_Systems,_Inc. cfg/retry-mode/legacy com.amazonaws.services.s3.model.ListObjectsV2Request; Request ID: null, Extended Request ID: null, Cloud Provider: GCP, Instance ID: unknown (Service: Amazon S3; Status Code: 400; Error Code: InvalidMaxKeys; Request ID: null; S3 Extended Request ID: null; Proxy: null), S3 Extended Request ID: null:InvalidMaxKeys: MaxKeys params must be positive integer <= 1000.; request: GET https://indexed-xyz.ed5d915e0259fcddb2ab1ce5592040c3.r2.cloudflarestorage.com {key=[ethereum/decoded/logs/v1.2.0/partition_key=ff/dt=2023/], key=[false], key=[5000], key=[2], key=[/]} Hadoop 3.3.4, aws-sdk-java/1.12.189 Linux/5.10.147+ OpenJDK_64-Bit_Server_VM/25.345-b01 java/1.8.0_345 scala/2.12.14 vendor/Azul_Systems,_Inc. cfg/retry-mode/legacy com.amazonaws.services.s3.model.ListObjectsV2Request; Request ID: null, Extended Request ID: null, Cloud Provider: GCP, Instance ID: unknown (Service: Amazon S3; Status Code: 400; Error Code: InvalidMaxKeys; Request ID: null; S3 Extended Request ID: null; Proxy: null)
Caused by: AmazonS3Exception: MaxKeys params must be positive integer <= 1000.; request: GET https://indexed-xyz.ed5d915e0259fcddb2ab1ce5592040c3.r2.cloudflarestorage.com {key=[ethereum/decoded/logs/v1.2.0/partition_key=ff/dt=2023/], key=[false], key=[5000], key=[2], key=[/]} Hadoop 3.3.4, aws-sdk-java/1.12.189 Linux/5.10.147+ OpenJDK_64-Bit_Server_VM/25.345-b01 java/1.8.0_345 scala/2.12.14 vendor/Azul_Systems,_Inc. cfg/retry-mode/legacy com.amazonaws.services.s3.model.ListObjectsV2Request; Request ID: null, Extended Request ID: null, Cloud Provider: GCP, Instance ID: unknown (Service: Amazon S3; Status Code: 400; Error Code: InvalidMaxKeys; Request ID: null; S3 Extended Request ID: null; Proxy: null)
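
One thing that might be worth trying, untested against this cluster: the request in the log carries a page size of 5000, which matches the default of Hadoop's S3A setting `fs.s3a.paging.maximum`, while R2 caps MaxKeys at 1000. The sketch below only builds the Spark option that would lower that page size. The helper name `s3a_listing_options` is hypothetical, and it is an assumption that the Databricks runtime honors this S3A property the way stock Hadoop 3.3.4 does.

```python
# Limit quoted in the R2 error message: "MaxKeys params must be positive integer <= 1000."
R2_MAX_KEYS = 1000


def s3a_listing_options(page_size: int = R2_MAX_KEYS) -> dict:
    """Hypothetical helper: build a Spark option that caps the S3A listing page size.

    fs.s3a.paging.maximum is the Hadoop S3A property controlling how many keys
    are requested per listing call; the "spark.hadoop." prefix forwards it to
    the Hadoop configuration.
    """
    if not 1 <= page_size <= R2_MAX_KEYS:
        raise ValueError(f"page_size must be between 1 and {R2_MAX_KEYS}, got {page_size}")
    return {"spark.hadoop.fs.s3a.paging.maximum": str(page_size)}


if __name__ == "__main__":
    # Print in "key value" form, one option per line.
    for key, value in s3a_listing_options().items():
        print(f"{key} {value}")
```

The printed line is in the `key value` form used by a cluster's Spark config box. Setting it at cluster level rather than in a running notebook is probably safer, since Hadoop filesystem instances are typically cached once created and may not pick up a later change.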