Cloudflare returning gzipped file uncompressed

When I request .tar.gz files from our web site, Cloudflare returns them uncompressed instead of returning the gzipped data. For example:

curl -s -o - https://www.w3.org/XML/Test/xmlts20130923.tar.gz | wc -c
5928960

Compared to the size returned by our origin server:

gerald@lahey:~$ curl -s -o - https://www-origin.w3.org/XML/Test/xmlts20130923.tar.gz | wc -c
641522

Is there an issue with our Cloudflare configuration, or our origin server? The response returned by our origin server looks fine to me.

If this is a known issue when .gz files are served by Cloudflare, is there some config change we can make to cause it to serve these files correctly?

It’s not really an ‘issue’ per se. Cloudflare decompresses content on the fly if the client (you) doesn’t indicate that it supports gzip. This saves bandwidth to the origin, since gzip is always used on that leg even if the client is served the content uncompressed.

This scenario is really only an issue with curl or other handmade requests - browsers would always indicate they support gzip with the accept-encoding header.

➜  ~ curl -H 'accept-encoding: gzip' -s -o - https://www.w3.org/XML/Test/xmlts20130923.tar.gz | wc -c
  641522

IIRC Cloudflare transparently decompresses responses that have a Content-Encoding header if the client does not indicate support in the Accept-Encoding request header. If you want Cloudflare to return the GZIP-compressed file to all clients, you will have to remove the Content-Encoding header from the origin response.


Thanks. It is an issue when using tools like wget:

$ wget -q https://www.w3.org/XML/Test/xmlts20130923.tar.gz

$ wc -c < xmlts20130923.tar.gz
5928960

$ gunzip xmlts20130923.tar.gz
gzip: xmlts20130923.tar.gz: not in gzip format

$ file xmlts20130923.tar.gz
xmlts20130923.tar.gz: POSIX tar archive (GNU)
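
If `file` is not available, one way to check what actually arrived, regardless of the file name, is to look for the gzip magic bytes (1f 8b) at the start of the download. This is only a minimal sketch: the function name `is_gzip` is made up for illustration, and it assumes `head -c` and `od` are available.

```shell
# Hypothetical helper (name is illustrative): succeeds if the file
# starts with the two gzip magic bytes 1f 8b, fails otherwise.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Assumed usage with the file downloaded above:
#   if is_gzip xmlts20130923.tar.gz; then tar -xzf xmlts20130923.tar.gz
#   else tar -xf xmlts20130923.tar.gz; fi
```

For the download above this check would fail, which matches the `file` output: the data is already a plain tar archive.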

https://www.cyberciti.biz/faq/unix-linux-wget-download-compressed-gzip-headers/

There are some flags on the page above (similar to the ones for curl) that make wget request gzip.

Note that you will have to decompress the file yourself; wget won’t do it for you.

wget does the right thing by default when requested from our origin server, and I’d like to preserve that behavior when routed through Cloudflare.

Did you try this?

You can also try making the origin send a no-transform directive in its Cache-Control header for those tar.gz files (not for the other files that you want transparently compressed for browsers). This should stop Cloudflare from applying transparent compression or decompression to them.
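
To see which of these headers a server is actually sending, it may help to filter a header dump down to the two that matter here. A rough sketch, not a definitive check: the helper name `encoding_headers` is hypothetical, and the usage line assumes the origin host name mentioned earlier in this thread.

```shell
# Hypothetical helper (name is illustrative): reads HTTP response headers
# on stdin and prints only the Content-Encoding and Cache-Control lines.
# The match is case-insensitive, and carriage returns are stripped first.
encoding_headers() {
    tr -d '\r' | grep -i -E '^(content-encoding|cache-control):'
}

# Assumed usage (curl -I sends a HEAD request and prints the headers):
#   curl -sI https://www-origin.w3.org/XML/Test/xmlts20130923.tar.gz | encoding_headers
```

Running the same command against the Cloudflare-fronted host, with and without `-H 'Accept-Encoding: gzip'`, should show whether the response is being transformed along the way.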


I will look into that, thanks! Maybe a bug in our Apache config, or Apache’s behavior.

After some internal discussion I updated our origin server to stop returning Content-Encoding for .gz files, and that seems to have resolved the issue. Thanks very much to everyone for your quick and helpful responses!

