Incomplete downloads

Some of my users occasionally receive incomplete files when downloading from one of my sites. It took me some time to reproduce, but a slower machine and/or lower bandwidth seems to make it happen more often. I can reproduce it from a small Amazon EC2 instance.

On my own workstation, which has a gigabit connection, it seems to happen roughly once per thousand downloads. That number is a guess; I did not count carefully. From a small Amazon EC2 instance it seems to happen about once every 10 to 20 downloads.

With HTTP/1.1, curl reports this error:
curl: (18) transfer closed with 37106 bytes remaining to read
A similar error occurs with HTTP/2, but with a different error code.

This is an example download command, with the URL I have used for my tests:

curl -v --http1.1 --remote-name --remote-header-name --location --max-redirs -1 'https://osm-boundaries.com/Download/Fetch/f2403ddacb75b2fd10b42f6a0bdae299?apiKey=34199dedd36bb3db24679c16584622a5'

Don’t worry about the API key; it’s mine and it can’t do any harm. I’ll just create a new one later. If the key doesn’t work (no gzipped GeoJSON is returned), it has expired; tell me and I’ll post a new one.

I have used the following quick shell script to reproduce the issue:

rm -f f2403ddacb75b2fd10b42f6a0bdae299.geojson.gz
while true ; do
  echo '------------------------------------'
  curl -v --http1.1 --remote-name --remote-header-name --location --max-redirs -1 'https://osm-boundaries.com/Download/Fetch/f2403ddacb75b2fd10b42f6a0bdae299?apiKey=34199dedd36bb3db24679c16584622a5'
  EC=$?
  echo "$EC"
  # Stop at the first failed download so the partial file can be inspected.
  if [ "$EC" -ne 0 ] ; then
    break
  fi
  rm f2403ddacb75b2fd10b42f6a0bdae299.geojson.gz
  echo
done
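
To get an actual failure rate instead of my guessed numbers, the loop could be turned into a counter. This is only a minimal sketch: the function name count_failures is made up for illustration, and it treats any non-zero exit status as a failed download, not just curl's error 18.

```shell
# Run a command a fixed number of times and report how many runs failed.
# Usage: count_failures ATTEMPTS COMMAND [ARGS...]
# (count_failures is a hypothetical helper, not part of curl.)
count_failures() {
    attempts=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$attempts failed"
}
```

It could be called as count_failures 100 curl --silent --http1.1 --location --output /dev/null followed by the quoted download URL, which would print a line such as "N/100 failed".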

The setup looks like this:
Client -> Cloudflare -> Apache reverse proxy -> Nginx with PHP.

The file is sent using PHP, like this:

  header('Content-Description: File Transfer');
  header('Content-Type: application/octet-stream');
  header('Content-Disposition: attachment; filename=' . $outputFilename);
  header('Content-Transfer-Encoding: binary');
  header('Expires: 0');
  // A repeated header() call replaces the earlier value by default,
  // so the Cache-Control directives go into a single header.
  header('Cache-Control: private, must-revalidate, no-transform');
  header('Pragma: no-cache');
  header('Content-Length: ' . filesize($filename));

  readfile($filename);
  exit(0);

Some additional facts.

  • It also happens with Cloudflare in Development Mode.
  • Almost every time, less than 64 kB is missing (I saw slightly more on one single occasion). The file is supposed to be 2228 kB.
  • It does not happen if I send the Host header osm-boundaries.com and replace the domain in the URL with my direct (hidden) IP.
  • Every single time it happens, some of the headers shown by curl are in lowercase. Since they are not always in lowercase, I assume Cloudflare is running different versions of its software on different servers.
  • I cannot see any errors on my end, which is expected since direct fetches from my reverse proxy work fine.
  • It has been reproduced with both curl and wget.
  • Adding a sleep 5 to the shell script above does not help.
  • The file is already rendered on my servers. It exists on disk and is simply sent every time. The server is not busy at all.
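
To check how much is missing from a truncated download, the size on disk can be compared with the Content-Length value that curl -v prints. A rough sketch, assuming the expected size is copied by hand from that header; missing_bytes is a made-up helper name:

```shell
# Print how many bytes a downloaded file is short of the expected size.
# Usage: missing_bytes EXPECTED_SIZE FILE
# (missing_bytes is a hypothetical helper; EXPECTED_SIZE is assumed to be
# the Content-Length value reported by the server.)
missing_bytes() {
    expected=$1
    # wc -c may pad its output with spaces; arithmetic expansion strips them.
    actual=$(($(wc -c < "$2")))
    echo $((expected - actual))
}
```

A result of 0 means the file is complete. For a truncated file the result should match the "bytes remaining to read" figure in curl's error message.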

Any suggestions? I suspect an issue with Cloudflare; on the other hand, I would be surprised if such an issue existed without everyone else having noticed it already.
