Downloading in Parallel with Curl
If you land on a server that has no download tools supporting simultaneous connections, you can still achieve the same with multiple curls and byte ranges.
1 Downloading Different Ranges in Parallel
Many servers report the size of a file and whether they accept byte-range requests, which is all you need both to download the file in ranges and to resume partial downloads.
Suppose you have a 10-byte file to download (and pretend it's mind-bogglingly big). To make it all go faster, the idea is to run several instances of curl simultaneously (each in its own terminal, or backgrounded with &), each one downloading a different part of the file:
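As a rough sketch of how to check this, you can ask the server for the headers only (curl -I) and look for Accept-Ranges and Content-Length. The helper name range_info below is hypothetical, and not every server sends both headers, so treat empty values as "unknown" rather than "unsupported".

```shell
# range_info (hypothetical helper): read HTTP response headers on stdin and
# print the Accept-Ranges and Content-Length values, if present.
range_info() {
  tr -d '\r' | awk -F': *' '
    tolower($1) == "accept-ranges"  { ranges = $2 }
    tolower($1) == "content-length" { size = $2 }
    END { printf "ranges=%s size=%s\n", ranges, size }'
}

# Usage (needs network access):
#   curl -sI http://path/to/file | range_info
```

A value of "bytes" for ranges suggests the server accepts byte-range requests; size is the total length you will split into parts.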
curl -o part1 http://path/to/file -r 0-3
curl -o part2 http://path/to/file -r 4-7
curl -o part3 http://path/to/file -r 8-9
Thus, you're in the process of downloading three parts of file such that:
part1 will be of range 0 to 3 (4 bytes),
part2 will be of range 4 to 7 (4 bytes),
part3 will be of range 8 to 9 (2 bytes).
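For a real file you would not work the ranges out by hand. Here is a minimal sketch that splits a given size into roughly equal ranges and shows, in the usage comment, how the downloads could be started in the background. The function name byte_ranges is hypothetical, and the size is assumed to be known already (for instance from the Content-Length header).

```shell
# byte_ranges SIZE PARTS (hypothetical helper): print one "start-end" range
# per line, covering bytes 0 to SIZE-1. It may print fewer than PARTS ranges
# when SIZE does not divide evenly.
byte_ranges() {
  size=$1
  parts=$2
  chunk=$(( (size + parts - 1) / parts ))   # ceiling division
  start=0
  while [ "$start" -lt "$size" ]; do
    end=$(( start + chunk - 1 ))
    if [ "$end" -ge "$size" ]; then
      end=$(( size - 1 ))                   # clamp the last range
    fi
    echo "$start-$end"
    start=$(( end + 1 ))
  done
}

# Usage (needs network access): start one curl per range in the background,
# then wait for all of them to finish.
#   n=1
#   for r in $(byte_ranges 10 3); do
#     curl -s -o "part$n" -r "$r" http://path/to/file &
#     n=$(( n + 1 ))
#   done
#   wait
```

For the 10-byte example split into 3 parts, this yields the same ranges as above: 0-3, 4-7 and 8-9.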
2 Resuming Interrupted Downloads
Sometimes, though, a download gets interrupted; suppose the part2 download stopped before its range was completely retrieved. To resume the job, you need the exact number of bytes downloaded so far, which you get by checking how much is already on disk:
du -b part2
Suppose du tells us part2 weighs 2 bytes. Add this size to the starting offset of the part2 range (i.e. 4): 2+4=6. That's it: you want to resume the job by downloading the range 6-7. You also want to append the remainder of the download to the existing part2 file, so you'll use the option -C - (you could have said -C 6, but don't bother, since you already gave the offset to the -r option):
curl -o part2 -C - -r 6-7 http://path/to/file
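If you'd rather not rely on how your curl version combines -C - with -r, a more explicit sketch is to compute the remaining range yourself and append with the shell's >> redirection. The function name resume_range is hypothetical; it uses wc -c instead of du -b because du -b is a GNU extension and may be missing on other systems.

```shell
# resume_range FILE START END (hypothetical helper): print the range still
# missing from FILE, where START-END is the range originally requested.
# Prints nothing and returns 1 when the part is already complete.
resume_range() {
  have=$(( $(wc -c < "$1") ))    # bytes already on disk
  next=$(( $2 + have ))          # first byte still missing
  if [ "$next" -gt "$3" ]; then
    return 1
  fi
  echo "$next-$3"
}

# Usage (needs network access): fetch only the missing bytes and append them.
#   r=$(resume_range part2 4 7) && curl -s -r "$r" http://path/to/file >> part2
```

With a 2-byte part2 and an original range of 4-7, the helper prints 6-7, matching the arithmetic above.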
3 Stitching the File Parts Together
Finally, stitch all the parts together in order, which is simply done with:
cat part1 part2 part3 > file
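Since a part that was cut short would go unnoticed at this stage, it can be worth checking the size of the result. The following is a small sketch, with the hypothetical function name join_parts, that concatenates the parts in the order given and compares the total against the expected size (assumed to be known, for instance from Content-Length).

```shell
# join_parts OUT EXPECTED_SIZE PART... (hypothetical helper): concatenate the
# parts in the given order into OUT, then check that OUT has the expected size.
join_parts() {
  out=$1
  expected=$2
  shift 2
  cat "$@" > "$out" || return 1
  actual=$(( $(wc -c < "$out") ))
  if [ "$actual" -ne "$expected" ]; then
    echo "size mismatch: got $actual bytes, expected $expected" >&2
    return 1
  fi
}

# Usage:
#   join_parts file 10 part1 part2 part3
```

Listing the parts explicitly, rather than using a glob such as part*, keeps them in the right order once there are ten or more of them.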