Jérôme Belleman

Downloading in Parallel with Curl

24 Aug 2007

If you land on a server that has no download tool supporting simultaneous connections, you can still get the same effect with several curl processes and byte ranges.

1 Downloading Different Ranges in Parallel

Many servers report a file's size and accept byte-range requests, which is all you need to download the file in pieces and to resume partial downloads.
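Before splitting anything, it can help to check what the server says about the file. Here is a minimal sketch, assuming the server answers HEAD requests with the usual Content-Length and Accept-Ranges headers. The helper names content_length and accepts_ranges are made up for this post, and they only parse headers fed on standard input, as printed by curl -sI:

```shell
# Hypothetical helpers: read HTTP response headers on stdin, as printed by
# `curl -sI URL`, and pick out what matters for ranged downloads.

# Print the Content-Length value (the file size in bytes), if present.
content_length() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "content-length" { print $2 }'
}

# Succeed if the server advertises byte-range support.
accepts_ranges() {
    tr -d '\r' | grep -qi '^accept-ranges: *bytes'
}

# Usage, with the placeholder URL from this post:
#   curl -sI http://path/to/file | content_length
#   curl -sI http://path/to/file | accepts_ranges && echo "ranges supported"
```

Not every server sends both headers, so treat a missing answer as "unknown" rather than "unsupported".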

Suppose you have a 10-byte file to download (and that it's mind-bogglingly big). To make it all go faster, the idea is to run several instances of curl simultaneously, each downloading a different part of the file:

curl -o part1 -r 0-3 http://path/to/file &
curl -o part2 -r 4-7 http://path/to/file &
curl -o part3 -r 8-9 http://path/to/file &
wait

Thus, you're downloading three parts of the file such that part1 holds bytes 0-3, part2 holds bytes 4-7 and part3 holds bytes 8-9.
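For a real file you won't want to work out the ranges by hand. The following is a sketch of the arithmetic only, assuming you already know the file size and pass a part count of at least 1. The function name ranges is made up for this post:

```shell
# Print one "start-end" byte range per line, splitting SIZE bytes into
# at most PARTS chunks of equal size (the last chunk may be shorter).
ranges() {
    size=$1 parts=$2
    chunk=$(( (size + parts - 1) / parts ))   # ceiling division
    start=0
    while [ "$start" -lt "$size" ]; do
        end=$(( start + chunk - 1 ))
        if [ "$end" -ge "$size" ]; then
            end=$(( size - 1 ))               # clamp the last range
        fi
        echo "$start-$end"
        start=$(( end + 1 ))
    done
}

ranges 10 3
# prints:
# 0-3
# 4-7
# 8-9
```

Each printed line can be handed to curl's -r option as is, which gives the same three ranges as the 10-byte example.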

2 Resuming Interrupted Downloads

Sometimes, though, a download gets interrupted. Suppose the part2 download stopped before its range was completely retrieved. To resume it, you need the exact number of bytes downloaded so far, which is simply the size of what you've already got on disk:

du -b part2

Suppose du tells us part2 weighs 2 bytes. Add this size to the starting offset of the range (i.e. 4): 2+4=6. That's it: you want to resume the job by downloading the remaining part2 range, 6-7. You also want to append the remainder of the download to the existing part2 file, so you'll use option -C - (you could've said -C 6, but don't bother, since the -r option already says so):

curl -o part2 -C - -r 6-7 http://path/to/file
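The offset arithmetic is easy to get wrong by one, so here is a minimal sketch of it. The function name remaining_range is made up for this post, and wc -c stands in for du -b, which is a GNU extension. How curl reconciles -C - with an explicit -r may differ between versions, so the usage line appends through the shell's >> instead. That's my assumption for the sketch, not something the steps above depend on:

```shell
# Print the range still missing from a partial download.
# Arguments: the part file, then the first and last byte of its original range.
# Returns non-zero and prints nothing if the part is already complete.
remaining_range() {
    file=$1 first=$2 last=$3
    have=$(( $(wc -c < "$file") ))    # bytes already on disk
    next=$(( first + have ))          # first byte still missing
    [ "$next" -le "$last" ] || return 1
    echo "$next-$last"
}

# Usage, assuming part2 was started with -r 4-7 and the placeholder URL:
#   r=$(remaining_range part2 4 7) && curl -s -r "$r" http://path/to/file >> part2
```

With a 2-byte part2 and the original range 4-7, the function prints 6-7, matching the calculation above.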

3 Stitching the File Parts Together

Finally, stitch all the parts together, in order, which is simply carried out with:

cat part1 part2 part3 > file
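A cheap sanity check after stitching is to compare the result's size with the size the server reported. Here's a sketch using stand-in parts with made-up content instead of a real download. The helper name has_size is an assumption of mine:

```shell
# Succeed if the file given as $1 is exactly $2 bytes long.
has_size() {
    [ "$(( $(wc -c < "$1") ))" -eq "$2" ]
}

# Stand-in parts for the 10-byte example (made-up content, not a real download).
dir=$(mktemp -d)
printf 'abcd' > "$dir/part1"
printf 'efgh' > "$dir/part2"
printf 'ij'   > "$dir/part3"

cat "$dir/part1" "$dir/part2" "$dir/part3" > "$dir/file"
has_size "$dir/file" 10 && echo "size matches"
# prints: size matches
```

A matching size doesn't prove the content is right, but a mismatch tells you straight away that a part is short or missing.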
