Force rsync to use delta transfer to fix corrupt remote file

We host our Korora Project ISO images on SourceForge and I (naturally) use rsync to move them there (slowly, at 100kb/sec). Sometimes though the connection drops off and that’s OK because rsync picks up where it left off.

Occasionally, however, the ISO ends up with the wrong checksum, so something went wrong in the transfer. No amount of re-rsyncing seems to fix this, because by default rsync uses only file size and timestamps to decide whether it should skip an existing file.

Fortunately, I don’t need to re-send the whole file again as rsync can perform a delta transfer instead and only send the small difference. Yay!

The way I do this is by passing a combination of options to rsync: --checksum (to decide whether to transfer the file by comparing checksums rather than size and timestamps), --inplace (to update the destination file in place, as rsync normally writes a temporary file and then moves it into place) and --no-whole-file (which tells rsync not to copy the whole file, but to use deltas instead).

This becomes something like:
rsync -Pa --checksum --inplace --no-whole-file local.file remote.server:

Here’s a real example:
chris@x220 ~ $ rsync -Pa --checksum --inplace --no-whole-file -e ssh korora-20-i386-cinnamon-live.iso csmart,kororaproject@frs.sourceforge.net:/home/frs/project/k/ko/kororaproject/20/
 
sending incremental file list
korora-20-i386-cinnamon-live.iso
  1,715,470,336 100% 220.87MB/s 0:00:07 (xfr#1, to-chk=0/1)

So rsync just saved me 4 hours of uploading the ISO again. Thanks rsync.

7 thoughts on “Force rsync to use delta transfer to fix corrupt remote file”

  1. Interesting idea, although I’m not sure how useful it would be in reality (I don’t know the intricacies of rsync’s delta algorithm though). If rsync is conducting a transfer and misses packets from the stream, how does it know? Will it just write data out of order, or does each segment carry an offset in the file? If rsync realises a chunk is missing, can it go back and re-request it in the same session, or does it have to carry on with the stream, leaving you to re-run the transfer to pick up the few you missed? 🙂

  2. How much time does it take? For example, if I have a 4 GB database and I add 3 MB to it, how long does it take to add those 3 MB to the SQL file?

  3. Sorry for the delay in replying. It will take a while because it has to checksum the chunks at each end, but it is faster than re-sending from scratch.

  4. It looks to me that --checksum is a way to select whether or not a file gets synced, not a way to tell rsync to use chunks during a sync. rsync uses chunks in that case because of --no-whole-file.

    That’s my understanding.

  5. Hi dominix, yeah, you’re right in that it’s the --no-whole-file option that tells rsync to chunk up the file and the --checksum is an option to determine whether rsync should copy a file or not (instead of just relying on size and time stamps). I didn’t mean to imply that --checksum was doing the chunking, I’ll update the post. Thanks!

  6. Thank you. I fixed a partially burned file on a BD-ROM UDF disc without significant loss of its capacity. (It has the same size, but with zeros in its tail.)
    All glory to M$ for the automated reboot to finish updates, without any cancel or even delay options. Windows Updates über Alles.
