It seems silly to write a blog post about this, but I keep forgetting the answer to "what if I really want to just transfer EVERYTHING with rsync?". Since the rsync(1) manpage is 28,000 words, I basically never go there to find the answer. Instead, I grep around this wiki for earlier instances, which are never quite as good as what I've come up with, with the help of my (new) colleague weasel.

The common answer is "just use -av":

rsync -av A/ B/

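... but that has a few limitations: -a leaves out hard links (-H), ACLs (-A) and extended attributes (-X), it does not recreate sparse files (-S), and it maps user and group IDs by name instead of copying them numerically.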

The answer, of course, is instead the very intuitive:

rsync -PaSHAX --numeric-ids --info=progress2 A/ B/

If you don't trust the filesystem's modification times and file sizes, also throw in -c to compare files by (MD5!?) checksum instead, but that's much slower. (A better hashing algorithm could be SHA-2 or Meow, obviously.)
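Since I keep forgetting the incantation, here is a minimal sketch of a shell wrapper around it. The function name rsync_everything and the RSYNC override variable are my own inventions for illustration, not anything rsync provides. It only adds a trailing slash to both paths when one is missing and passes extra flags (such as -c) through:

```shell
# Hypothetical helper, not part of rsync: copy SRC to DEST with the
# "transfer everything" flags from this post.
# Usage: rsync_everything SRC DEST [extra rsync flags...]
rsync_everything() {
    # Strip one trailing slash if present, then add one back, so both
    # paths always end in a slash.
    src="${1%/}/"
    dest="${2%/}/"
    shift 2
    # RSYNC is an assumed override hook (handy for testing); it
    # defaults to the real rsync binary.
    "${RSYNC:-rsync}" -PaSHAX --numeric-ids --info=progress2 "$@" "$src" "$dest"
}
```

Calling rsync_everything A B -c would then run the command above with checksums enabled. The variables are not declared local, to stay within plain POSIX sh.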

Those flags mean:

    -P                          same as --partial --progress
    -a, --archive               archive mode; equals -rlptgoD (no -H,-A,-X)
    -S, --sparse                turn sequences of nulls into sparse blocks
    -H, --hard-links            preserve hard links
    -A, --acls                  preserve ACLs (implies -p)
    -X, --xattrs                preserve extended attributes
        --numeric-ids           don't map uid/gid values by user/group name
    -c, --checksum              skip based on checksum, not mod-time & size

Keep in mind that -H is expensive, which is why it's not included in -a by default, as the manpage explains.
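If the cost of -H worries you, one option is to check whether the source tree contains any multiply-linked files at all before paying for it. This is only a sketch: has_hardlinks is a hypothetical helper name, and the check itself walks the whole tree, so whether it is worth it depends on your data.

```shell
# Hypothetical helper: succeed if DIR contains at least one regular
# file with a link count above 1.
# Caveat: a file whose other name lives outside DIR also counts, since
# find only sees the link count, not where the other links are.
has_hardlinks() {
    [ -n "$(find "$1" -type f -links +1 | head -n 1)" ]
}
```

For example, `has_hardlinks A/ && echo "keep -H"` prints the reminder only when some file in A/ has more than one link.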

Unrolling some of those, this actually means:

    -r, --recursive             recurse into directories
    -l, --links                 copy symlinks as symlinks
    -p, --perms                 preserve permissions
    -t, --times                 preserve modification times
    -g, --group                 preserve group
    -o, --owner                 preserve owner (super-user only)
    -D                          same as --devices --specials
        --partial               keep partially transferred files
        --progress              show progress during transfer

And yes, we need to unroll this again:

        --devices               preserve device files (super-user only)
        --specials              preserve special files

The --numeric-ids parameter is really relevant only when you archive files across servers that might not share the same UID space. This is especially important when restoring from backups because you might be creating /etc/passwd along the way (!).

The last bit, --info=progress2, is not directly documented in the manpage, at least not in the --info section. Strangely, there's some information under the -P flag, where it says:

outputs statistics based on the whole transfer, rather than
individual files.

I found this was extremely useful during large transfers because, by default, -P (or, more specifically, --progress) shows progress for each individual file. That's fine if you transfer large files, but for large transfers (with a large number of files), that's much less useful and possibly incredibly noisy. --info=progress2, according to --info=help, does instead:

PROGRESS   Mention 1) per-file progress or 2) total transfer progress

... which I admit is not much clearer.

Note that this is similar to how at least one backup system runs its test suite, against, interestingly, rsync. Indeed, bup uses rsync to check that the files it restores are identical to the original. They use the also super-intuitive -niaHAX (maybe with -c), which I find slightly less intuitive than my ordering, which sounds like "pacha" in French.
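The same trick works for checking your own copies. With -n (dry run) and -i (itemize changes), rsync lists only the items it would change, so empty output means the two trees match as far as rsync can tell. A minimal sketch, assuming that behaviour; copies_match and the RSYNC override variable are hypothetical names of mine, not bup's or rsync's:

```shell
# Hypothetical helper: succeed if a dry run finds nothing to transfer.
# Returns 0 when the trees match, 1 when rsync lists differences,
# and 2 when rsync itself fails.
# Usage: copies_match SRC DEST [extra rsync flags...]
copies_match() {
    src="${1%/}/"
    dest="${2%/}/"
    shift 2
    # -n: dry run, -i: itemize changes; unchanged items print nothing.
    changes=$("${RSYNC:-rsync}" -niaHAX "$@" "$src" "$dest") || return 2
    [ -z "$changes" ]
}
```

As written, files that exist only in the destination go unnoticed; passing --delete as an extra flag makes the dry run list them as deletions. Add -c to compare by checksum instead of size and modification time.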

So there you go. -PaSHAX is now your new best friend. And don't forget the obvious --numeric-ids (and not uids, they talk about groups too) and --info=progress2 (grrr) and maybe --checksum if you're nostalgic about the good old MD5 days.

Notice the trailing slashes at the end of A/ and B/. Those, stupidly, matter to rsync. This is one of the most confusing things about rsync and I have gotten around that problem by always specifying a trailing slash to both arguments, which gives a consistent experience all the time. But, if you want to know all the nasty details, try to figure out this bit:

A trailing slash on the source changes this behavior to avoid creating an additional directory level at the destination. You can think of a trailing / on a source as meaning "copy the contents of this directory" as opposed to "copy the directory by name", but in both cases the attributes of the containing directory are transferred to the containing directory on the destination. In other words, each of the following commands copies the files in the same way, including their setting of the attributes of /dest/foo:

rsync -av /src/foo /dest
rsync -av /src/foo/ /dest/foo

They omitted, obviously, that this is also identical:

rsync -av /src/foo/ /dest/foo/

At this point, I would understand if you want to throw the "fine manual" out the window and yell like crazy.
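For what it's worth, the rule from the manpage excerpt fits in a few lines of shell. This is a toy model of the behaviour described there, not something rsync ships. The name rsync_landing_dir is made up, and it assumes plain local paths (no host: prefixes, no bare / as destination):

```shell
# Hypothetical helper: print the directory that will end up holding
# the *contents* of SRC after "rsync -a SRC DEST".
rsync_landing_dir() {
    src="$1"
    dest="${2%/}"   # a trailing slash on the destination changes nothing
    case "$src" in
        # Trailing slash on the source: copy the contents of the directory.
        */) printf '%s\n' "$dest" ;;
        # No trailing slash: copy the directory by name.
        *)  printf '%s/%s\n' "$dest" "$(basename "$src")" ;;
    esac
}
```

All three commands quoted above come out the same under this model: /src/foo with /dest, /src/foo/ with /dest/foo, and /src/foo/ with /dest/foo/ each give /dest/foo.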

update: added -S
On pabs's recommendation, I also added -S, changing the acronym from "fax" (-PHaAX) to "pacha(x)" (-PaSHAX) which still sounds good and is a better mapping to the transliteration...
Comment by anarcat