1. Backup procedures
    1. Policies
    2. Backup storage
      1. Marcos storage
      2. External
      3. Offsite
      4. Offsite (squirrel mode)
      5. Marcos backup inventory details
    3. Drive replacement
    4. Disaster recovery
      1. Tier 1
    5. VPS providers
      1. Large storage options
    6. Offsite procedures
      1. Remaining work on borg
      2. Remaining work on git-annex
      3. Random git-annex docs
      4. Append-only git repositories
      5. Encrypted remotes
      6. Encrypted repos restore procedure
      7. References


Main server backups are automatic, nightly. Offsite backups are by hand, monthly.

Workstation and laptop backups are more irregular, on a separate drive.

Most backups are performed with borg, but some offsite backups are still done with bup for historical reasons; those may be migrated to another storage system, see below for progress.
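The nightly job is essentially a single borg create call; a minimal sketch of what it does (the repository path and exclude list here are assumptions, not the actual production script):

```shell
#!/bin/sh
# nightly backup sketch: the repository path and the exclude list
# are assumptions, not the actual production script
borg create --stats --one-file-system \
    --exclude /var/log \
    /media/backup/borg::marcos-{now:%Y-%m-%d} \
    / /usr /var /home
```

The `{now:%Y-%m-%d}` placeholder matches the `marcos-2017-06-19` archive naming used in the restore examples below.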

Backup storage

I have about 30TB of storage deployed in various places, quite inefficiently managing a little over 5TB of original data. The main reason for that inefficiency is that many drives have outlived their usefulness: they are too small, and no "enterprise" storage mechanisms (like RAID) were deployed to aggregate multiple drives.

Such a bad usage pattern could (eventually?) be fixed by regrouping all those drives into a single cohesive unit, a NAS for example. See marcos for a discussion of alternatives.

Marcos storage



Offsite (squirrel mode)

Those are archives that were disseminated in different locations.

Marcos backup inventory details

This is out of date.

| path | backup | location | notes |
| ---- | ------ | -------- | ----- |
| / | borg | calyx | |
| /var | borg | calyx | |
| /usr | borg | calyx | |
| /home | borg | calyx | |
| /srv | no | | see below |
| /srv/archive/ | bup-srv | calyx | one time only |
| /srv/audiobooks/ | git-annex | green | |
| /srv/auto/ | no | | transient data |
| /srv/backup/ | bup-srv | calyx | one time only |
| /srv/books/ | git-annex | green | |
| /srv/books-incoming/ | no | | transient data |
| /srv/conference/ | no | | local copy of public data |
| /srv/espresso/ | git-annex | markov | |
| /srv/incoming/ | bup-srv | calyx | one time only |
| /srv/karaoke/ | bup-srv | calyx | one time only |
| /srv/mp3/ | git-annex | VHS | also markov, angela, archive0 |
| /srv/playlists/ | bup-srv | calyx | one time only |
| /srv/podcast/ | no | | todo? |
| /srv/roms/ | git-annex | green | |
| /srv/sid/ | bup-srv | calyx | one time only |
| /srv/SteamLibrary/ | bup-srv | calyx | one time only |
| /srv/tahoe/ | no | | redundant data, by definition; unusable without key |
| /srv/tempete/ | bup-srv | calyx | one time only |
| /srv/tftp/ | git-annex | | not sync'd to green, but files are publicly available, and git repo copied over at koumbit |
| /srv/video/ | git-annex | green | |

Drive replacement

This procedure describes a major disk replacement on a system with LUKS encryption and LVM, but without RAID-1 (which would obviously be much easier). It is specific to my setup but could be useful to others, and is aimed at technical users familiar with the command line.

  1. create partitions with parted, mark an 8MB leading partition with the bios_grub flag:

     parted /dev/sdc mklabel gpt
     parted -a optimal /dev/sdc mkpart primary 0% 8MB
     parted -a optimal /dev/sdc mkpart primary 8MB 100%

    Marcos partitions are currently:

     $ sudo lvdisplay -C
     LV   VG        Attr       LSize
     home marcossd1 -wi-ao---- 380,00g
     root marcossd1 -wi-ao----  10,00g
     swap marcossd1 -wi-ao----   4,00g
     usr  marcossd1 -wi-ao----  20,00g
     var  marcossd1 -wi-ao----  30,00g

  2. initialise crypt partition:

    cryptsetup -v --verify-passphrase luksFormat /dev/sdX3
    cryptsetup luksOpen /dev/sdX3 crucial_crypt

    Note that newer versions of Debian (stretch and later) ship with good defaults, so you do not need to choose cipher settings yourself. But on older machines, you may want something like:

    --cipher aes-xts-plain64 --key-size 512 --hash sha256 --iter-time 5000

    I was also recommending --use-random here but I believe it is not necessary anymore.

  3. initialize logical volumes

    pvcreate /dev/mapper/crucial_crypt
    vgcreate marcossd1 /dev/mapper/crucial_crypt

    repeat for every filesystem, use vgdisplay -C and lvdisplay -C to inspect existing sizes:

    lvcreate -L10G -n root marcossd1
    mkfs /dev/mapper/marcossd1-root
    # [...]

  4. basic filesystem setup:

    mount /dev/mapper/marcossd1-root /mnt
    mkdir /mnt/{dev,sys,proc,boot,usr,var,home,srv}

  5. restore the root filesystem:

    cd /mnt
    borg extract -e boot -e usr -e var -e home --progress /media/sdc2/borg::marcos-2017-06-19

    note that --progress is available only in newer versions of borg (1.1 and later).

    if borg is not available for some reason, the filesystem can also be synchronized directly:

    rsync -vaHAx --inplace --delete --one-file-system / /mnt/

    note that this will destroy the mountpoint directories like /mnt/usr, which need to be recreated.

  6. edit /mnt/etc/fstab (and keep a copy in /etc/fstab.new) to change the VG paths and the /boot UUID (which can be found with blkid /dev/sdX2)

  7. mount all filesystems:

    mount -o bind /dev /mnt/dev
    chroot /mnt
    mount -a
    mount -t sysfs sys /sys

  8. change /mnt/etc/crypttab (make a copy in /etc/crypttab.new) to follow the new partition names:

    • make sure you have NO TYPO in the new line
    • use blkid to get the UUID of the crypto device, e.g. blkid /dev/sdX3
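     For example (the UUID below is hypothetical, substitute the real output of blkid on the actual device):

```shell
# hypothetical UUID: substitute the real output of blkid
blkid /dev/sdX3
# /dev/sdX3: UUID="d2a37d4a-..." TYPE="crypto_LUKS"
# matching line in /mnt/etc/crypttab:
# crucial_crypt UUID=d2a37d4a-... none luks
```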

  9. restore everything from backups:

    cd /mnt
    borg extract --progress /media/sdc2/borg::marcos-auto-2017-06-19
    borg extract --progress /media/sdc2/borg::marcos-logs-2017-11-28

    or rsync from the live filesystem (see below).

  10. go to single user mode:

    shutdown now

  11. sync from the live filesystem again, using /home/anarcat/bin/backup-rsync-mnt, which is basically a series of rsync calls, one per partition:

    rsync -vaHAx --inplace --delete /usr/ /mnt/usr/

  12. install boot blocks

    chroot /mnt
    mv /etc/fstab.new /etc/fstab
    mv /etc/crypttab.new /etc/crypttab
    echo "search.fs_uuid c7bf0134-d9bf-4506-b859-3d19e9a333c1 root" >> /boot/grub/load.cfg
    update-initramfs -u -k all
    grub-install /dev/sdX

    the fs_uuid value comes from the /boot device, and can be found with the blkid command as well.

  13. reboot and pray

See also 2019-02-25-new-large-disk-8-year-old-anniversary for another hard drive configuration procedure.

Disaster recovery

backup plan if all else fails

  1. GTFO with the backup drives, and at least password manager (laptop/workstation rip out)

  2. confirm Gandi, park domains on a "Gandi Site" (free, one page)

  3. setup one VPS to restore DNS service, secondary at Gandi

  4. setup second VPS to restore tier-1 services

  5. restore other services as necessary

Tier 1

DNS: set up the 3 primary zones and glue records.

Email: install dovecot + postfix, set up aliases and delivery. Restore mailboxes.

Web: install apache2 + restore the wiki.
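On a fresh Debian VPS, that tier-1 bootstrap amounts to something like the following (package names are Debian's; the restore paths and archive name follow the examples above and are otherwise assumptions):

```shell
# sketch of a tier-1 restore on a fresh Debian VPS; the exact
# configuration comes from the backups themselves
apt install bind9 postfix dovecot-imapd apache2
# restore configs and mailboxes from the backup drive, e.g.:
borg extract /media/backup/borg::marcos-2017-06-19 \
    etc/bind etc/postfix etc/dovecot var/mail
```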

VPS providers

Large storage options

This was done as part of research for archival in virtual machines.




Offsite procedures

A new offsite backup system was created. Previously, it was a manual process: bring the drives back to the server, pop them in a SATA enclosure, start the backup script by hand, wait, return the drives to the offsite location. This "pull" configuration had the advantage of being resilient against an attacker wanting to destroy all data, but the manual process meant backups were never done as often as they should have been.

A new design based on borg and git-annex assumes a remote server online that receives the backups (a "push" configuration). The goal is to set up the backup in "append-only" mode so that an attacker is limited in their capacity to destroy data on the server.
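On the borg side, that push setup boils down to initializing a repository on the remote and pointing the backup script at it; a sketch with placeholder host and paths:

```shell
# placeholder host and path; the repository is created once, then
# the backup script pushes nightly archives to it
borg init --encryption=repokey ssh://user@offsite.example.net/srv/borg/marcos
borg create --stats \
    ssh://user@offsite.example.net/srv/borg/marcos::marcos-{now:%Y-%m-%d} \
    / /home
```

The append-only restriction itself is enforced server-side, in authorized_keys, as shown further down.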

A first sync was done locally to bootstrap the dataset. This was harder than expected because the external enclosure had an older SATA controller that didn't support the 8TB drive (it was detected as 2TB), so I had to connect it to my workstation instead (an Intel NUC, which made for a tangled mess).

All this needs to be documented better and merged with the above documentation.

Remaining work on borg

  1. decide what to do with /var/log (currently excluded because we want lower retention on those)

  2. prune policies, skipped for now because incompatible with append-only

  3. automate crypto:

    a. change passphrase
    b. include it in the script here
    c. include a GnuPG symmetric encrypted copy of the passphrase on the offsite disk

    Note: this approach should work, but needs a full shell when the key is changed, so it is fundamentally incompatible with the restricted shell setup

  4. set append-only mode and restricted shell by allowing only the right borg command to be called, in authorized_keys:

    command="borg serve --append-only",restrict ssh-rsa AAAAB...
  5. test full run again

  6. document this in the borg documentation itself or at least here
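The GnuPG symmetric copy mentioned in step 3 could be created with plain gpg (file names here are assumptions):

```shell
# keep a symmetrically-encrypted copy of the borg passphrase on
# the offsite disk; file names are assumptions
gpg --symmetric --cipher-algo AES256 \
    --output /media/offsite/borg-passphrase.gpg borg-passphrase.txt
```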

Remaining work on git-annex

  1. switch git-annex remotes and borg repo to remote server when drive is installed (done)

  2. enable sync in script (done)

  3. resync everything again (done)

  4. add Photos repo with git-annex encryption (blocker: error while setting up gcrypt remote, fixed by removing the push.sign option, sent patch to spwhitton, so done)

  5. restricted shell, see git-annex-shell:

    command="GIT_ANNEX_SHELL_LIMITED=true git-annex-shell -c \"$SSH_ORIGINAL_COMMAND\"",restrict ssh-rsa AAAAB3NzaC1y[...] user@example.com

    GIT_ANNEX_SHELL_DIRECTORY would be useful, but we have multiple repositories we want to allow, and that, if I read CmdLine.GitAnnexShell.Checks.checkDirectory correctly, is not pattern-based but an exact match (using equalFilePath). (done, see below)

  6. make repositories append-only, which is not currently supported by git-annex (done, see below)

  7. change the encryption key for encrypted repositories so they work unattended. The sticky question here is which key to use: a different subkey, or a whole other keypair? If the latter, how to deal with expiry, propagation, etc.?

  8. setup cronjobs for all repositories (partly done: non-encrypted repositories are part of the manual backup script)

Random git-annex docs

This is how the git-annex repositories were setup at first:

for r in audiobooks books espresso incoming mp3 playlists podcast roms video; do
    git init /mnt/$r
    git -C /srv/$r remote add offsite /mnt/$r
    git -C /srv/$r annex sync
    git -C /srv/$r annex wanted offsite standard
    git -C /srv/$r annex group offsite backup
    git -C /srv/$r annex sync --content
done

Append-only git repositories

On the server, for each repo, disable destructive pushes:

git config receive.denyDeletes true
git config receive.denyNonFastForwards true
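With several repositories on the server, the same settings can be applied in a loop; the /srv/offsite path pattern here is an assumption based on the remote URLs used elsewhere in this page:

```shell
# apply the push restrictions to every offsite repository;
# the /srv/offsite path pattern is an assumption
for r in /srv/offsite/*; do
    git -C "$r" config receive.denyDeletes true
    git -C "$r" config receive.denyNonFastForwards true
done
```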

And force git-annex to be used for that key, in ~/.ssh/authorized_keys:

command="GIT_ANNEX_SHELL_APPENDONLY=true git-annex-shell -c \"$SSH_ORIGINAL_COMMAND\"",restrict ssh-rsa AAAAB3NzaC1y[...] user@example.com

This only works with git-annex 6.20180529 or later.

Then, on the client, generate a key for this purpose:

ssh-keygen -f ~/.ssh/id_rsa.git-annex

Then, in each repo, configure the key:

git config core.sshCommand "ssh -i /home/anarcat/.ssh/id_rsa.git-annex -o IdentitiesOnly=yes"

Unfortunately, because git-annex does not respect git's core.sshCommand setting, we need a dedicated host alias configured in ~/.ssh/config as such:

Host backup-annex
    # special key for git-annex
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_rsa.git-annex

And then change the remote:

git remote set-url offsite backup-annex:/srv/offsite/foo/

Then a cronjob (or the assistant, but I chose the former) can be run to sync changes automatically:

for r in audiobooks books espresso roms mp3 incoming video; do
    echo "syncing $r"
    git -C /srv/$r annex sync --content -J2
done

The problem here is that --quiet is not completely quiet:

$ LANG=C.UTF-8 git annex sync --content --quiet 
On branch master
nothing to commit, working tree clean

git-annex should pass --quiet down to git commit...
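Until that is fixed, one workaround is to wrap the sync in chronic(1) from moreutils in the cronjob, which discards output unless the command fails:

```shell
# chronic only prints the command's output when it exits non-zero,
# keeping cron mail quiet on success
chronic git -C /srv/mp3 annex sync --content
```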

Another problem is that this only works for regular git remotes. This will fail on encrypted remotes, which rely on rsync. A workaround I found was to rely on a feature of the git-shell command, which git-annex-shell calls unless GIT_ANNEX_SHELL_LIMITED is set. That feature allows you to write custom wrappers that get called by git when an unknown command is sent. I wrote this wrapper in ~/git-shell-commands/rsync:


#!/bin/sh
# only allow rsync invocations that look like git-annex traffic;
# the list of allowed argument patterns below is a reconstruction
for i; do
    case "$i" in
        --server|--sender|-*|.|/*.git/*|/*.annex/*)
            : # allowed argument, keep checking
            ;;
        *)
            logger <<EOF
disallowed rsync argument: '$i'
EOF
            exit 1
            ;;
    esac
done

logger <<EOF
passthrough rsync command: '$@'
EOF

exec rsync "$@"

This allows only certain rsync commands through: the normal rsync arguments passed by git-annex, and only paths matching a restricted pattern. Yes, this means an attacker can overwrite any git repository it chooses, but it needs to be a bare git repository (/*.git/*), as there is no way to use gcrypt with append-only repositories, unfortunately.

Encrypted remotes

To set up the encrypted remotes for the pictures, first the git-annex objects:

Photos$ git annex initremote offsite-annex type=rsync rsyncurl=user@example.net:/srv/Photos.annex/ encryption=hybrid keyid=8DC901CE64146C048AD50FBB792152527B75921E
Photos$ git annex sync --content offsite-annex

Then the git objects themselves:

Photos$ git remote add offsite-git gcrypt::rsync://user@example.net:/srv/Photos.git/
Photos$ git annex sync offsite-git

It is still unclear to me why those need to be separate. I first tried a single repo with encryption, as documented on the website, but it turns out this has significant performance problems (e.g. gcrypt remote: every sync uploads a huge manifest), so spwhitton suggested the above approach of splitting the repositories in two.

What I don't understand is why git-annex can't simply encrypt the blobs and pass them down its regular remote structures like bare git repositories. Using rsync creates unnecessary overhead and complex URLs. The user interface on transfers is also far from intuitive:

$ git annex sync --content offsite-annex
On branch master
nothing to commit, working tree clean
copy 1969/12/31/20120415_009.mp4 (checking offsite-annex...) (to offsite-annex...)
sending incremental file list
      1,289,927 100%  199.82MB/s    0:00:00 (xfr#1, to-chk=0/5)

Parallel transfers also don't show progress information. It really feels like encryption is a second-class citizen here. I also feel it will be rather difficult to reconstruct this repository from scratch, and an attempt needed to be made before we could feel confident in our restore capacity. The restore was tested below and seems to work, so we're going ahead with this approach.

Encrypted repos restore procedure

To access the files in a bare-metal restore, the OpenPGP keyring first needs to be extracted from somewhere, of course. Then a blank git repository is created:

git init Photos

And the first git remote is added and fetched:

git remote add origin gcrypt::rsync://user@example.net:/srv/Photos.git/
git fetch origin
git merge origin/master

Then the object store is added and fetched:

git annex enableremote offsite-annex type=rsync rsyncurl=user@example.net:/srv/Photos.annex/ encryption=hybrid keyid=8DC901CE64146C048AD50FBB792152527B75921E
git annex get --from offsite-annex

The first line is critical: using initremote instead might create a new encryption key instead of reusing the existing one.
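After such a restore, the repository can be checked against the remote to build confidence in the procedure:

```shell
# verify that the restored content matches what the encrypted
# remote claims to have
git annex fsck --from offsite-annex
```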



Once we figure out git-annex, the following pages need to be updated:
