  1. Installation
    1. Example installations
    2. Issues
      1. Swap
      2. Native encryption
    3. TRIM
  2. Information
  3. Mounts
    1. Mounting
    2. Alternate mountpoints
    3. Encrypted datasets
    4. Deprecated: zfsutil
    5. Cool hack: moving data into ZFS easily
  4. Snapshots
    1. Automated snapshots
    2. Sanoid/syncoid alternatives
      1. zrepl
      2. zfs-auto-snapshot
      3. simplesnap
  5. Caveats
    1. Empty datasets
    2. Not mainline
  6. Other documentation
    1. ZFS documentation

Installation

Example installations

Issues

Swap

Swap on ZFS volumes (AKA "swap on ZVOL") can trigger lockups and that issue is still not fixed upstream. Ubuntu recommends using a separate partition for swap instead. cks would rather have no swap than swap on ZFS and compares it to NFS...

curie was set up without a swap partition (or rather, with the hope of using a ZFS dataset as swap backing), but this has proven to be a generally bad idea. Were we to set up a new ZFS system, we'd use LUKS encryption and a dedicated swap partition, as we had problems with ZFS native encryption as well.
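
For reference, a dedicated swap partition with throwaway encryption keys can be declared in /etc/crypttab and /etc/fstab along these lines (the partition path is hypothetical, adjust to the actual layout):

# /etc/crypttab: re-encrypt the swap partition with a fresh random key at each boot
swap  /dev/sda2  /dev/urandom  swap,cipher=aes-xts-plain64,size=256

# /etc/fstab: use the resulting mapping as swap
/dev/mapper/swap  none  swap  sw  0  0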

Native encryption

ZFS supports native encryption, but there are serious caveats with it.

I've had trouble moving encrypted datasets between pools when trying to move the tubman rpool from HDDs to SSDs. This is a problem many people are facing, without good solutions, see also this truenas discussion, this reddit thread, this openzfs docs thread, and this other one.

Also, native encryption "will not encrypt metadata related to the pool structure, including dataset and snapshot names, dataset hierarchy, properties, file size, file holes, and deduplication tables (though the deduplicated data itself is encrypted)." So it will leak some metadata about the filesystem. Deduplication is limited to the dataset level.

Therefore, it might be better to use LUKS encryption underneath ZFS to configure fully encrypted systems, although I haven't tested this directly.
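 
A rough sketch of that LUKS-under-ZFS layering, untested here and with hypothetical device and pool names, would look like:

# format and open the underlying partition with LUKS
cryptsetup luksFormat /dev/sdb4
cryptsetup open /dev/sdb4 crypt_rpool

# build the pool on top of the decrypted mapping
zpool create rpool /dev/mapper/crypt_rpool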

TRIM

I enabled (a little late) TRIM on the SSD pools:

zfs set org.debian:periodic-trim=enable bpoolssd
zfs set org.debian:periodic-trim=enable rpoolssd

That will set up periodic TRIM runs. It's also possible to set the autotrim pool property, the equivalent of "discard", which "looks for space which has been recently freed, and is no longer allocated by the pool, to be periodically trimmed"; it "does not immediately reclaim blocks after a free", which makes it very effective at the cost of being more likely to encounter tiny ranges.

zpool set autotrim=on bpoolssd
zpool set autotrim=on rpoolssd
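
To confirm the property took effect, it can be read back with the standard zpool get:

zpool get autotrim bpoolssd rpoolssd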

You can do a manual trim with:

zpool trim bpoolssd
zpool trim rpoolssd

Here's an example run:

root@tubman:/etc# zpool status -t rpoolssd
  pool: rpoolssd
 state: ONLINE
  scan: scrub repaired 0B in 00:00:37 with 0 errors on Sun Nov 13 00:24:38 2022
config:

    NAME        STATE     READ WRITE CKSUM
    rpoolssd    ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        sdb4    ONLINE       0     0     0  (untrimmed)
        sdd4    ONLINE       0     0     0  (untrimmed)

errors: No known data errors
root@tubman:/etc# zpool trim rpoolssd
root@tubman:/etc# zpool status -t rpoolssd
  pool: rpoolssd
 state: ONLINE
  scan: scrub repaired 0B in 00:00:37 with 0 errors on Sun Nov 13 00:24:38 2022
config:

    NAME        STATE     READ WRITE CKSUM
    rpoolssd    ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        sdb4    ONLINE       0     0     0  (3% trimmed, started at Wed 16 Nov 2022 12:19:04 PM EST)
        sdd4    ONLINE       0     0     0  (3% trimmed, started at Wed 16 Nov 2022 12:19:04 PM EST)

errors: No known data errors

See also the TRIM documentation in the Debian wiki.

Information

Listing datasets (filesystems and volumes):

zfs list

IO statistics, every second:

zpool iostat 1
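
A few other standard commands give a quick overview of pool health and space usage:

zpool status
zpool list
zfs list -o space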

Mounts

Mounting

After a zfs list, you should see the datasets you can mount. You can mount one by name, for example with:

zfs mount bpool/ROOT/debian

Alternate mountpoints

Note that a dataset gets mounted at its pre-defined mountpoint property. In the above, it was /boot. If you want to change its mountpoint, that can be done on the fly with:

zfs set mountpoint=/mnt/boot bpool/ROOT/debian

If the dataset is already mounted, it will be moved to the new location immediately. Note that the parent pool's altroot property affects this path, as it is prepended to the mountpoint. See zpoolprops(8) for details.
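
To see which path will actually be used, the relevant properties can be read back (same dataset and pool names as in the example above):

zfs get mountpoint,canmount bpool/ROOT/debian
zpool get altroot bpool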

If you are dealing with a new pool that's not yet known to ZFS (e.g. you just added a new drive), you will first need to import it. Typically, you'd also want to do that in an altroot, so that it doesn't override existing mounts, like this:

zpool import -R /mnt POOLNAME

This would import all pools ZFS can find:

zpool import -a -R /mnt

Encrypted datasets

If the dataset is encrypted, however, you first need to unlock it with:

zpool import -l -a
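
If the pool is already imported but the keys are not loaded, the keys can also be loaded (and the datasets mounted) separately; these are standard zfs subcommands, the dataset name below is just an example:

# load all available keys, then mount everything
zfs load-key -a
zfs mount -a

# or load the key and mount a single dataset in one step
zfs mount -l rpool/home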

Deprecated: zfsutil

This is another way to use an alternate mountpoint, although I'm less certain it's still a good idea:

mount -o zfsutil -t zfs bpool/BOOT/debian /mnt

Cool hack: moving data into ZFS easily

I used this procedure to move /srv/sbuild/qemu from a spinning rust drive (BTRFS, on curie) to a ZFS dataset running over NVMe. With other filesystems, this would have required either creating new logical volumes or hacking around with bind mounts. With ZFS, this was the procedure:

zfs create -o mountpoint=none -o canmount=off rpool/srv
zfs create -o mountpoint=/mnt/sbuild rpool/srv/sbuild
mv /srv/sbuild/* /mnt/sbuild/
zfs set mountpoint=/srv/sbuild rpool/srv/sbuild

That's it! You can graft mountpoints like this anywhere, which is powerful and scary!
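
The resulting layout can be double-checked with a recursive listing:

zfs list -r -o name,mountpoint,canmount rpool/srv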

Snapshots

Creating:

zfs snapshot pool/volume@LABEL

Listing:

zfs list -t snapshot

Listing with creation date:

zfs list -t snapshot -o name,creation

Rollback:

zfs rollback pool/volume@LABEL

Destroy:

zfs destroy pool/volume@LABEL

Limiting the number of snapshots:

zfs set snapshot_limit=2 rpool/var/cache

This is useful if you automate snapshot creation (like, say, with sanoid) and you have filesystems that have ridiculous disk usage because of old, useless snapshots.
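
The limit and the current number of snapshots can be checked on the dataset with:

zfs get snapshot_limit,snapshot_count rpool/var/cache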

Automated snapshots

Automatic snapshots are configured with sanoid (see the Puppet code and configuration file).
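
For reference, a minimal sanoid.conf stanza looks roughly like this; the dataset name and retention numbers below are made up for illustration, not our actual configuration:

# snapshot this dataset (and its children) using the "production" template
[rpool/var/cache]
        use_template = production
        recursive = yes

# retention policy: how many snapshots of each kind to keep
[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes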

Sanoid/syncoid alternatives

TODO: we're considering alternatives to sanoid/syncoid.

While reading the code to implement a --dryrun argument for syncoid, I found it to have some issues: there are large functions, lots of system calls without arrays... It feels a little messy, and hard to audit, review, or work on.

zrepl

zrepl is an interesting alternative. It claims support for native encryption, bandwidth limiting, pull/push operation, and Prometheus monitoring with a provided Grafana dashboard. It's written in Go and is not packaged in Debian.

There's an issue and discussion that give a rough idea of how it differs from sanoid, and a ticket open for a migration guide.

It has no dry run mode.

zfs-auto-snapshot

The zfs-auto-snapshot upstream is possibly dead, or at least looking for volunteers, so it's probably not an option.

simplesnap

Goerzen's simplesnap is another option. It's a pair of fairly short shell scripts (~600 lines total) that send snapshots to a backup host. It's unclear if it handles encryption any better than the other tools; it's fairly minimalist.

Packaged in Debian.

Caveats

Empty datasets

You can sometimes end up in odd situations when mounting datasets. In the tubman install, /var was a valid dataset, but it had canmount=off, so it wasn't actually used.

This meant that the data in /var was actually in the rpool/ROOT/debian dataset, mounted on /. I mistakenly flipped the canmount flag back to on, which mounted the empty dataset over /var, shadowing the existing data and basically emptying /var.
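
Before flipping canmount on a dataset like that, it's worth checking which dataset actually backs the directory; assuming the dataset is called rpool/var, something like:

# what the dataset thinks it should do
zfs get canmount,mountpoint rpool/var

# what is actually mounted on /var right now
findmnt /var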

There's also some evidence that a pre-existing directory at a ZFS dataset's mountpoint will shadow the dataset, which is the reverse of what one would normally expect from a filesystem. According to this discussion:

Yes you need to delete the directory -- if it exists, it cannot be mounted there.

In other words, if you have a directory called /mnt/foo and you have a dataset pool/foo configured to mount on /mnt/foo:

zfs mount pool/foo

will show /mnt/foo as empty, because the pre-existing /mnt/foo directory shadows the dataset. The solution is to unmount the dataset, remove (or rename, if not empty) the directory, and remount the dataset:

zfs umount pool/foo
rmdir /mnt/foo || mv /mnt/foo /mnt/foo.bak
zfs mount pool/foo

Not mainline

ZFS is still not mainline, and will likely never be.

It should be possible, however, to ship Debian binary packages for ZFS. Apparently, a binary module package can be built directly with this magic command:

dkms mkbmdeb zfs/2.0.3

See also this idea in grml and this packaging attempt.

Also note that Ubuntu actually ships binary packages for ZFS and questions the licensing incompatibility claims.

Other documentation

ZFS documentation
