Btrfs RAID 6 on dm-crypt on Fedora (post updated)

Update 2016-08-26: A nasty bug was found in the RAID5/6 Btrfs parity calculation, so I recommend using RAID 10 for now. Where I use raid6 below you may want to change this to raid10. See this post for how to migrate to RAID 10.

I’m building a NAS and given the spare drives I have at the moment, thought I’d have a play with Btrfs. Apparently RAID 6 is relatively safe now (update: turns out, it’s not), so why not put it through its paces? As Btrfs doesn’t support encryption, I will need to build it on top of dm-crypt.

Boot drive:

  • /dev/sda

Data drives:

  • /dev/sdb
  • /dev/sdc
  • /dev/sdd
  • /dev/sde
  • /dev/sdf

I installed Fedora 24 Server onto /dev/sda and just went from there, opening a root shell.

# Install the btrfs and crypt packages (if not already there) so that this will actually work.
dnf install -y btrfs-progs cryptsetup

WARNING WARNING WARNING
The following cryptsetup commands will wipe any drives you specify below. Please make sure you are specifying the correct drives.

# Setup dm-crypt on each data drive
# and populate the crypttab file.
for x in b c d e f ; do
  cryptsetup luksFormat /dev/sd${x}
  UUID="$(cryptsetup luksUUID /dev/sd${x})"
  echo "luks-${UUID} UUID=${UUID} none" >> /etc/crypttab
done
 
# Rebuild the initial ramdisk with crypt support
echo "add_dracutmodules+=crypt" >> /etc/dracut.conf.d/crypt.conf
dracut -fv
 
# Verify that it now has my crypttab
lsinitrd /boot/initramfs-$(uname -r).img |grep crypttab
 
# Reboot and verify initramfs prompts to unlock the devices
reboot
 
# After boot, verify devices exist
ls -l /dev/mapper/luks*

OK, so now I have a bunch of encrypted disks, it’s time to put btrfs into action (note the label, btrfs_data):
# Get LUKS UUIDs and create btrfs raid filesystem
for x in b c d e f ; do
  DEVICES="${DEVICES} $(cryptsetup luksUUID /dev/sd${x}\
    |sed 's|^|/dev/mapper/luks-|g')"
done
mkfs.btrfs -L btrfs_data -m raid6 -d raid6 ${DEVICES}

See all our current btrfs volumes:
btrfs fi show

Get the UUID of the filesystem so that we can create an entry in fstab, using the label we created before:
UUID=$(btrfs fi show btrfs_data |grep uuid |awk '{print $4}')
echo "UUID=${UUID} /mnt/btrfs_data btrfs noatime,subvolid=0 0 0"\
  >> /etc/fstab

Now, let’s create the mountpoint and mount the device:
mkdir /mnt/btrfs_data
mount -a

Check data usage:
btrfs filesystem df /mnt/btrfs_data/

This has mounted the root of the filesystem to /mnt/btrfs_data, however we can also create subvolumes. Let’s create one called “share” for shared network data:
btrfs subvolume create /mnt/btrfs_data/share

You can mount this specific volume directly, let’s add it to fstab:
echo "UUID=${UUID} /mnt/btrfs_share btrfs noatime,subvol=share 0 0"\
  >> /etc/fstab
mkdir /mnt/btrfs_share
mount /mnt/btrfs_share

You can list subvolumes easily by referencing our mounted Btrfs volume:
btrfs subvolume list -p /mnt/btrfs_data/

If you want to delete a subvolume, first unmount it, then remove it from fstab, delete the Btrfs subvolume and finally remove the mount point.
umount /mnt/btrfs_share
sed -i /btrfs_share/d /etc/fstab
btrfs subvolume delete /mnt/btrfs_data/share
/mnt/btrfs_share

Now I plugged in a few backup drives and started rsyncing a few TB across to the device. It seemed to work well!

There are lots of other things you can play with, like snapshots, compression, defragment, scrub (use checksums to repair corrupt data), rebalance (re-allocates blocks across devices) etc. You can even convert existing file systems with btrfs-convert command, and use rebalance to change the RAID level. Neat!

Then I thought I’d try the rebalance command just to see how that works with a RAID device. Given it’s a large device, I kicked it off and went to do something else. I returned to an unwakeable machine… hard-resetting, journalctl -b -1 told me this sad story:

Nov 14 06:03:42 localhost.localdomain kernel: ------------[ cut here ]------------
Nov 14 06:03:42 localhost.localdomain kernel: kernel BUG at fs/btrfs/extent-tree.c:1833!
Nov 14 06:03:42 localhost.localdomain kernel: invalid opcode: 0000 [#1] SMP
Nov 14 06:03:42 localhost.localdomain kernel: Modules linked in: fuse joydev synaptics_usb uas usb_storage rfcomm cmac nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtab
Nov 14 06:03:42 localhost.localdomain kernel: snd_soc_core snd_hda_codec rfkill snd_compress snd_hda_core snd_pcm_dmaengine ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm mei_me dw_dmac i2c_designware_platform snd_timer snd_soc_sst_a
Nov 14 06:03:42 localhost.localdomain kernel: CPU: 0 PID: 6274 Comm: btrfs Not tainted 4.2.5-300.fc23.x86_64 #1
Nov 14 06:03:42 localhost.localdomain kernel: Hardware name: Gigabyte Technology Co., Ltd. Z97N-WIFI/Z97N-WIFI, BIOS F5 12/08/2014
Nov 14 06:03:42 localhost.localdomain kernel: task: ffff88006fd69d80 ti: ffff88000e344000 task.ti: ffff88000e344000
Nov 14 06:03:42 localhost.localdomain kernel: RIP: 0010:[] [] insert_inline_extent_backref+0xe7/0xf0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: RSP: 0018:ffff88000e3476a8 EFLAGS: 00010293
Nov 14 06:03:42 localhost.localdomain kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: RDX: ffff880000000000 RSI: 0000000000000001 RDI: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: RBP: ffff88000e347728 R08: 0000000000004000 R09: ffff88000e3475a0
Nov 14 06:03:42 localhost.localdomain kernel: R10: 0000000000000000 R11: 0000000000000002 R12: ffff88021522f000
Nov 14 06:03:42 localhost.localdomain kernel: R13: ffff88013f868480 R14: 0000000000000000 R15: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: FS: 00007f66268a08c0(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 06:03:42 localhost.localdomain kernel: CR2: 000055a79c7e6fd0 CR3: 00000000576ce000 CR4: 00000000001406f0
Nov 14 06:03:42 localhost.localdomain kernel: Stack:
Nov 14 06:03:42 localhost.localdomain kernel: 0000000000000000 0000000000000005 0000000000000001 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: 0000000000000001 ffffffff81200176 0000000000270026 ffffffffa0925d4a
Nov 14 06:03:42 localhost.localdomain kernel: 0000000000002158 00000000a7c0ba4c ffff88021522d800 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: Call Trace:
Nov 14 06:03:42 localhost.localdomain kernel: [] ? kmem_cache_alloc+0x1d6/0x210
Nov 14 06:03:42 localhost.localdomain kernel: [] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] __btrfs_inc_extent_ref.isra.52+0xa9/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] __btrfs_run_delayed_refs+0xc84/0x1080 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_run_delayed_refs.part.73+0x74/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] ? btrfs_release_path+0x2b/0xa0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_run_delayed_refs+0x15/0x20 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_commit_transaction+0x56/0xad0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] prepare_to_merge+0x1fe/0x210 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] relocate_block_group+0x25e/0x6b0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_relocate_block_group+0x1ca/0x2c0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_relocate_chunk.isra.39+0x3e/0xb0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_balance+0x9c4/0xf80 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_ioctl_balance+0x3c4/0x3d0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_ioctl+0x541/0x2750 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] ? lru_cache_add+0x1c/0x50
Nov 14 06:03:42 localhost.localdomain kernel: [] ? lru_cache_add_active_or_unevictable+0x32/0xd0
Nov 14 06:03:42 localhost.localdomain kernel: [] ? handle_mm_fault+0xc8a/0x17d0
Nov 14 06:03:42 localhost.localdomain kernel: [] ? cp_new_stat+0xb3/0x190
Nov 14 06:03:42 localhost.localdomain kernel: [] do_vfs_ioctl+0x295/0x470
Nov 14 06:03:42 localhost.localdomain kernel: [] ? selinux_file_ioctl+0x4d/0xc0
Nov 14 06:03:42 localhost.localdomain kernel: [] SyS_ioctl+0x79/0x90
Nov 14 06:03:42 localhost.localdomain kernel: [] ? do_page_fault+0x2f/0x80
Nov 14 06:03:42 localhost.localdomain kernel: [] entry_SYSCALL_64_fastpath+0x12/0x71
Nov 14 06:03:42 localhost.localdomain kernel: Code: 10 49 89 d9 48 8b 55 c0 4c 89 7c 24 10 4c 89 f1 4c 89 ee 4c 89 e7 89 44 24 08 48 8b 45 20 48 89 04 24 e8 5d d5 ff ff 31 c0 eb ac <0f> 0b e8 92 b7 76 e0 66 90 0f 1f 44 00 00 55 48 89 e5
Nov 14 06:03:42 localhost.localdomain kernel: RIP [] insert_inline_extent_backref+0xe7/0xf0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: RSP
Nov 14 06:03:42 localhost.localdomain kernel: ---[ end trace 63b75c57d2feac56 ]---

Bummer!

Looks like rebalance has a major bug at the moment. I did a search and others have the same problem, looks like I’m hitting this bug. I’ve reported it on Fedora Bugzilla.

Anyway, so I won’t do a rebalance at the moment, but other than that, btrfs seems pretty neat. I will make sure I keep my backups up-to-date though, just in case…

One thought on “Btrfs RAID 6 on dm-crypt on Fedora (post updated)

Leave a Reply

Your email address will not be published. Required fields are marked *