Tag Archive for 'server'

Btrfs RAID 6 on dm-crypt on Fedora (post updated)

Update 2016-08-26: A nasty bug was found in the RAID5/6 Btrfs parity calculation, so I recommend using RAID 10 for now. Where I use raid6 below you may want to change this to raid10. See this post for how to migrate to RAID 10.

I’m building a NAS and given the spare drives I have at the moment, thought I’d have a play with Btrfs. Apparently RAID 6 is relatively safe now (update: turns out, it’s not), so why not put it through its paces? As Btrfs doesn’t support encryption, I will need to build it on top of dm-crypt.

Boot drive:

  • /dev/sda

Data drives:

  • /dev/sdb
  • /dev/sdc
  • /dev/sdd
  • /dev/sde
  • /dev/sdf

I installed Fedora 24 Server onto /dev/sda and just went from there, opening a root shell.

# Install the btrfs and crypt packages (if not already there) so that this will actually work.
dnf install -y btrfs-progs cryptsetup

WARNING WARNING WARNING
The following cryptsetup commands will wipe any drives you specify below. Please make sure you are specifying the correct drives.

# Setup dm-crypt on each data drive
# and populate the crypttab file.
for x in b c d e f ; do
  cryptsetup luksFormat /dev/sd${x}
  UUID="$(cryptsetup luksUUID /dev/sd${x})"
  echo "luks-${UUID} UUID=${UUID} none" >> /etc/crypttab
done
 
# Rebuild the initial ramdisk with crypt support
# (dracut expects the module list to be quoted and padded with spaces)
echo 'add_dracutmodules+=" crypt "' >> /etc/dracut.conf.d/crypt.conf
dracut -fv
 
# Verify that it now has my crypttab
lsinitrd /boot/initramfs-$(uname -r).img |grep crypttab
 
# Reboot and verify initramfs prompts to unlock the devices
reboot
 
# After boot, verify devices exist
ls -l /dev/mapper/luks*
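
Given the warning above, it can be worth printing the destructive commands before running them for real. This is a minimal sketch: preview_luks_format is a hypothetical helper name (not part of cryptsetup), and it only echoes what the luksFormat loop would do for the drive letters you pass in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the luksFormat commands instead of running them,
# so the list of target drives can be reviewed before anything is wiped.
preview_luks_format() {
  local x
  for x in "$@"; do
    printf 'cryptsetup luksFormat /dev/sd%s\n' "$x"
  done
}

# Same drive letters as the loop above; nothing is touched on disk.
preview_luks_format b c d e f
```

If the list matches the data drives (and not the boot drive, /dev/sda), the real loop can be run with more confidence.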

OK, so now that I have a bunch of encrypted disks, it’s time to put btrfs into action (note the label, btrfs_data):
# Get LUKS UUIDs and create btrfs raid filesystem
# (start from an empty list so a leftover DEVICES value can't sneak in)
DEVICES=""
for x in b c d e f ; do
  DEVICES="${DEVICES} $(cryptsetup luksUUID /dev/sd${x}\
    |sed 's|^|/dev/mapper/luks-|g')"
done
mkfs.btrfs -L btrfs_data -m raid6 -d raid6 ${DEVICES}
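
Since the update at the top suggests raid10 instead of raid6, here is a rough sketch of a helper that builds the mkfs.btrfs command for either profile and prints it for review. The name btrfs_mkfs_cmd is made up for this example, the device paths are placeholders, and the minimum device counts (3 for raid6, 4 for raid10) are assumptions that should be checked against mkfs.btrfs(8) for your btrfs-progs version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build (but do not run) the mkfs.btrfs command line.
# The minimum device counts are assumptions; verify them in mkfs.btrfs(8).
btrfs_mkfs_cmd() {
  local profile="$1"
  shift
  local min
  case "$profile" in
    raid6)  min=3 ;;
    raid10) min=4 ;;
    *) echo "unsupported profile: $profile" >&2; return 1 ;;
  esac
  if [ "$#" -lt "$min" ]; then
    echo "$profile needs at least $min devices, got $#" >&2
    return 1
  fi
  echo "mkfs.btrfs -L btrfs_data -m $profile -d $profile $*"
}

# Placeholder mapper names; in practice pass the contents of ${DEVICES}.
btrfs_mkfs_cmd raid10 /dev/mapper/luks-a /dev/mapper/luks-b \
  /dev/mapper/luks-c /dev/mapper/luks-d
```

With the real device list this would be called as btrfs_mkfs_cmd raid10 ${DEVICES}, and the printed command run by hand once it looks right.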

See all our current btrfs volumes:
btrfs fi show

Get the UUID of the filesystem so that we can create an entry in fstab, using the label we created before:
UUID=$(btrfs fi show btrfs_data |grep uuid |awk '{print $4}')
echo "UUID=${UUID} /mnt/btrfs_data btrfs noatime,subvolid=0 0 0"\
  >> /etc/fstab
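
The grep and awk pipeline above relies on the UUID being the fourth field. A slightly more defensive sketch looks for the field that follows "uuid:" instead. The function name btrfs_uuid_from_show is hypothetical, and the sample line and its UUID are invented purely to demonstrate the parsing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "btrfs fi show" output on stdin and print the
# value that follows the first "uuid:" field, wherever it sits on the line.
btrfs_uuid_from_show() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "uuid:") { print $(i + 1); exit } }'
}

# Invented sample line, for illustration only.
sample="Label: 'btrfs_data'  uuid: 01234567-89ab-cdef-0123-456789abcdef"
printf '%s\n' "$sample" | btrfs_uuid_from_show
```

On the real system this would be used as UUID=$(btrfs fi show btrfs_data | btrfs_uuid_from_show).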

Now, let’s create the mountpoint and mount the device:
mkdir /mnt/btrfs_data
mount -a

Check data usage:
btrfs filesystem df /mnt/btrfs_data/

This has mounted the root of the filesystem at /mnt/btrfs_data; however, we can also create subvolumes. Let’s create one called “share” for shared network data:
btrfs subvolume create /mnt/btrfs_data/share

You can mount this specific subvolume directly. Let’s add it to fstab:
echo "UUID=${UUID} /mnt/btrfs_share btrfs noatime,subvol=share 0 0"\
  >> /etc/fstab
mkdir /mnt/btrfs_share
mount /mnt/btrfs_share
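
One thing to watch with echo ... >> /etc/fstab is that running it twice leaves a duplicate entry. As a sketch of one way around that, the helper below appends a line only if the exact same line is not already in the file. append_once is a hypothetical name, and the demo writes to a temporary file rather than the real fstab.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (grep -x matches whole lines, -F means no regex).
append_once() {
  local line="$1" file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file; "UUID=example" is a placeholder value.
demo="$(mktemp)"
entry="UUID=example /mnt/btrfs_share btrfs noatime,subvol=share 0 0"
append_once "$entry" "$demo"
append_once "$entry" "$demo"
cat "$demo"   # the entry appears once, even though it was added twice
rm -f "$demo"
```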

You can list subvolumes easily by referencing our mounted Btrfs volume:
btrfs subvolume list -p /mnt/btrfs_data/

If you want to delete a subvolume, first unmount it, then remove it from fstab, delete the Btrfs subvolume and finally remove the mount point.
umount /mnt/btrfs_share
sed -i /btrfs_share/d /etc/fstab
btrfs subvolume delete /mnt/btrfs_data/share
rmdir /mnt/btrfs_share
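
The sed pattern above deletes every fstab line that contains "btrfs_share" anywhere, which would also catch a mount point such as /mnt/btrfs_share_old. A more cautious sketch matches the mount point field exactly and prints the result for review instead of editing in place. fstab_without_mount is a hypothetical name, and the demo uses a scratch file with placeholder entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines of an fstab-style file whose second
# field (the mount point) is not exactly the one given.
fstab_without_mount() {
  local mountpoint="$1" file="$2"
  awk -v mp="$mountpoint" '$2 != mp' "$file"
}

# Demo on a scratch file; both entries are placeholders.
demo="$(mktemp)"
printf '%s\n' \
  'UUID=example /mnt/btrfs_share btrfs noatime,subvol=share 0 0' \
  'UUID=example /mnt/btrfs_share_old btrfs noatime,subvol=old 0 0' > "$demo"
fstab_without_mount /mnt/btrfs_share "$demo"   # keeps only the btrfs_share_old line
rm -f "$demo"
```

The output can be redirected to a new file and compared with the original before replacing anything under /etc.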

Now I plugged in a few backup drives and started rsyncing a few TB across to the device. It seemed to work well!

There are lots of other things you can play with, like snapshots, compression, defragmentation, scrub (uses checksums to repair corrupt data), rebalance (re-allocates blocks across devices), etc. You can even convert existing file systems with the btrfs-convert command, and use rebalance to change the RAID level. Neat!

Then I thought I’d try the rebalance command just to see how that works with a RAID device. Given it’s a large device, I kicked it off and went to do something else. I returned to a machine that wouldn’t wake up… After hard-resetting it, journalctl -b -1 told me this sad story:

Nov 14 06:03:42 localhost.localdomain kernel: ------------[ cut here ]------------
Nov 14 06:03:42 localhost.localdomain kernel: kernel BUG at fs/btrfs/extent-tree.c:1833!
Nov 14 06:03:42 localhost.localdomain kernel: invalid opcode: 0000 [#1] SMP
Nov 14 06:03:42 localhost.localdomain kernel: Modules linked in: fuse joydev synaptics_usb uas usb_storage rfcomm cmac nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtab
Nov 14 06:03:42 localhost.localdomain kernel: snd_soc_core snd_hda_codec rfkill snd_compress snd_hda_core snd_pcm_dmaengine ac97_bus snd_hwdep snd_seq snd_seq_device snd_pcm mei_me dw_dmac i2c_designware_platform snd_timer snd_soc_sst_a
Nov 14 06:03:42 localhost.localdomain kernel: CPU: 0 PID: 6274 Comm: btrfs Not tainted 4.2.5-300.fc23.x86_64 #1
Nov 14 06:03:42 localhost.localdomain kernel: Hardware name: Gigabyte Technology Co., Ltd. Z97N-WIFI/Z97N-WIFI, BIOS F5 12/08/2014
Nov 14 06:03:42 localhost.localdomain kernel: task: ffff88006fd69d80 ti: ffff88000e344000 task.ti: ffff88000e344000
Nov 14 06:03:42 localhost.localdomain kernel: RIP: 0010:[] [] insert_inline_extent_backref+0xe7/0xf0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: RSP: 0018:ffff88000e3476a8 EFLAGS: 00010293
Nov 14 06:03:42 localhost.localdomain kernel: RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: RDX: ffff880000000000 RSI: 0000000000000001 RDI: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: RBP: ffff88000e347728 R08: 0000000000004000 R09: ffff88000e3475a0
Nov 14 06:03:42 localhost.localdomain kernel: R10: 0000000000000000 R11: 0000000000000002 R12: ffff88021522f000
Nov 14 06:03:42 localhost.localdomain kernel: R13: ffff88013f868480 R14: 0000000000000000 R15: 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: FS: 00007f66268a08c0(0000) GS:ffff88021fa00000(0000) knlGS:0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 06:03:42 localhost.localdomain kernel: CR2: 000055a79c7e6fd0 CR3: 00000000576ce000 CR4: 00000000001406f0
Nov 14 06:03:42 localhost.localdomain kernel: Stack:
Nov 14 06:03:42 localhost.localdomain kernel: 0000000000000000 0000000000000005 0000000000000001 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: 0000000000000001 ffffffff81200176 0000000000270026 ffffffffa0925d4a
Nov 14 06:03:42 localhost.localdomain kernel: 0000000000002158 00000000a7c0ba4c ffff88021522d800 0000000000000000
Nov 14 06:03:42 localhost.localdomain kernel: Call Trace:
Nov 14 06:03:42 localhost.localdomain kernel: [] ? kmem_cache_alloc+0x1d6/0x210
Nov 14 06:03:42 localhost.localdomain kernel: [] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] __btrfs_inc_extent_ref.isra.52+0xa9/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] __btrfs_run_delayed_refs+0xc84/0x1080 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_run_delayed_refs.part.73+0x74/0x270 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] ? btrfs_release_path+0x2b/0xa0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_run_delayed_refs+0x15/0x20 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_commit_transaction+0x56/0xad0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] prepare_to_merge+0x1fe/0x210 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] relocate_block_group+0x25e/0x6b0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_relocate_block_group+0x1ca/0x2c0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_relocate_chunk.isra.39+0x3e/0xb0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_balance+0x9c4/0xf80 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_ioctl_balance+0x3c4/0x3d0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] btrfs_ioctl+0x541/0x2750 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: [] ? lru_cache_add+0x1c/0x50
Nov 14 06:03:42 localhost.localdomain kernel: [] ? lru_cache_add_active_or_unevictable+0x32/0xd0
Nov 14 06:03:42 localhost.localdomain kernel: [] ? handle_mm_fault+0xc8a/0x17d0
Nov 14 06:03:42 localhost.localdomain kernel: [] ? cp_new_stat+0xb3/0x190
Nov 14 06:03:42 localhost.localdomain kernel: [] do_vfs_ioctl+0x295/0x470
Nov 14 06:03:42 localhost.localdomain kernel: [] ? selinux_file_ioctl+0x4d/0xc0
Nov 14 06:03:42 localhost.localdomain kernel: [] SyS_ioctl+0x79/0x90
Nov 14 06:03:42 localhost.localdomain kernel: [] ? do_page_fault+0x2f/0x80
Nov 14 06:03:42 localhost.localdomain kernel: [] entry_SYSCALL_64_fastpath+0x12/0x71
Nov 14 06:03:42 localhost.localdomain kernel: Code: 10 49 89 d9 48 8b 55 c0 4c 89 7c 24 10 4c 89 f1 4c 89 ee 4c 89 e7 89 44 24 08 48 8b 45 20 48 89 04 24 e8 5d d5 ff ff 31 c0 eb ac <0f> 0b e8 92 b7 76 e0 66 90 0f 1f 44 00 00 55 48 89 e5
Nov 14 06:03:42 localhost.localdomain kernel: RIP [] insert_inline_extent_backref+0xe7/0xf0 [btrfs]
Nov 14 06:03:42 localhost.localdomain kernel: RSP
Nov 14 06:03:42 localhost.localdomain kernel: ---[ end trace 63b75c57d2feac56 ]---

Bummer!

Looks like rebalance has a major bug at the moment. A quick search shows that others have the same problem, and it looks like I’m hitting this bug. I’ve reported it on Fedora Bugzilla.
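
When searching for a matching bug report, the most useful bit of a trace like the one above is usually the "kernel BUG at" location. Here is a small sketch that pulls it out of journal output; kernel_bug_location is a hypothetical helper name, and the demo feeds it the line from the log above rather than a live journal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read journal text on stdin and print the file:line
# from any "kernel BUG at <file>:<line>!" message.
kernel_bug_location() {
  sed -n 's/.*kernel BUG at \(.*\)!$/\1/p'
}

# Demo using the line from the trace above.
line='Nov 14 06:03:42 localhost.localdomain kernel: kernel BUG at fs/btrfs/extent-tree.c:1833!'
printf '%s\n' "$line" | kernel_bug_location   # prints fs/btrfs/extent-tree.c:1833
```

On the affected machine this could be fed from the previous boot's kernel messages with journalctl -k -b -1 | kernel_bug_location.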

Anyway, so I won’t do a rebalance at the moment, but other than that, btrfs seems pretty neat. I will make sure I keep my backups up-to-date though, just in case…

LTSP 5.2 released, with some impressive features

The Linux Terminal Server Project team have released version 5.2 after two years and almost one thousand commits. It has become one pretty powerful product! I like the ability to “run the whole session remotely or run select applications locally to use specific hardware or advance 3D capabilities.”

Neat.

Script for configuring ClamAV server on Fedora

In short, I’ve written a bash script (available from github) for configuring and removing instances of clamav-server on Fedora. It lets you create and remove individual instances with a specific user and port (if you specify them) and will install the required packages if not already present on the system.

In long, we use Clam AntiVirus as our antivirus protection for Digital Preservation Recorder and talk to it over the default port, 3310.

Installing the clamav-server package under Fedora, however, doesn’t actually set up an instance. In fact, it doesn’t copy any system configuration files into place at all. This means that the system is left without any working ClamAV server out of the box.

Under Fedora, ClamAV server is configured on a per-user basis. This is actually quite important (unless you run as root) because the daemon needs at minimum read access (and, we’ve found, write access too) on the files/directory being passed for scanning.

The instructions on how to configure it are located under /usr/share/doc/clamav-server-[version]/ but I have taken these instructions and written a bash script to configure all of this for you.

The script is available from github. It can create or remove an individual instance of clamav-server using a specific username and port (if you want to specify them, else it defaults to clamav on port 3310). The script will also install any required packages, if you don’t already have them on the system.
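
Once an instance is up, one quick sanity check is to send clamd a PING and look for PONG in reply. The sketch below only covers the reply check, so it can be tried without a running daemon. is_clamd_pong is a hypothetical helper, and the nc invocation in the comment assumes a netcat client is installed and that the instance listens on localhost port 3310.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the first line on stdin is exactly PONG.
is_clamd_pong() {
  local reply
  IFS= read -r reply || true
  [ "$reply" = "PONG" ]
}

# Against a live instance (assumes nc is available and clamd is on port 3310):
#   printf 'nPING\n' | nc localhost 3310 | is_clamd_pong && echo "clamd is up"

# Offline demo with a canned reply.
printf 'PONG\n' | is_clamd_pong && echo "reply looks like clamd"
```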

Hopefully this is useful to someone else out there and not just us 🙂 If you find any bugs feel free to let me know.

More jackalope than jaunty

We have some IBM x3650 servers at work with Adaptec 8k ServeRAID controller cards and SAS drives.

For the life of me I can’t get Jaunty to boot on the machines. It installs just fine, but the initial reboot fails to find the root device and drops me to an “ash” shell which doesn’t ever actually appear. The keyboard also doesn’t work.

It doesn’t matter what RAID array I have, or whether I’m using LVM or a standard partitioning scheme with an msdos partition table… it just doesn’t work.

I’ve added aacraid and several other modules to the initramfs, still no joy.

Add the fact that the machine takes 10 minutes to boot each time I want to test a small change and it’s one super frustrating situation.

Oh, and 8.04 LTS works just fine.

I found one bug, which appears to be a grub issue that I don’t have, that hasn’t been touched since April. There’s another about being unable to find the root device that also hasn’t received any love.

If anyone has some suggestions (install Debian?), let me know. I’m using Ubuntu because we have a local mirror, and Jaunty because it’s a virtual machine and KVM is the way I want to go.

Update: Ahh, problem resolved and it was my fault. See comments.