KVM guests with emulated SSD and NVMe drives

Sometimes when you’re using KVM guests to test something, such as a Ceph or OpenStack Swift cluster, it can be useful to have SSD and NVMe drives. I’m not talking about passing physical drives through, but rather emulating them.

NVMe drives

QEMU supports emulating NVMe drives via command-line arguments, but this isn’t yet exposed to tools like virt-manager. That means you can’t simply add a new drive of type nvme to your virtual machine XML definition; however, you can pass those qemu arguments through your XML. It also means the NVMe drives will not show up as drives in tools like virt-manager, even after you’ve added them with qemu. Still, it’s fun to play with!

QEMU command line args for NVMe

Michael Moese has nicely documented how to do this on his blog. Basically, after creating a disk image (raw or qcow2), you add the following two arguments to the qemu command. I use a numeric drive id and serial so that I can add multiple NVMe drives (just duplicate the lines and increment the numbers).

-drive file=/path/to/nvme1.img,if=none,id=NVME1 \
-device nvme,drive=NVME1,serial=nvme-1
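
Those arguments assume the backing disk image already exists. As a quick sketch (the path and size here are examples, not from the post), a sparse raw image can be created with truncate, or equivalently with qemu-img create -f raw:

```shell
# Create a sparse 10G raw image to back an emulated NVMe drive.
# qemu-img create -f raw /tmp/nvme1.img 10G would do the same thing;
# truncate is used here so the sketch runs without qemu installed.
# The path is illustrative only.
truncate -s 10G /tmp/nvme1.img
ls -lh /tmp/nvme1.img
```

The file is sparse, so it consumes almost no space on the host until the guest writes to it.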

libvirt XML definition for NVMe

To add NVMe to a libvirt guest, add something like this at the bottom of your virtual machine definition (just before the closing </domain> tag) to pass those same qemu args. Note that the qemu:commandline element requires the qemu XML namespace, so make sure the opening domain tag reads <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>.

  <qemu:commandline>
    <qemu:arg value='-drive'/>
    <qemu:arg value='file=/path/to/nvme1.img,format=raw,if=none,id=NVME1'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='nvme,drive=NVME1,serial=nvme-1'/>
  </qemu:commandline>

virt-install for NVMe

If you’re spinning up VMs using virt-install, you can also pass these in as arguments, which will automatically populate the libvirt XML file with the section above. As before, you do not add a --disk option for NVMe drives.

--qemu-commandline='-drive file=/path/to/nvme1.img,format=raw,if=none,id=NVME1'
--qemu-commandline='-device nvme,drive=NVME1,serial=nvme-1'
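
For context, here’s a sketch of where those options sit in a full virt-install invocation. The guest name, memory, disk sizes and paths are illustrative assumptions only, not from the post:

```shell
# Illustrative only: the name, sizes and paths are assumptions.
# The two --qemu-commandline options are the ones that matter here;
# note there is no --disk entry for the NVMe image itself.
virt-install \
  --name testvm \
  --memory 2048 \
  --vcpus 2 \
  --disk size=20 \
  --import \
  --qemu-commandline='-drive file=/path/to/nvme1.img,format=raw,if=none,id=NVME1' \
  --qemu-commandline='-device nvme,drive=NVME1,serial=nvme-1'
```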

Confirming drive is NVMe

Your NVMe drives will show up as specific devices under Linux, such as /dev/nvme0n1, and of course you can see them with tools like lsblk and nvme (from the nvme-cli package).

Here’s nvme tool listing the NVMe drive in a guest.

sudo nvme list

This should return something that looks like this.

Node          SN      Model           Namespace Usage                   Format         FW Rev  
------------- ------- --------------- --------- ----------------------- -------------- ------
/dev/nvme0n1  nvme-1  QEMU NVMe Ctrl  1         107.37  GB / 107.37  GB 512   B +  0 B 1.0

SSD drives

SSD drives are slightly different. Simply add a drive to your guest as you normally would, on the bus you want to use (for example, SCSI or SATA). Then add the required set argument to change the device’s rotation rate, which makes it an SSD. Note that you set rotation_rate to 1 in qemu, which Linux then reports as 0 (non-rotational).

This does require knowing the qemu name of the device, which will depend on how many drives of that type you add. It generally follows a predictable format: scsi0-0-0-0 for the first SCSI drive on the first SCSI controller, and sata0-0-0 for the first SATA drive, but it’s good to confirm.

You can determine the exact name for your drive by querying the guest with virsh qemu-monitor-command, like so (where 1 is the guest’s domain ID; the domain name also works).

virsh qemu-monitor-command --hmp 1 "info qtree"

This will provide details showing the devices, buses and connected drives. Here’s an example for the first SCSI drive, where you can see it’s scsi0-0-0-0.

                  dev: scsi-hd, id "scsi0-0-0-0"
                    drive = "drive-scsi0-0-0-0"
                    logical_block_size = 512 (0x200)
                    physical_block_size = 512 (0x200)
                    min_io_size = 0 (0x0)
                    opt_io_size = 0 (0x0)
                    discard_granularity = 4096 (0x1000)
                    write-cache = "on"
                    share-rw = false
                    rerror = "auto"
                    werror = "auto"
                    ver = "2.5+"
                    serial = ""
                    vendor = "QEMU"
                    product = "QEMU HARDDISK"
                    device_id = "drive-scsi0-0-0-0"
                    removable = false
                    dpofua = false
                    wwn = 0 (0x0)
                    port_wwn = 0 (0x0)
                    port_index = 0 (0x0)
                    max_unmap_size = 1073741824 (0x40000000)
                    max_io_size = 2147483647 (0x7fffffff)
                    rotation_rate = 1 (0x1)
                    scsi_version = 5 (0x5)
                    cyls = 16383 (0x3fff)
                    heads = 16 (0x10)
                    secs = 63 (0x3f)
                    channel = 0 (0x0)
                    scsi-id = 0 (0x0)
                    lun = 0 (0x0)
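
The qtree output is long, so it can help to filter it down to just the device ids. A small sketch: here the monitor output is stubbed in with a heredoc so it runs anywhere, but on a real host you would pipe in the output of virsh qemu-monitor-command --hmp <domain> "info qtree" instead.

```shell
# Pull just the device ids out of "info qtree"-style output.
# The sample lines below stand in for the real monitor output.
ids=$(grep -oE 'id "[^"]+"' <<'EOF' | tr -d '"' | sed 's/^id //'
  dev: scsi-hd, id "scsi0-0-0-0"
    drive = "drive-scsi0-0-0-0"
  dev: ide-hd, id "sata0-0-0"
EOF
)
echo "$ids"
# prints:
# scsi0-0-0-0
# sata0-0-0
```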

QEMU command for SSD drive

When using qemu, add your drive as usual and then add the set option. Using the SCSI drive example from above (which is on scsi0-0-0-0), this is what it would look like.

-set device.scsi0-0-0-0.rotation_rate=1

libvirt XML definition for SSD drive

Similarly, for a defined guest, add the set argument as we did for the NVMe drives, at the bottom of the XML before the closing </domain> tag.

  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.scsi0-0-0-0.rotation_rate=1'/>
  </qemu:commandline>

If your machine also has NVMe drives specified, just add the set args to the existing qemu:commandline section rather than adding a second one. It should look something like this.

  <qemu:commandline>
    <qemu:arg value='-set'/>
    <qemu:arg value='device.scsi0-0-0-0.rotation_rate=1'/>
    <qemu:arg value='-drive'/>
    <qemu:arg value='file=/var/lib/libvirt/images/rancher-vm-centos-7-00-nvme.qcow2,format=qcow2,if=none,id=NVME1'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='nvme,drive=NVME1,serial=nvme-1'/>
  </qemu:commandline>

virt-install command for SSD drive

When spinning up a machine using virt-install, add the drive as normal. The only extra step is the argument for the qemu set command. Here’s that same SCSI example.

--qemu-commandline='-set device.scsi0-0-0-0.rotation_rate=1'

Confirming drive is an SSD

You can confirm the rotational speed with lsblk, like so.

sudo lsblk -d -o name,rota

This will return either 0 (non-rotational, meaning SSD) or 1 (rotational, meaning non-SSD). For example, here’s a bunch of drives on a KVM guest where you can see /dev/sda and /dev/nvme0n1 are both SSDs.

NAME    ROTA
sda        0
sdb        1
sr0        1
vda        1
nvme0n1    0
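
That ROTA check is easy to script, too. A small sketch: the lsblk output is stubbed in below so it runs anywhere, but on a real guest you would pipe in the output of lsblk -d -n -o name,rota instead.

```shell
# Print only the non-rotational (ROTA=0) devices, i.e. the SSDs.
# The sample lines stand in for real `lsblk -d -n -o name,rota` output.
ssds=$(awk '$2 == 0 {print $1}' <<'EOF'
sda        0
sdb        1
sr0        1
vda        1
nvme0n1    0
EOF
)
echo "$ssds"
# prints:
# sda
# nvme0n1
```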

You can also check with smartctl, which reports the Rotation Rate as Solid State Device. Here’s an example for /dev/sda, which is set to be an SSD in the KVM guest.

smartctl -i /dev/sda

This shows a result like the following; note that the Rotation Rate is Solid State Device.

=== START OF INFORMATION SECTION ===
Vendor:               QEMU
Product:              QEMU HARDDISK
Revision:             2.5+
Compliance:           SPC-3
User Capacity:        107,374,182,400 bytes [107 GB]
Logical block size:   512 bytes
LU is thin provisioned, LBPRZ=0
Rotation Rate:        Solid State Device
Device type:          disk
Local Time is:        Wed Dec 18 17:52:18 2019 AEDT
SMART support is:     Unavailable - device lacks SMART capability.

So that’s it! Thanks to QEMU you can play with NVMe and SSD drives in your guests.

13 thoughts on “KVM guests with emulated SSD and NVMe drives”

  1. Mr.Pablo

    I’ve tried to follow your steps for the NVMe drive and get the error
    ” ‘nvme’ is not a valid device model name ”
    Can you help?

  2. Chris Post author

    Does your NVMe definition for id, serial and drive all have a number like below?

    --qemu-commandline='-drive file=/path/to/nvme1.img,format=raw,if=none,id=NVME1'
    --qemu-commandline='-device nvme,drive=NVME1,serial=nvme-1'

    If they do, can you edit your VM XML file and set the domain at the top to this:

    <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

  3. Tobi

    Hi!

    Thanks for this guide, I found it googling for ‘libvirt nvme’. I tried it on Debian and got a permissions error:

    error: internal error: qemu unexpectedly closed the monitor: 2020-06-06T09:04:38.636146Z qemu-system-x86_64: -drive file=/var/lib/libvirt/images/nvme.qcow2,if=none,format=qcow2,id=NVME: Could not open ‘/var/lib/libvirt/images/nvme.qcow2’: Permission denied

    For anyone with the same problem:
    I found your ansible role in GitHub (https://github.com/csmart/ansible-role-virt-infra) and looked through it. Apparently this has to do with apparmor permissions for the NVMe disk. The solution (taken from your ansible tasks) is to add a line to /etc/apparmor.d/abstractions/libvirt-qemu:

    /var/lib/libvirt/images/*nvme.qcow2 rwk,

    And then restart apparmor (systemctl restart apparmor.service).
    Of course the images must be named accordingly to match that rule.

  4. Dave

    @Chris it would appear that if you run qemu-kvm -device help with v4.2.0 there is no nvme listed as a valid device model. Are you using a newer version?

  5. Chris Post author

    Hi Dave, I do see it with qemu-kvm 4.2.0 on Fedora 32. Maybe your binary did not have it enabled at compile time or something?

    $ qemu-kvm -device help |grep -i nvme
    name "nvme", bus PCI, desc "Non-Volatile Memory Express"

    $ qemu-kvm -version
    QEMU emulator version 4.2.0 (qemu-4.2.0-7.fc32)
    Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

  6. Dave

    Appreciate the response, do you have any documentation on how to compile w/ nvme support? Thanks again!

  7. Mike

    How does the emulator stack up against native NVMe? I hear some horror stories that the driver causes corruption and write speeds are very poor. Would be good to see some numbers if poss.

  8. Chris Post author

    Hey Mike,

    I don’t know about corruption, sounds more like it’d be related to a poor NVMe drive to me.

    In terms of speed, I ran the following fio test:

    sudo fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/mnt/randrw --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

    I get the following random read/write results to NVMe on my host

    read: IOPS=264k, BW=1032MiB/s (1082MB/s)(3070MiB/2976msec)
    write: IOPS=88.3k, BW=345MiB/s (362MB/s)(1026MiB/2976msec); 0 zone resets

    A Fedora 31 guest on the same host with qcow2 disk gets the following:

    read: IOPS=98.7k, BW=386MiB/s (404MB/s)(3070MiB/7960msec)
    write: IOPS=32.0k, BW=129MiB/s (135MB/s)(1026MiB/7960msec); 0 zone resets

    So seems pretty good to me… 🙂

  9. Michael

    @Chris well that looks pretty darn good. Wonder how PCIe4 nvme drives will perform.

  10. Dave

    @Chris

    With the whole disk type=’nvme’ that was added to libvirt, am I missing something or does libvirt STILL not allow you to define an EMULATED NVMe disk as part of the “Devices” section of the XML spec? I just don’t see any examples of this functionality.

  11. Chris Post author

    Correct, you can’t specify an NVMe disk that way, yet. You have to do what I’ve done in this post.
