Using Ansible and dynamic inventory to manage OpenStack TripleO nodes

TripleO-based OpenStack deployments use an all-in-one OpenStack node (the undercloud) to automate the build and management of the actual cloud (the overcloud) using native services such as Heat and Ironic. Roles define services and configuration, which are then applied to specific nodes, for example, Service, Compute and CephStorage.

Although the install is automated, sometimes you need to run ad hoc tasks outside of the official update process. For example, you might want to make sure that all hosts are contactable, have a valid subscription (for Red Hat OpenStack Platform), restart containers, or maybe even apply custom changes or patches before an update. Also, during the update process when nodes are being rebooted, it can be useful to use an Ansible script to know when they’ve all come back, all services are running and all containers are healthy, before re-enabling them.

Inventory script

To make this easy, we can use the TripleO Ansible inventory script, which queries the undercloud to get a dynamic inventory of the overcloud nodes. When the script is used as an inventory source with the ansible command, however, you cannot pass arguments to it. If you’re managing a single cluster and using the standard stack name of overcloud, then this is not a problem; you can just call the script directly.
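
One way around the argument limitation is a small wrapper that takes the stack name from the environment. The following is only a minimal sketch, not necessarily how the full post does it: the TRIPLEO_STACK and INVENTORY_BIN variable names are made up for illustration, and it assumes the inventory script's --stack option.

```shell
#!/usr/bin/env bash
# Sketch of an inventory wrapper. Ansible only ever calls an inventory
# script with --list or --host, so the stack name comes from the environment.
# TRIPLEO_STACK and INVENTORY_BIN are hypothetical names used for illustration.

tripleo_inventory() {
    local stack="${TRIPLEO_STACK:-overcloud}"               # default stack name
    local bin="${INVENTORY_BIN:-tripleo-ansible-inventory}" # real script by default
    # Pass through whatever Ansible asked for (--list or --host <name>).
    "${bin}" --stack "${stack}" "$@"
}

# When saved as an executable inventory script, end the file with:
#   tripleo_inventory "$@"
```

On the undercloud, with the stackrc credentials sourced, this could then be used along the lines of TRIPLEO_STACK=mystack ansible -i ./inventory.sh all -m ping, where inventory.sh and mystack are placeholder names.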

Using network namespaces with veth to NAT guests with overlapping IPs

Sets of virtual machines are connected to virtual bridges (e.g. virbr0 and virbr1) and, as they are isolated, can use the same subnet range and set of IPs. However, NAT becomes a problem because the host won’t know which VM to return the traffic to.

To solve this problem, we can use network namespaces and some veth (virtual Ethernet) devices to connect up each private network we want to NAT.

Each veth device acts like a patch cable and is actually made up of two network devices, one for each end (e.g. peer1-a and peer1-b). By adding those interfaces between bridges and/or namespaces, you create a link between them.

The network namespace is only used for NAT and is where the veth IPs are set; the other end acts like a patch cable without an IP. The VMs are only connected to their respective bridge (e.g. virbr0) and can talk to the network namespace over the veth patch.

We will use two pairs for each network namespace.

  • One (e.g. represented by veth1 below) which connects the virtual machine’s private network (e.g. virbr0 on into the network namespace (e.g. net-ns1) where it sets an IP and will be the private network router (e.g.
  • Another (e.g. represented by veth2 below) which connects the upstream provider network (e.g. br0 on into the same network namespace where it sets an IP (e.g.
  • Repeat the process for other namespaces (e.g. represented by veth3 and veth4 below).

Figure: Configuration for multiple namespace NAT

By providing each private network with its own unique upstream routable IP and applying NAT rules inside each namespace separately, we can avoid any conflict.
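
As a rough sketch of what that involves for one namespace, the function below prints the ip and iptables commands for the two veth pairs and the NAT rule. It is a dry run by default. The function name, the argument order and any addresses passed to it are assumptions for illustration, and a default route inside the namespace towards the upstream gateway would still need to be added.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed.
# Set RUN="" and run as root to apply them for real.
RUN="${RUN-echo}"

# setup_nat_ns NAMESPACE PRIV_BRIDGE PRIV_IP/CIDR UP_BRIDGE UP_IP/CIDR PRIV_VETH UP_VETH
setup_nat_ns() {
    local ns="$1" priv_br="$2" priv_ip="$3" up_br="$4" up_ip="$5" priv="$6" up="$7"

    $RUN ip netns add "$ns"

    # First pair: private bridge <-> namespace, which routes for the guests.
    $RUN ip link add "${priv}-a" type veth peer name "${priv}-b"
    $RUN ip link set "${priv}-a" master "$priv_br"
    $RUN ip link set "${priv}-a" up
    $RUN ip link set "${priv}-b" netns "$ns"
    $RUN ip netns exec "$ns" ip addr add "$priv_ip" dev "${priv}-b"
    $RUN ip netns exec "$ns" ip link set "${priv}-b" up

    # Second pair: upstream bridge <-> namespace, holding the unique routable IP.
    $RUN ip link add "${up}-a" type veth peer name "${up}-b"
    $RUN ip link set "${up}-a" master "$up_br"
    $RUN ip link set "${up}-a" up
    $RUN ip link set "${up}-b" netns "$ns"
    $RUN ip netns exec "$ns" ip addr add "$up_ip" dev "${up}-b"
    $RUN ip netns exec "$ns" ip link set "${up}-b" up

    # Forward and masquerade inside this namespace only.
    $RUN ip netns exec "$ns" sysctl -w net.ipv4.ip_forward=1
    $RUN ip netns exec "$ns" iptables -t nat -A POSTROUTING -o "${up}-b" -j MASQUERADE
}
```

Calling it once per private network, for example with net-ns1, virbr0, br0, veth1 and veth2 and then again with net-ns2, virbr1, br0, veth3 and veth4, keeps each set of NAT rules isolated even when the private subnets are identical.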

Using Ansible to define and manage KVM guests and networks with YAML inventories

I wanted a way to quickly spin different VMs up and down on my KVM dev box, to help with testing things like OpenStack, Swift, Ceph and Kubernetes. Some of my requirements were as follows:

  • Define everything in a markup language, like YAML
  • Manage VMs (define, stop, start, destroy and undefine) and apply settings as a group or individually
  • Support different settings for each VM, like disks, memory, CPU, etc.
  • Support multiple drives and types, including Virtio, SCSI, SATA and NVMe
  • Create users and set root passwords
  • Manage networks (create, delete) and which VMs go on them
  • Mix and match Linux distros and releases
  • Use existing cloud images from distros
  • Manage access to the VMs including DNS/hosts resolution and SSH keys
  • Have a good set of defaults so it would work out of the box
  • Potentially support other architectures (like ppc64le or arm)

So I hacked together an Ansible role and example playbook. Setting guest states to running, shutdown, destroyed or undefined (to delete and clean up) is supported. It will also manage multiple libvirt networks, and guests can have different specs as well as multiple disks of different types (SCSI, SATA, Virtio, NVMe). With Ansible’s --limit option, any individual guest, a hostgroup of guests, or even a mix can be managed.

Managing KVM guests with Ansible

Although Terraform with libvirt support is potentially a good solution, by using Ansible I can use that same inventory to further manage the guests, and I’ve also been able to configure the KVM host itself. All that’s really needed is a Linux host capable of running KVM, some guest images and a basic inventory. The Ansible role will do the rest (on supported distros).

Red Hat crippling CloudForms product and migrating users to IBM

CloudForms is Red Hat’s supported version of upstream ManageIQ, an infrastructure management platform. It lets you see, manage and deploy to various platforms like OpenStack, VMware, RHEV, OpenShift and public clouds like AWS and Azure, with a single pane of glass view across them all. It has its own orchestration engine but also integrates with Ansible for automated deployments.

As best I can tell, their CloudForms updated Statement of Direction article (behind paywall, sorry) shows that Red Hat is killing off support for non-Red Hat platforms like VMware, AWS, Azure, etc. The justification is to focus on open platforms, which I think means CloudForms will ultimately disappear entirely with Red Hat focusing on OpenShift instead.

We made a strategic decision to focus our management strategy on the future — open, cloud-native environments that promote portability across on-premise, private and public clouds.

CloudForms updated Statement of Direction

However, to me this is still a big blow to users of the platform, most of whom I’m sure will have at least some VMware to manage. Indeed, when implementing CloudForms at work and talking to Red Hat, they said that their most mature integration in CloudForms is with VMware.

According to the Red Hat article, CloudForms with full platform support is being embedded into IBM Cloud Pak for Multicloud Management and users are encouraged to “migrate your Red Hat CloudForms subscriptions to IBM Cloud Pak for Multicloud Management licenses.” Red Hat’s CloudForms Statement of Direction FAQ article lays out the migration path, which does confirm Red Hat will continue to support existing clients for the remainder of their subscription.

So in short, CloudForms from Red Hat is being crippled and will only support Red Hat products, which really means that users are being forced to buy IBM instead. Of course Red Hat is entitled to change their own products, but this move does seem curious when execs on both sides said they would remain independent. Maybe it’s better than killing CloudForms outright?

We can publicly say that all our products will survive in their current form and continue to grow. We will continue to support all our products; we’re separate entities and we’re going to have separate contracts, and there is no intention to de-emphasise any of our products and we’ll continue to invest heavily in it.

Jim Whitehurst as Red Hat CEO

KVM guests with emulated SSD and NVMe drives

Sometimes when you’re using KVM guests to test something, perhaps like a Ceph or OpenStack Swift cluster, it can be useful to have SSD and NVMe drives. I’m not talking about passing physical drives through, but rather emulating them.

NVMe drives

QEMU supports emulating NVMe drives as arguments on the command line, but it’s not yet exposed to tools like virt-manager. This means you can’t just add a new drive of type nvme into your virtual machine XML definition; however, you can add those qemu arguments to your XML. This also means that the NVMe drives will not show up as drives in tools like virt-manager, even after you’ve added them with qemu. Still, it’s fun to play with!
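
To give an idea of what those arguments look like, here is a small sketch that prints them for a given disk image: a -drive with if=none, plus an nvme -device that points at it. The function name, the qcow2 format and the reuse of the drive ID as the serial number are choices made for this example only.

```shell
#!/usr/bin/env bash
# Print the QEMU arguments for one emulated NVMe drive, one argument per line.
# nvme_qemu_args IMAGE_PATH DRIVE_ID
nvme_qemu_args() {
    local image="$1" id="$2"
    printf '%s\n' \
        -drive "file=${image},format=qcow2,if=none,id=${id}" \
        -device "nvme,drive=${id},serial=${id}"
}

# Example: nvme_qemu_args /var/lib/libvirt/images/nvme1.qcow2 NVME1
```

Each printed line would become one qemu:arg value inside a qemu:commandline element in the guest XML, which also needs the QEMU XML namespace declared on the domain element.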

Automatically updating containers with Docker

Running something in a container using Docker or Podman is cool, but maybe you want an automated way to always run the latest container? Using the :latest tag alone does not do this; it just pulls the latest container at the time. You could have a cronjob that always pulls the latest containers and restarts the container, but then if there’s no update you have an outage for no reason.

It’s not too hard to write a script to pull the latest container and restart the service only if required, then tie that together with a systemd timer.

To restart a container you need to know how it was started. If you have only one container then you could just hard-code it, however it gets more tricky to manage if you have a number of containers. This is where something like runlike can help!
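
The core of such a script might look like the sketch below, which compares the image ID the container is running with the ID the tag now points at, and only recreates the container when they differ. The function names are made up for this example, and replaying the output of runlike with eval is an assumption about how it could be wired together rather than a tested recipe.

```shell
#!/usr/bin/env bash
# Succeeds (returns 0) when the container runs a different image
# from the one the tag currently points at locally.
container_outdated() {
    local container="$1" image="$2" running latest
    running="$(docker inspect --format '{{.Image}}' "$container")" || return 2
    latest="$(docker image inspect --format '{{.Id}}' "$image")" || return 2
    [ "$running" != "$latest" ]
}

# Pull the image, then recreate the container only if it is outdated.
update_container() {
    local container="$1" image="$2" run_cmd
    docker pull --quiet "$image" || return 1
    if container_outdated "$container" "$image"; then
        # Capture how the container was started before removing it.
        run_cmd="$(runlike "$container")" || return 1
        docker rm -f "$container"
        eval "$run_cmd"
    fi
}
```

A systemd timer could then call update_container for each container name and image tag, so that nothing is restarted when the pull brings no change.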

OwnTracks recorder in a container on Fedora with Let’s Encrypt and nginx

OwnTracks Recorder is a web application which maps locations over time. Generally, it connects to an MQTT server and subscribes to owntracks/+ topics for any location updates, but it also has a built in function to receive updates over HTTP.

I have been using OwnTracks with MQTT for a while, but found it to be too unreliable on Android (it disconnects in the background and doesn’t reconnect nicely). Using HTTP is supposed to be more reliable, so this is how I set it up. The idea is to use OwnTracks on Android to post directly to the OwnTracks Recorder over HTTP instead of MQTT and have Recorder post the MQTT messages on our behalf using Lua scripts (for Home Assistant).
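
To illustrate what the app does in HTTP mode, the sketch below builds a minimal location payload and posts it with curl. It assumes the recorder's /pub endpoint with u (user) and d (device) query parameters; the function names and any URL, user or device passed in are placeholders.

```shell
#!/usr/bin/env bash
# Build a minimal OwnTracks location payload (JSON).
# location_payload LAT LON [UNIX_TIMESTAMP]
location_payload() {
    local lat="$1" lon="$2" tst="${3:-$(date +%s)}"
    printf '{"_type":"location","lat":%s,"lon":%s,"tst":%s}\n' "$lat" "$lon" "$tst"
}

# POST a location to the recorder; user and device identify the track.
# post_location BASE_URL USER DEVICE LAT LON
post_location() {
    local url="$1" user="$2" device="$3" lat="$4" lon="$5"
    curl --silent --fail \
        --header 'Content-Type: application/json' \
        --data "$(location_payload "$lat" "$lon")" \
        "${url}/pub?u=${user}&d=${device}"
}
```

Posting a test location this way is a quick check that the recorder (and any nginx proxy in front of it) accepts updates before pointing the phone at it.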

Friends is an important feature (to let members of the family see where each other is located) and fortunately it is supported in HTTP mode (but it requires a little bit more configuration).

Enabling Docker in Fedora 31 by reverting to cgroups v1

Fedora has switched to cgroups v2 by default now, but Docker doesn’t yet support it and so fails to start. If you want to use Docker then you need to revert cgroups to v1 by adding the systemd.unified_cgroup_hierarchy=0 kernel argument.

Add systemd.unified_cgroup_hierarchy=0 to the default GRUB config with sed.

sudo sed -i '/^GRUB_CMDLINE_LINUX/ s/"$/ systemd.unified_cgroup_hierarchy=0"/' /etc/default/grub
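
Note that the sed command above appends the argument every time it is run. If you want something safe to re-run, a guarded variant could look like this sketch, where the function name and the optional file parameter are just for illustration (run it with root privileges against the real file).

```shell
#!/usr/bin/env bash
# Append a kernel argument to GRUB_CMDLINE_LINUX only if it is not already there.
# add_kernel_arg ARGUMENT [GRUB_DEFAULTS_FILE]
add_kernel_arg() {
    local arg="$1" file="${2:-/etc/default/grub}"
    # Already present: nothing to do.
    grep -q "^GRUB_CMDLINE_LINUX=.*${arg}" "$file" && return 0
    # Same substitution as before: insert the argument before the closing quote.
    sed -i "/^GRUB_CMDLINE_LINUX=/ s/\"\$/ ${arg}\"/" "$file"
}

# Example: add_kernel_arg systemd.unified_cgroup_hierarchy=0
```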

Now rebuild your GRUB config.

If you’re using BIOS boot then it’s this.

sudo grub2-mkconfig -o /boot/grub2/grub.cfg

If you’re running EFI, then it’s this.

sudo grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

Now reboot and make sure Docker can start!
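
If Docker still fails, it is worth confirming which cgroup hierarchy is actually mounted. A minimal sketch, assuming the usual /sys/fs/cgroup mount point: a cgroup2fs filesystem there means v2 is still active, while tmpfs indicates the v1 layout. The function name is made up for this example.

```shell
#!/usr/bin/env bash
# Report the cgroup version based on the filesystem type at the mount point.
# cgroup_version [MOUNT_POINT]
cgroup_version() {
    local mount="${1:-/sys/fs/cgroup}" fstype
    fstype="$(stat -fc %T "$mount" 2>/dev/null)" || fstype=""
    case "$fstype" in
        cgroup2fs) echo v2 ;;       # unified hierarchy
        tmpfs)     echo v1 ;;       # legacy per-controller mounts
        *)         echo unknown ;;
    esac
}

cgroup_version
```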

Use swap on NVMe to run more dev KVM guests, for when you run out of RAM

I often spin up a bunch of VMs for different reasons when doing dev work and unfortunately, as awesome as my little mini-ITX Ryzen 9 dev box is, it only has 32GB RAM. Kernel Samepage Merging (KSM) definitely helps; however, when I have half a dozen or so VMs running and chewing up RAM, the kernel’s Out Of Memory (OOM) killer will start executing them, like this.

[171242.719512] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/machine.slice/machine-qemu\x2d435\x2dtest\x2dvm\x2dcentos\x2d7\x2d00.scope,task=qemu-system-x86,pid=2785515,uid=107
[171242.719536] Out of memory: Killed process 2785515 (qemu-system-x86) total-vm:22450012kB, anon-rss:5177368kB, file-rss:0kB, shmem-rss:0kB
[171242.887700] oom_reaper: reaped process 2785515 (qemu-system-x86), now anon-rss:0kB, file-rss:68kB, shmem-rss:0kB

If I had more slots available (which I don’t) I could add more RAM, but that’s actually pretty expensive, plus I really like the little form factor. So, given it’s just dev work, a relatively cheap alternative is to buy an NVMe drive and add a swap file to it (or dedicate the whole drive). This is what I’ve done on my little dev box (actually I bought it with an NVMe drive so adding the swapfile came for free).
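
For reference, creating a swap file generally comes down to the four commands in the sketch below. It is a dry run by default and only prints them; the function name, path and size are placeholders, and some filesystems (btrfs, for example) need extra steps that are not shown.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed.
# Set RUN="" and run as root to apply them for real.
RUN="${RUN-echo}"

# make_swapfile PATH SIZE_IN_MIB
make_swapfile() {
    local path="$1" size_mb="$2"
    $RUN dd if=/dev/zero of="$path" bs=1M count="$size_mb"   # allocate the file
    $RUN chmod 600 "$path"                                   # swap must not be world-readable
    $RUN mkswap "$path"                                      # write the swap signature
    $RUN swapon "$path"                                      # enable it immediately
}

# Example: make_swapfile /mnt/nvme/swapfile 16384
```

To keep the swap file across reboots, it would also need an entry in /etc/fstab.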

Using pipefail with shell module in Ansible

If you’re using the shell module with Ansible and piping the output to another command, it might be a good idea to set pipefail. This way, if the first command fails, the whole task will fail.

For example, let’s say we’re running this silly task to look for the /tmp directory and then delete the characters “t”, “m” and “p” from the result.

ansible all -i "localhost," -m shell -a \
'ls -ld /tmp | tr -d tmp'

This will return something like this, with a successful return code.

localhost | CHANGED | rc=0 >>
drwxrwxrw. 26 roo roo 640 Se 28 19:08 /
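
The effect of pipefail is easy to see in plain bash, outside of Ansible. The sketch below runs the same kind of pipeline against a directory that is assumed not to exist (/no/such/dir), once without and once with pipefail, and prints both return codes.

```shell
#!/usr/bin/env bash
# The first command fails (missing directory), the second (tr) succeeds.
cmd='ls -ld /no/such/dir 2>/dev/null | tr -d tmp'

# Without pipefail the pipeline reports the exit status of the last command.
without=0
bash -c "$cmd" || without=$?

# With pipefail a failure anywhere in the pipeline fails the whole pipeline.
with=0
bash -c "set -o pipefail; $cmd" || with=$?

echo "without pipefail: rc=${without}"
echo "with pipefail:    rc=${with}"
```

Keep in mind that pipefail is not available in every /bin/sh, so with Ansible’s shell module you may also need to set its executable parameter to /bin/bash.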
