Using Ansible and dynamic inventory to manage OpenStack TripleO nodes

TripleO-based OpenStack deployments use an all-in-one OpenStack node (the undercloud) to automate the build and management of the actual cloud (the overcloud) using native services such as Heat and Ironic. Roles define services and configuration and are applied to specific nodes; examples include Service, Compute and CephStorage.

Although the install is automated, sometimes you need to run ad hoc tasks outside of the official update process. For example, you might want to make sure that all hosts are contactable, check that they have a valid subscription (for Red Hat OpenStack Platform), restart containers, or apply custom changes or patches before an update. Also, when nodes are being rebooted during the update process, an Ansible script is a useful way to confirm that they have all come back, that services are running and that all containers are healthy before re-enabling them.

Inventory script

To make this easy, we can use the TripleO Ansible inventory script, which queries the undercloud to get a dynamic inventory of the overcloud nodes. However, when you use the script as an inventory source with the ansible command, you cannot pass arguments to it. If you’re managing a single cluster and using the standard stack name of overcloud, this is not a problem; you can just call the script directly.

As I manage multiple clouds, each with a different Heat stack name, I create a small executable wrapper script that passes the stack name to the inventory script, and then point Ansible at the wrapper instead. If you use the undercloud host to manage multiple stacks, create one wrapper per stack and change the stack name in each.

cat > inventory-overcloud.sh << 'EOF'
#!/usr/bin/env bash
source ~/stackrc
exec /usr/bin/tripleo-ansible-inventory --stack stack-name --list
EOF

Make it executable and run it. It should return JSON with your overcloud node details.

chmod u+x inventory-overcloud.sh
./inventory-overcloud.sh
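
If you manage several stacks, the wrappers can also be generated in a loop rather than by hand. This is only a minimal sketch: the stack names cloud-a and cloud-b and the inventory-<stack>.sh file naming are placeholders made up for illustration, so substitute your own Heat stack names.

```shell
#!/usr/bin/env bash
# Sketch: generate one inventory wrapper per Heat stack.
# The stack names used below are hypothetical placeholders.
set -euo pipefail

make_wrapper() {
  local stack="$1"
  local out="inventory-${stack}.sh"
  # Unquoted EOF so that ${stack} is expanded; ~ stays literal in a heredoc.
  cat > "${out}" << EOF
#!/usr/bin/env bash
source ~/stackrc
exec /usr/bin/tripleo-ansible-inventory --stack ${stack} --list
EOF
  chmod u+x "${out}"
}

for stack in cloud-a cloud-b; do
  make_wrapper "${stack}"
done
```

Each generated file has the same three lines as the hand-written wrapper, with only the stack name changed.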

Run simple tasks

The purpose of using the dynamic inventory is to run some Ansible! We can now use it to do simple things easily, like ping nodes to make sure they are online.

ansible \
--inventory inventory-overcloud.sh \
all \
--module-name ping

One of the great things about Ansible is the ability to limit which hosts you’re running against. For example, to make sure all nodes with the Compute role are back, simply replace all with Compute.

ansible \
--inventory inventory-overcloud.sh \
Compute \
--module-name ping

You can also specify nodes individually.

ansible \
--inventory inventory-overcloud.sh \
service-0,telemetry-2,compute-0,compute-1 \
--module-name ping
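
When nodes are rebooting during an update, the ping check can be wrapped in a retry loop so that it keeps trying until every node answers. The retry function below is a hypothetical helper, not something shipped with TripleO or Ansible, and it assumes that the ansible command exits non-zero while any targeted host is unreachable or failing.

```shell
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "${n}" -ge "${attempts}" ]; then
      echo "gave up after ${n} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "${delay}"
  done
}

# Example (assumes ansible returns non-zero while a host is unreachable):
#   retry 30 10 ansible --inventory inventory-overcloud.sh Compute --module-name ping
```

The attempt count and delay in the example are arbitrary; pick values that match how long your nodes take to reboot.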

You can use the shell module to do simple ad hoc things, like restart containers or check their health.

ansible \
--inventory inventory-overcloud.sh \
all \
--module-name shell \
--become \
--args 'docker ps | egrep "CONTAINER|unhealthy"'

And the same command using short arguments.

ansible \
-i inventory-overcloud.sh \
all \
-m shell \
-ba 'docker ps | egrep "CONTAINER|unhealthy"'

Create some Ansible plays

As you can see, simple tasks are easy. For more complicated tasks, you might want to write some plays.

Pre-fetch downloads before update

Your needs will probably vary, but here is a simple example that pre-downloads updates on my RHEL hosts to save time (the updates themselves are installed separately via the overcloud update process). Note that the yum module's download_only option was only added in Ansible 2.7, and as RHEL uses Ansible 2.6, I run yum via the command module rather than the yum module.

cat > fetch-updates.yaml << 'EOF'
---
- hosts: all
  tasks:
    - name: Fetch package updates
      command: yum update --downloadonly
      register: result_fetch_updates
      retries: 30
      delay: 10
      until: result_fetch_updates is succeeded
      changed_when: '"Total size:" not in result_fetch_updates.stdout'
      args:
        warn: no
EOF

Now we can run this playbook against the next set of nodes we’re going to update, Compute and Telemetry in this example.

ansible-playbook \
--inventory inventory-overcloud.sh \
--limit Compute,Telemetry \
fetch-updates.yaml

And again, you could specify nodes individually.

ansible-playbook \
--inventory inventory-overcloud.sh \
--limit telemetry-0,service-0,compute-2,compute-3 \
fetch-updates.yaml
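
Before running a play against real nodes, it can be worth checking what it would touch. ansible-playbook has --syntax-check and --list-hosts options for this, and neither runs any tasks. The wrapper function below is just a sketch, and the name preflight is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: check a playbook and preview its target hosts without running tasks.
# Usage: preflight <inventory> <limit> <playbook>
preflight() {
  local inventory="$1"
  local limit="$2"
  local playbook="$3"
  # Parse the playbook only; stop here if it does not parse.
  ansible-playbook --inventory "${inventory}" --syntax-check "${playbook}" || return 1
  # Print the hosts matched by the limit.
  ansible-playbook --inventory "${inventory}" --limit "${limit}" --list-hosts "${playbook}"
}

# Example:
#   preflight inventory-overcloud.sh Compute,Telemetry fetch-updates.yaml
```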

There you go. Using dynamic inventory can be really useful for running ad hoc commands against your OpenStack nodes.
