While you can run containers as root on the host, or run rootless containers as your regular user (either as uid 0 or any other), sometimes it’s nice to create specific users to run one or more containers. This provides neat separation and can also improve your security posture.
We also want those containers to act as regular system services, managed with systemd so that they auto-restart and are enabled on boot.
This assumes you’ve just installed Fedora (or RHEL/CentOS 8+) server and have a local user with sudo privileges. First, let’s install some SELinux tools.
sudo dnf install -y /usr/sbin/semanage
Setting up the system user
Let’s create our system user, placing their home dir under /var/lib. For the purposes of this example I’m using a service account of busybox, but this can be anything unique on the box. Note, if you prefer to have a real shell, then swap /bin/false with /bin/bash or another shell.
export SERVICE="busybox"
sudo useradd -r -m -d "/var/lib/${SERVICE}" -s /bin/false "${SERVICE}"
In order for our user to run containers automatically on boot, we need to enable systemd linger support. This will ensure that a user manager is run for the user at boot and kept around after logouts.
sudo loginctl enable-linger "${SERVICE}"
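To confirm that linger took effect, one option is to look for the per-user flag file that systemd-logind keeps. This is a minimal sketch: the linger_enabled helper name is mine (not part of systemd), and it assumes the default state directory of /var/lib/systemd/linger.

```shell
# Hypothetical helper: succeeds if a linger flag file exists for the user.
# The optional second argument overrides the (assumed) default directory.
linger_enabled() {
    local user="$1"
    local dir="${2:-/var/lib/systemd/linger}"
    [ -e "${dir}/${user}" ]
}

# Usage: falls back to "busybox" if SERVICE is not set in this shell.
if linger_enabled "${SERVICE:-busybox}"; then
    echo "linger is enabled"
else
    echo "linger is not enabled"
fi
```

Running loginctl show-user "${SERVICE}" --property=Linger should report the same thing on a host where logind knows about the user.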
Configure homedir for containers
Next, we create a data directory to be passed in as a volume to the container (some containers may require more, but this is a good start).
sudo -H -u "${SERVICE}" bash -c "mkdir ~/data"
We need to set some SELinux context on the home directory, otherwise rootless containers won’t run. This will change the service account’s directory under /var/lib from var_lib_t to user_home_dir_t. It also sets the data directory to be of type container_file_t so that containers will be able to access it (technically this isn’t necessary if you use the :z or :Z flag for the volume when running the container, but I’m keeping it in for broader context).
sudo semanage fcontext -a -t user_home_dir_t \
"/var/lib/${SERVICE}(/.+)?"
sudo semanage fcontext -a -t container_file_t \
"/var/lib/${SERVICE}/data(/.+)?"
sudo restorecon -Frv /var/lib/"${SERVICE}"
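To double-check that restorecon applied the intended types, one approach is to read the context back and pull out its type field. A minimal sketch: context_type is a hypothetical helper, and the commented usage assumes GNU stat, whose %C format prints the SELinux context.

```shell
# Hypothetical helper: print the type (third field) of an SELinux context
# string such as "system_u:object_r:container_file_t:s0".
context_type() {
    printf '%s\n' "$1" | cut -d ':' -f 3
}

# Usage on the host (commented out because it needs SELinux and the paths above):
# ctx="$(sudo stat -c '%C' "/var/lib/${SERVICE}/data")"
# [ "$(context_type "${ctx}")" = "container_file_t" ] && echo "data dir labelled OK"
```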
Enable rootless containers
By default, system users do not get any subuid or subgid ranges, which means they will not be able to run rootless containers. Setting this up is done manually with a little bit of bash magic.
NEW_SUBUID=$(($(tail -1 /etc/subuid |awk -F ":" '{print $2}')+65536))
NEW_SUBGID=$(($(tail -1 /etc/subgid |awk -F ":" '{print $2}')+65536))
sudo usermod \
--add-subuids ${NEW_SUBUID}-$((${NEW_SUBUID}+65535)) \
--add-subgids ${NEW_SUBGID}-$((${NEW_SUBGID}+65535)) \
"${SERVICE}"
Great! Now we have our system user ready to go.
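The one-liners above assume the last line of /etc/subuid (and /etc/subgid) holds the highest range and that it is 65536 IDs long. If that might not hold on your box, a slightly more defensive sketch is to scan every entry for the highest end of range. The next_subid_start name and the fallback base of 100000 are my assumptions, not anything shipped with shadow-utils.

```shell
# Hypothetical helper: print the first ID after the highest allocated range
# in a subuid/subgid-style file (one "name:start:count" entry per line).
# Prints the base (assumed default 100000) if the file is missing or empty.
next_subid_start() {
    local file="$1"
    local base="${2:-100000}"
    if [ ! -r "${file}" ]; then
        echo "${base}"
        return 0
    fi
    awk -F ':' -v base="${base}" '
        NF >= 3 { end = $2 + $3; if (end > max) max = end }
        END { printf "%d\n", (max > base ? max : base) }
    ' "${file}"
}

# Usage (commented out because it modifies the account):
# start="$(next_subid_start /etc/subuid)"
# sudo usermod --add-subuids "${start}-$((start + 65535))" "${SERVICE}"
```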
Switch to system user
Let’s switch to our system user (note this is slightly more complicated as we have /bin/false as the shell, so this puts us in the right homedir).
sudo -H -u "${SERVICE}" bash -c 'cd; bash'
Running a rootless container
We have a dedicated user which can run rootless containers, so when we start a container, we can tell it to run as root with the --user 0:0 option (or -u 0:0 for short). This way the process in the container will actually be run as our system user on the host.
OK, now let’s run a container! Note we are running this in --detach (-d for short) mode so that it runs in the background. We’re also enabling interactive mode with --interactive (-i for short) and allocating a pseudo terminal with --tty (-t for short), which is required for busybox to work. You may recall from earlier posts that the :z option after the volume sets an SELinux context on the data directory; lowercase :z applies a shared label, whereas uppercase :Z makes it private to this container via MCS labels.
podman run -u 0:0 -dit --name busybox -v "${HOME}/data:/data:z" busybox
Do a simple test to make sure we can connect to the running container.
podman exec busybox sh -c 'echo -n "In this container, I am ";id -un'
You should see that you are root:
In this container, I am root
Managing and enabling the container with systemd
OK, so we can create a dedicated user on the host, and we can run a container, great! But how do we get that non-root user to automatically start their container on boot? Enter systemd.
In order to interact with systemd, we must ensure XDG_RUNTIME_DIR is set. This is because we switched user; if we could ssh in instead it would be set up for us, but our system user has no shell so that’s not possible.
export XDG_RUNTIME_DIR=/run/user/"$(id -u)"
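If you script this step, a slightly more defensive variant only sets the variable when it is missing and then checks that the directory is really there. This is a sketch under the assumption that systemd creates /run/user/UID once the user has a session or lingering is enabled.

```shell
# Set XDG_RUNTIME_DIR only if the login path has not already done so.
: "${XDG_RUNTIME_DIR:=/run/user/$(id -u)}"
export XDG_RUNTIME_DIR

# Without the directory, systemctl --user has no user manager to talk to.
if [ -d "${XDG_RUNTIME_DIR}" ]; then
    echo "runtime dir ready: ${XDG_RUNTIME_DIR}"
else
    echo "runtime dir missing: ${XDG_RUNTIME_DIR} (is linger enabled?)" >&2
fi
```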
You should be able to connect to systemd now, let’s test it.
systemctl --user
Remember when we created the account we enabled linger support? That’s critical when running containers without an actual login.
Let’s make the user systemd directory.
mkdir -p ~/.config/systemd/user/
Use podman to generate a systemd service file.
podman generate systemd --restart-policy always --name busybox > \
~/.config/systemd/user/container-busybox.service
sed -i 's/^KillMode=.*/KillMode=control-group/' \
~/.config/systemd/user/container-busybox.service
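The sed line rewrites whatever KillMode the generated unit contains. To see the substitution in isolation, here is a small sketch that applies it to a throwaway file. The set_killmode helper is hypothetical and the sample unit contents are invented for illustration, not real podman output; it also assumes GNU sed for the -i flag.

```shell
# Hypothetical helper wrapping the substitution so it can be reused.
set_killmode() {
    local unit_file="$1"
    local mode="${2:-control-group}"
    sed -i "s/^KillMode=.*/KillMode=${mode}/" "${unit_file}"
}

# Demonstration on a throwaway file (contents are illustrative only).
demo_unit="$(mktemp)"
printf '[Service]\nRestart=always\nKillMode=none\n' > "${demo_unit}"
set_killmode "${demo_unit}"
grep '^KillMode=' "${demo_unit}"
rm -f "${demo_unit}"
```

Only lines that start with KillMode= are touched, so the rest of the unit is left exactly as podman wrote it.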
Next, we reload systemd so that it can see the new service.
systemctl --user daemon-reload
Now we are able to interact with the container using systemd. Let’s enable it on boot and check the status!
systemctl --user enable --now container-busybox
systemctl --user status container-busybox
On boot, this service should auto-start and can be managed via systemd.
So the final step is to reboot the host, switch back to the user and ensure the container is running.
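After the reboot (and after switching user and exporting XDG_RUNTIME_DIR again), a quick check could look like the sketch below. The unit_is_active helper is hypothetical; it relies only on systemctl is-active, which prints the unit state.

```shell
# Hypothetical helper: succeeds only if the given user unit reports "active".
unit_is_active() {
    local unit="$1"
    [ "$(systemctl --user is-active "${unit}" 2>/dev/null)" = "active" ]
}

# Usage:
if unit_is_active container-busybox; then
    echo "container-busybox is running"
else
    echo "container-busybox is not running" >&2
fi
```

Running podman ps as the system user should also list the busybox container.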
16 thoughts on “Rootless podman containers under system accounts, managed and enabled at boot with systemd”
This is neat, I like it.
Question. How about CI/CD flow here. Assuming I have a gitlab-ci and can execute SSH via pubkey with that service user. If I execute podman rm, then pull fresh and run fresh the image, will the systemd still work after reboot (and autostart) ?
It should, so long as the name of your container is the same as that’s what systemd uses (check the service file you create).
-c
Very helpful article, thanks!
For many containers, having the systemd services scattered over just as many users and home dirs is cumbersome to manage. Do you know if it’s possible to leverage the systemd User= and Group= directives to run rootless containers from the systemd --system instance? Or would the preferred approach be to just run all containers under the same (non-root) user?
Hi Kilian, I’m not sure about that but it is probably possible to run as a different user. If I get some time I’ll test it out and post an update.
Thanks a lot! Great tutorial, worked for me like a charm!
Great!
Thanks! This was very helpful in getting a rootless podman deployment of keycloak up and running.
Great, glad it helped!
This is brilliant and made my day. I express my deepest gratitude to you!
Great! Glad to hear it.
This post is brilliant. I have found pieces of this info all over, but here it all is in one place. Thank you!
Hi Aaron, glad it helped! Thanks for the comment.
Hi! I came to this great overview late. Do you have the time/desire to contrast your approach with an alternative approach which appears simpler: making the systemd unit dependent on a specific user service? I have outlined this approach here https://unix.stackexchange.com/a/740642/566350
Hi Mark, I haven’t tried that but if I get some time I can give it a shot. Thanks!
Hi, I followed your recommendations up to the systemd units. To persist the container across reboots I put a user systemd unit in the user’s folder, ~/.config/systemd/user/podman-restart.service
This is a slightly modified version of the system’s podman-restart; I only added the “Environment=PODMAN_SYSTEMD_UNIT=%n” line. I did not need the XDG-related variables for it to work. The original system unit starts rootful containers at boot (/usr/bin/podman $LOGGING start --all --filter restart-policy=always), but cannot start other users’ containers.
Still, systemd units for standalone containers (podman generate systemd) are deprecated, and the quadlet architecture should be used. This should be noted in this post.
FYI:
Add the “-F” option to the “useradd” command and you can skip the whole manual subuid/subgid bit.
Also, since the Podman systemd generation is deprecated now, maybe update the article to state that since Podman 3.3 you can use “systemctl --user enable podman-restart” (once) and all containers with the proper restart option will restart automagically.