
Managing containers with Podman

Same but Different

Unlike its counterpart Docker, the Podman container management tool does not use a daemon in the background and can operate in non-privileged mode. By Thorsten Scherf

When talk turns to containers, most people almost automatically think of the well-known Docker engine. For some time now, however, Podman [1] has been a serious alternative. The project was developed together with CRI-O, an implementation of the Kubernetes Container Runtime Interface (CRI). Podman is not limited to use within a Kubernetes environment, however; you also can use it to run individual containers, and you can even adopt the concept of pods, independent of Kubernetes.

Look Ma, No Daemon

Although the Podman developers have made sure that the Podman command-line tool is almost identical to Docker's, the two container engines are fundamentally different in terms of architecture. Everything in the Docker world is based on the client-server principle, whereas Podman relies on the fork-exec model. When you start a new container with the docker command, the instructions are sent to a daemon running in the background, which in turn forwards the request to containerd, another daemon in the Docker framework, to start the container. Podman, on the other hand, does without daemons entirely and creates all containers as child processes of the Podman process.

UID-Based Activity Assignments

Thanks to the Podman architecture, the Linux kernel's audit subsystem can uniquely associate activities taking place inside a container with the user who started the container. As you know, every user who logs on to a system is assigned a unique user identifier (UID) that can be read from the /proc/self/loginuid file:

cat /proc/self/loginuid
1000

The login UID does not change, even if processes are started under a different ID. For example, the following command reads the file as root, but the user's login UID stays the same:

sudo cat /proc/self/loginuid
1000

All processes the user starts now automatically inherit this login UID, including containers and the processes that run in these containers. The following example reads the /proc/self/loginuid file inside a Fedora container:

sudo podman run --rm fedora cat /proc/self/loginuid
1000

As you can see, the UID is still 1000. Now, you can leverage this fact to audit processes inside containers. In the auditd logfiles, the login UID is listed in the auid field:

sudo ausearch -k watch-passwd
time->Tue May 28 19:52:15 2019
type=CONFIG_CHANGE msg=audit(1559065935.923:2447): auid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op=add_rule key="watch-passwd" list=4 res=1

In this way, it is very easy to see which processes were executed by which user within a container. With Docker, on the other hand, a daemon is responsible for starting containers, so the login UID is always 4294967295, the "unset" value that every process started by the init system inherits. As a result, it is no longer possible to assign activities within a container to a particular user.
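The difference between a set and an unset login UID can be checked with a small shell helper. This is only a minimal sketch: the function name describe_loginuid is hypothetical and is not part of Podman or auditd. It relies solely on the fact that 4294967295 (the unsigned 32-bit form of -1) marks a login UID that was never set.

```shell
# Hypothetical helper: classify a login UID value as read from
# /proc/<pid>/loginuid. 4294967295 means no login UID was ever set.
describe_loginuid() {
    if [ "$1" = "4294967295" ]; then
        echo "unset"
    else
        echo "user $1"
    fi
}

# Classify the login UID of the current shell if the file is readable.
if [ -r /proc/self/loginuid ]; then
    describe_loginuid "$(cat /proc/self/loginuid)"
fi
```

Running the same check in a process started by a daemon should report "unset", whereas a process started from a login session should report the UID of the user who logged in.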

Controlling containers is not the only difference between Podman and Docker. The way container images and access to container registries are managed internally also differs considerably between the two engines. Docker again uses the central Docker daemon, whereas Podman uses libraries from the containers project and is therefore independent of a daemon. These differences are completely transparent for the end user, however.

Starting Containers Without Root Privileges

Another interesting feature of Podman is that non-privileged users can start containers. Processes running as root within the container are converted on the host to the UID of the user who started the container. Other IDs need to be mapped in advance to a specific ID range. The whole thing works because Podman uses the Linux kernel's user namespaces. Users who want to work with containers need an entry in the /etc/subuid and /etc/subgid files that maps the IDs used in the containers to those on the host. Be careful to avoid overwriting existing IDs on the host:

echo tscherf:100000:65536 >> /etc/subuid
echo tscherf:100000:65536 >> /etc/subgid
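Before adding a range like this, it can help to check that it does not collide with an entry that already exists. The following is a minimal sketch under that assumption; the function range_overlaps and the sample user alice are hypothetical, and the demonstration works on a temporary file rather than on /etc/subuid.

```shell
# Hypothetical helper: succeed (exit 0) if the range START..START+COUNT-1
# overlaps any "name:start:count" entry in the given file.
range_overlaps() {
    file=$1; start=$2; count=$3
    awk -F: -v s="$start" -v c="$count" '
        NF == 3 && s < $2 + $3 && $2 < s + c { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Demonstration on a scratch copy in the subuid format.
demo=$(mktemp)
echo "alice:100000:65536" > "$demo"
if range_overlaps "$demo" 100000 65536; then
    echo "range 100000:65536 is already taken"
fi
if ! range_overlaps "$demo" 165536 65536; then
    echo "range 165536:65536 is free"
fi
rm -f "$demo"
```

Pointing the function at /etc/subuid or /etc/subgid performs the same check against the real entries.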

With the range 100000:65536, the first mapped UID in a container (UID 1) corresponds to UID 100000 on the host, and the last one (UID 65536) corresponds to UID 165535. Mapping the group identifiers (GIDs) works in the same way. Identifier 0 (root) is always mapped by default to the UID of the user who started the container. The following Podman call confirms the correct mapping of the UIDs and GIDs:

podman run fedora cat /proc/self/uid_map /proc/self/gid_map
0   1000       1
1   100000   65536
0   1000       1
1   100000   65536
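The two map entries can be turned into a small calculation that translates a container UID into the matching host UID. This is a sketch for illustration only; the function container_to_host_uid is hypothetical and simply reproduces the arithmetic of the mapping shown above (user UID 1000, subordinate range 100000:65536).

```shell
# Hypothetical helper: translate a container UID to the host UID.
#   $1 container UID
#   $2 UID of the user who started the container
#   $3 first subordinate UID on the host
#   $4 number of subordinate UIDs
container_to_host_uid() {
    cuid=$1; user_uid=$2; sub_start=$3; sub_count=$4
    if [ "$cuid" -eq 0 ]; then
        echo "$user_uid"                  # root maps to the calling user
    elif [ "$cuid" -le "$sub_count" ]; then
        echo $((sub_start + cuid - 1))    # UIDs 1..count map into the range
    else
        echo "unmapped"
        return 1
    fi
}

container_to_host_uid 0 1000 100000 65536      # prints 1000
container_to_host_uid 1 1000 100000 65536      # prints 100000
container_to_host_uid 65536 1000 100000 65536  # prints 165535
```

Any container UID above 65536 has no counterpart on the host with this mapping, which is why the function reports it as unmapped.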

If you now start a new container, the process inside the container is executed as root (Listing 1). On the host system (Listing 2), however, the process runs under the ID of the user who started the container.

Listing 1: Process in the Container

$ podman run -d fedora sleep 1000
$ podman exec 291ee2dda43e ps -ef
UID   PID   PPID  C STIME TTY   TIME     CMD
root  1     0     0 19:05 ?     00:00:00 sleep 1000

Listing 2: Process on the Host

$ ps -ef|grep sleep
tscherf 30328 29757 0 20:44 ?     00:00:00 sleep 1000
tscherf 30396 3353  0 20:44 pts/8 00:00:00 grep --color sleep

Users can therefore run software in a container under the root user context without problems and without having to give the software root privileges on the host system at the same time. At this point, it is worth noting that rootless containers can cause some issues (e.g., for all actions that require root privileges on the host system).

Assume you want to map one of the container's network ports to a privileged port (<1024) on the host; this operation does not work if you have started the container with normal user privileges. In this case, you need to run the container with root privileges; otherwise, the process lacks the CAP_NET_BIND_SERVICE capability that binding to a privileged port on the host requires.
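Whether a given host port calls for root privileges can be checked before starting a container. The sketch below is an illustration, not part of Podman: both function names are hypothetical, and reading the threshold from the net.ipv4.ip_unprivileged_port_start sysctl is an assumption that only holds on kernels that provide it; otherwise the traditional limit of 1024 is used.

```shell
# Lowest port an unprivileged process may bind. Newer kernels expose it as
# a sysctl; fall back to the traditional 1024 if the file is missing.
unprivileged_port_start() {
    f=/proc/sys/net/ipv4/ip_unprivileged_port_start
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo 1024
    fi
}

# Hypothetical helper: succeed if binding host port $1 needs extra
# privileges, given threshold $2 (default: the value from the kernel).
port_needs_privileges() {
    [ "$1" -lt "${2:-$(unprivileged_port_start)}" ]
}

for port in 80 8080; do
    if port_needs_privileges "$port"; then
        echo "$port: start the container with root privileges"
    else
        echo "$port: a rootless container is sufficient"
    fi
done
```

Mapping the service to a high host port, such as 8080, avoids the need for root privileges altogether.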

Handling Container Images

By default, Podman stores container images in the /var/lib/containers directory. However, because non-privileged users can also work with containers, ~/.local/share/containers/ provides a storage location in the user's home directory in which container images can also be stored.
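Which of the two locations applies depends on who runs Podman. As a minimal sketch, the hypothetical function image_store_for picks the directory for a given UID and home directory; it only encodes the two paths named here and does not query Podman itself.

```shell
# Hypothetical helper: print the image storage directory for a given
# UID ($1) and home directory ($2), using the two locations named above.
image_store_for() {
    if [ "$1" -eq 0 ]; then
        echo "/var/lib/containers"
    else
        echo "$2/.local/share/containers"
    fi
}

# Location for the user running this script.
image_store_for "$(id -u)" "$HOME"
```

Images pulled with sudo podman and images pulled as a normal user therefore end up in separate stores and are not visible to each other.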

Podman's storage component is configured in the /etc/containers/storage.conf file. Whether you create an image with Docker or Podman does not matter, because both container engines comply with the OCI standard for container images; therefore, you can import a Docker image on a Podman host without problems. The following call shows which registries are used by default:

podman info --format={{".registries"}}
map[registries:[docker.io registry.fedoraproject.org quay.io registry.access.redhat.com registry.centos.org]]

To fetch the desired image to the local system, use:

podman pull docker.io/library/httpd

Alternatively, you can start a container directly from a desired image. If it is not yet available locally, Podman will download it from one of the registries. You can also specify the registry directly in the image name:

sudo podman run -dt -p 80:80 docker.io/library/httpd httpd -D FOREGROUND

In this case, the container is launched with root privileges because container port 80 is bound to the same port on the host system. The following call shows that a web server inside the container is now listening on port 80:

links -dump http://localhost
It works!

The call to podman images or podman ps displays the existing images or active containers for a user, whereas the commands

podman rmi <image>
podman rm <container ID>

delete images and containers.

Creating OCI-Compliant Images

Podman itself can also create new OCI-compliant container images based on a Dockerfile, as shown with this very simple Dockerfile that creates a new Fedora image with an Nmap package:

cat Dockerfile
FROM fedora
RUN dnf -y update && dnf -y install nmap && dnf clean all

Next, start the build process from the directory where the Dockerfile is located:

podman build --tag nmap

Podman can also fetch a Dockerfile directly from a web server or from a GitHub repository. More information can be found in the podman-build man page (man podman-build). The newly created image is then located in local container storage. As Listing 3 shows, you can now use it to create an Nmap container.

Listing 3: Creating an Nmap Container

$ podman images
REPOSITORY        TAG       IMAGE ID          CREATED             SIZE
localhost/nmap    latest    53890e393585      34 seconds ago      425 MB
$ podman run --rm localhost/nmap rpm -q nmap
nmap-7.70-5.fc29.x86_64
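The build steps from this section can be wrapped in a short script. This is a sketch under a few assumptions: the function names and the PODMAN variable are hypothetical conveniences, and the build context directory is passed explicitly instead of relying on the current directory. Setting PODMAN=echo turns the calls into a dry run that only prints the arguments.

```shell
PODMAN="${PODMAN:-podman}"   # override, e.g. PODMAN=echo, for a dry run

# Write the two-line Dockerfile from the text into directory $1.
write_nmap_dockerfile() {
    mkdir -p "$1" &&
    printf '%s\n' \
        'FROM fedora' \
        'RUN dnf -y update && dnf -y install nmap && dnf clean all' \
        > "$1/Dockerfile"
}

# Build the image from context directory $1, then verify the package.
build_nmap_image() {
    "$PODMAN" build --tag nmap "$1" &&
    "$PODMAN" run --rm localhost/nmap rpm -q nmap
}
```

A call such as write_nmap_dockerfile ./nmap-image && build_nmap_image ./nmap-image would then create the Dockerfile, build the image, and query the installed Nmap package in one go.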

To push the image to a public registry (quay.io in this example), you first need to log in to the server:

podman login quay.io -u <user> -p <password>
podman push localhost/nmap quay.io/tscherf/nmap
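Passing the password with -p leaves it in the shell history and in the process list. As a hedged alternative, the sketch below reads the password from standard input with the --password-stdin option of podman login. The function push_image, the PODMAN variable, and the QUAY_PASSWORD variable mentioned afterward are hypothetical names chosen for illustration.

```shell
PODMAN="${PODMAN:-podman}"   # override, e.g. PODMAN=echo, for a dry run

# Hypothetical helper: log in with the password taken from stdin, then push.
#   $1 registry, $2 user, $3 local image, $4 repository name on the registry
push_image() {
    "$PODMAN" login "$1" -u "$2" --password-stdin &&
    "$PODMAN" push "$3" "$1/$2/$4"
}
```

With the password stored in an environment variable, a call could look like printf '%s' "$QUAY_PASSWORD" | push_image quay.io tscherf localhost/nmap nmap.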

If you want to create a new container image independent of a Dockerfile, you should take a look at the Buildah tool [2], which is especially useful if you want to create new images from scripts.

Pods on a Local System

The term "pod" comes from the Kubernetes world and denotes a collection of containers that share certain resources (e.g., kernel namespaces or cgroups). Because the containers in a pod also use the same network namespace, they can communicate with each other without requiring a routable IP address. Each pod has an infrastructure container based on the k8s.gcr.io/pause image. The task of this container is to hold the resources permanently for all other containers in the pod, including, for example, kernel namespaces, cgroups, and network resources. Moreover, this container takes care of communication with the Podman tool.

Each container in a pod is assigned its own monitoring instance (conmon) that monitors the process(es) running in the container and ensures that Podman can connect to a TTY inside the container at any time. This monitoring instance is necessary because Podman does not use a daemon in the background, which would otherwise handle the monitoring work.

To create a new pod, enter:

podman pod create --name web

The podman pod ps command reveals whether the pod was successfully generated. At a glance, you can see that only the infrastructure container is currently available in this pod (Listing 4). To add a new container to the pod, simply enter the pod name when starting the container. The call to podman pod list then also confirms that two containers are now in the pod (Listing 5).

Listing 4: Displaying Containers in a Pod

$ podman ps -a --pod
CONTAINER ID  IMAGE                 COMMAND  CREATED             STATUS   PORTS  NAMES               POD
9062dac6ff19  k8s.gcr.io/pause:3.1           About a minute ago  Created         dfd09806b03c-infra  dfd09806b03c

Listing 5: Two Containers in a Pod

$ podman run -d --pod web docker.io/library/httpd httpd -D FOREGROUND
$ podman pod list
POD ID            NAME        STATUS        CREATED          # OF CONTAINERS    INFRA ID
dfd09806b03c      web         Created       46 seconds ago   2                  9062dac6ff19

In Podman version 1.0 and newer, you can also create a new pod directly when starting a container (Listing 6). In this case, a single command was all it took to create a new HTTPD container in a new pod. Of course, the infrastructure container was also created when the pod was created. Because the HTTPD port of the web server container is mapped to host port 8080 this time, no root privileges are required to start the container.

Listing 6: Creating a Pod at Container Startup

$ podman run -dt --pod new:httpd -p 8080:80 docker.io/library/httpd httpd -D FOREGROUND
$ podman ps --pod
CONTAINER ID IMAGE                          COMMAND            CREATED         STATUS              PORTS                NAMES             POD
7e337cbed38e docker.io/library/httpd:latest httpd -D FOREGROU  19 seconds ago  Up 18 seconds ago   0.0.0.0:8080->80/tcp confident_ritchie 119a122bdab3
$ podman pod list
POD ID            NAME        STATUS        CREATED          # OF CONTAINERS    INFRA ID
119a122bdab3      httpd       Running       30 seconds ago   2                  b8301dde3144
$ links -dump http://localhost:8080
It works!

The ability to create local pods is certainly extremely interesting for developers because it allows a microservice-based application to be distributed across multiple containers without the need for a full-fledged Kubernetes cluster. If you want to migrate the application to a cluster of this type, Podman again offers a very elegant option for doing so. A call to podman generate kube tells the tool to create a YAML file, which you can then copy to a Kubernetes system to create the containers and pods exactly as listed in that file.
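The export step can be scripted in a few lines. The following is a minimal sketch: the function export_pod and the PODMAN variable are hypothetical, and the pod name and output file are arguments you choose yourself. With PODMAN=echo, the function only writes the arguments it would pass to Podman.

```shell
PODMAN="${PODMAN:-podman}"   # override, e.g. PODMAN=echo, for a dry run

# Hypothetical helper: export the pod named $1 as Kubernetes YAML to file $2.
export_pod() {
    "$PODMAN" generate kube "$1" > "$2"
}
```

After a call such as export_pod web web.yaml, the resulting file can be copied to the cluster and applied there, for example with kubectl create -f web.yaml.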

Conclusions

Because the Podman syntax is almost identical to that of the docker command-line tool, Docker-savvy admins should not have much trouble getting used to Podman. Some users have even set alias docker=podman in their shell environment. Under the hood, however, the two container engines are fundamentally different. Whereas Docker relies on a powerful daemon in the background, Podman uses the fork-exec model and has some interesting features on board.