This is the first blog post in a 3-part series that compares and contrasts the security features of Apptainer and Rootless Podman.

Container security is a huge topic encompassing many different facets. Broadly speaking, Apptainer and Podman both provide a “secure” platform to run containers. But there are subtle differences in the methods they employ to handle various aspects of container security. Comparing and contrasting them can give us a better sense of the strengths and weaknesses of each, and a better feel for the types of security features that container technologies implement.

We’ve decided to write a short series (a miniseries?) of blog posts to focus on a few specific security-related features common to both container platforms:

  • leveraging the User Namespace (both implicitly and explicitly)
  • unprivileged installation of the runtime itself
  • cryptographically signing and verifying containers
  • creating and running encrypted containers
  • several other miscellaneous topics

But before we dive into this list, let’s review a little bit about the history of Apptainer and Podman.

A Historical Perspective

Apptainer: built from scratch for unprivileged users in multi-tenant environments

Apptainer (which changed its name from Singularity upon joining the Linux Foundation) grew out of the need to run containers securely in multi-tenant systems where most users are untrusted (i.e., HPC). Apptainer addressed this need before User Namespaces were widely available to HPC users (since many HPC sites were running on Linux distributions that lacked User Namespace support or had it disabled). The first GitHub commit occurred in October of 2015, and the 1.0 release was made in April 2016 (though the 2.0 release in June of that year is the first release that resembles modern Apptainer).

Apptainer differs from Docker (the dominant container platform at the time) in several respects. It does not manage containers via a daemon process. The container format is a single file (the Singularity Image Format or SIF). Perhaps most important for our discussion, Apptainer’s privilege model is different from that of any other container platform. It simply maps the same user from the host into the container and ensures that the user cannot escalate privileges once they are in the containerized environment (even if they are able to do so outside of the container).

Podman: a daemonless and rootless alternative to Docker

Podman grew out of the desire to create a drop-in replacement for Docker that would function without a daemon, and manage privileges (when possible) by running in a new User Namespace. Early GitHub commits show that Podman traces its origins to the CRI-O container runtime used to manage containers in Kubernetes. It started as an internal project at Red Hat and was then released as an open source project in April of 2018. The 1.0.0 release arrived in January 2019. Among other features, it included the ability to sign containers with GPG keys.

A note on “rootless” containers

Around the time that Podman was getting its start, there was a lot of buzz around the idea of “rootless” containers. The basic idea was for unprivileged users to be able to run containers without gaining any privilege on the host system. You could argue that Apptainer was a pioneer in this area since it was providing unprivileged users the ability to run containers safely before the idea of “rootless” took hold. But Apptainer was not considered technically rootless by the group of people who coined the term because it used a set-UID (suid) helper to mount the container file system and present it to containerized processes as the root file system. (You could argue that this is an arbitrary distinction since “rootless” containers that leverage the User Namespace also use newuidmap and newgidmap, which are themselves suid programs.) Anyway, now that Apptainer can leverage the User Namespace to provide the privilege needed for these actions, it’s evidently in the rootless club. 🤷‍♂️

Whatever you call it, Apptainer has been providing containers to unprivileged users (even those without access to User Namespaces) since 2016.

Using the User Namespace

Whoa! Insert needle scratch sound. What is the User Namespace? What is a namespace even??

From the Linux Programmer’s Manual entry on namespaces…

A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.

So namespaces are one of the core technologies that underlie containers.

There are lots of different namespaces because there are lots of different resources that can be abstracted in this way. Let’s take the PID Namespace as a concrete example. Every process that runs in Linux has an ID number. These process IDs or PIDs are tracked by the kernel and kept in a table. Consider the following example.

[root@demobox ~]# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 Sep06 ?        00:00:01 /usr/lib/systemd/systemd rhgb --switched-root --system --deserialize 31
root           2       0  0 Sep06 ?        00:00:00 [kthreadd]
root           3       2  0 Sep06 ?        00:00:00 [rcu_gp]
root           4       2  0 Sep06 ?        00:00:00 [rcu_par_gp]
root           5       2  0 Sep06 ?        00:00:00 [slub_flushwq]
root           6       2  0 Sep06 ?        00:00:00 [netns]
root           8       2  0 Sep06 ?        00:00:00 [kworker/0:0H-events_highpri]
root          10       2  0 Sep06 ?        00:00:00 [mm_percpu_wq]
root          12       2  0 Sep06 ?        00:00:00 [rcu_tasks_kthre]
root          13       2  0 Sep06 ?        00:00:00 [rcu_tasks_rude_]
root          14       2  0 Sep06 ?        00:00:00 [rcu_tasks_trace]
root          15       2  0 Sep06 ?        00:00:00 [ksoftirqd/0]
root          16       2  0 Sep06 ?        00:00:02 [rcu_preempt]
root          17       2  0 Sep06 ?        00:00:00 [migration/0]
root          18       2  0 Sep06 ?        00:00:01 [kworker/0:1-events]
root          19       2  0 Sep06 ?        00:00:00 [cpuhp/0]
root          20       2  0 Sep06 ?        00:00:00 [cpuhp/1]
root          21       2  0 Sep06 ?        00:00:00 [migration/1]
root          22       2  0 Sep06 ?        00:00:00 [ksoftirqd/1]

Now consider what happens if we execute the unshare command (as root) to enter a new PID Namespace.

[root@demobox ~]# unshare --pid --fork --mount-proc bash

[root@demobox ~]# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 12:06 pts/0    00:00:00 bash
root          22       1  0 12:06 pts/0    00:00:00 ps -ef

That PID table looks a lot different!

When you enter the new PID Namespace you are presented with a different PID table because the Linux kernel has abstracted that resource for use by your processes (like ps above). There are also Namespaces for Cgroups, Inter-Process Communication (IPC), the Network, Mount Points, Time, User and Group IDs (User), and host and domain name (UTS). These namespaces are used by container platforms to do things like present a new root file system (via the Mount Namespace) and accomplish port mapping (via the Network Namespace), etc.
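
If you want to see which namespaces a process actually belongs to, the kernel exposes them as symlinks under /proc/<pid>/ns. Here is a minimal sketch that prints them. The function name list_namespaces is my own invention for illustration, and the exact set of entries depends on your kernel version.

```shell
#!/bin/sh
# Sketch: print the namespace identifiers of a process.
# list_namespaces is a hypothetical helper name, not a standard tool.
# Usage: list_namespaces [PID]   (defaults to the current shell)
list_namespaces() {
    pid="${1:-$$}"
    for ns in /proc/"$pid"/ns/*; do
        # Each entry is a symlink whose target looks like "pid:[4026531836]"
        printf '%-20s %s\n' "$(basename "$ns")" "$(readlink "$ns")"
    done
}

list_namespaces
```

Two processes that show the same identifier for a given entry share that namespace. Run this before and after an unshare command and the identifier for the namespace you unshared should change.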

Now, note that I completed the above example as root. If I try to enter a new PID Namespace with the same command as an unprivileged user I get rejected.

[demouser@demobox ~]$ unshare --pid --fork --mount-proc bash
unshare: unshare failed: Operation not permitted

This would be a good time to introduce the User Namespace. 😄

In the case of the User Namespace, the resources that are abstracted are user and group IDs (UIDs and GIDs respectively). This allows you to map your UID and GID on the host to a different UID and GID within the namespace. In practice, this is often used to pretend to be UID 0 (root). The User Namespace can be leveraged by an unprivileged user (on a properly configured system). Container platforms usually map a whole range of IDs with the help of the newuidmap and newgidmap suid executables; a single-ID mapping like the one below does not need them. Observe:

[demouser@demobox ~]$ unshare --user --map-user=0 --map-group=0

[root@demobox ~]# id
uid=0(root) gid=0(root) groups=0(root),65534(nobody) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

Wow! I’m root. Let me try that PID Namespace command again.

[root@demobox ~]# unshare --pid --fork --mount-proc bash

[root@demobox ~]# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 12:09 pts/0    00:00:00 bash
root          27       1  0 12:09 pts/0    00:00:00 ps -ef

Through the magic of the User Namespace, I’m now able to use the PID Namespace (without privilege) to create a new abstracted PID table.
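
It's worth peeking behind the curtain to confirm that this “root” is only root inside the namespace. The sketch below prints the ID mapping the kernel is using. It assumes util-linux unshare and a kernel that permits unprivileged User Namespaces, and it uses --map-root-user, which maps the current user and group to 0 much like the --map-user=0 --map-group=0 flags above. The function name show_mapping is made up for this example.

```shell
#!/bin/sh
# Sketch: inspect the ID mapping behind "root" in a new User Namespace.
# Assumes util-linux unshare and a kernel that permits unprivileged
# User Namespaces; neither is guaranteed on every system.
# show_mapping is a hypothetical helper name.

show_mapping() {
    unshare --user --map-root-user sh -c '
        echo "uid inside namespace: $(id -u)"
        # Columns: first ID inside, first ID on the host, length of the range
        echo "uid_map: $(cat /proc/self/uid_map)"
    '
}

# A value of 0 here means User Namespaces are disabled on this kernel.
echo "max_user_namespaces: $(cat /proc/sys/user/max_user_namespaces 2>/dev/null)"
echo "uid on host: $(id -u)"
show_mapping || echo "could not create a User Namespace here" >&2
```

The middle column of uid_map should match the UID printed on the host, which is the whole trick: UID 0 in the namespace is just your ordinary UID wearing a costume.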

So to recap, Namespaces allow you to abstract and isolate system resources for a given set of processes, and the User Namespace is special because it allows you to elevate privileges within a Namespace (without gaining any privilege on the host), which in turn allows you to do things like use the other Namespaces. Rootless container platforms enter a new User Namespace and then use it to set up a new Mount Namespace so the container file system looks like the root file system to containerized processes.
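
To make the User plus Mount Namespace combination a little more concrete, here is a rough sketch that mounts a tmpfs as an unprivileged user. This is only an illustration of the principle and not how Apptainer or Podman actually set up a container root. It assumes util-linux unshare, a kernel that allows unprivileged User Namespaces, and a function name (private_mount_demo) that I made up.

```shell
#!/bin/sh
# Sketch: mount a tmpfs inside new User and Mount Namespaces without
# any privilege on the host. private_mount_demo is a hypothetical name.
private_mount_demo() {
    target="$1"
    unshare --user --map-root-user --mount sh -c '
        # The mount is only visible inside this Mount Namespace
        mount -t tmpfs tmpfs "$1" &&
        touch "$1/inside-only" &&
        ls "$1"
    ' sh "$target"
}

dir="$(mktemp -d)"
private_mount_demo "$dir" || echo "user or mount namespaces unavailable" >&2
# Back on the host the mount never existed, so the directory is still empty.
ls -A "$dir"
rmdir "$dir"
```

The file created inside the namespace lives in the private tmpfs, so it vanishes along with the namespace when the inner shell exits.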

Now let’s return to our regularly scheduled program to take a look at User Namespaces in the context of Apptainer and Podman. Stay tuned for the next episode in this blog-post miniseries! 😄

Dave Godlove