Apptainer / Singularity
Containers for Performance-Intensive Computing
Formerly Singularity, Apptainer is the leading container system for High Performance Computing.
Apptainer, originally created as Singularity by Gregory Kurtzer (the CEO of CIQ), brings the benefits of containerization to HPC, enabling researchers to do science.
About Apptainer
Singularity was created to bring containers to HPC much like Docker did for the enterprise. And, in relatively short order, it became the dominant HPC container system. After a commercial fork, it became clear that the project would benefit from being free of corporate control, so it was moved to the Linux Foundation and renamed to Apptainer.
Apptainer enables you to easily create and run containers that package up pieces of software in a way that’s portable and reproducible. You can use it to build a container on your laptop, then run it on one of the largest HPC clusters in the world, on a single server, on company clusters...the possibilities are endless.
Now that Apptainer is hosted by the Linux Foundation, the user base continues to expand and organizations across all industries and academia are using it. Apptainer’s optimizations for performance and parallelization make it ideal for use cases such as artificial intelligence, machine learning, and compute- and data-driven analytics.
Apptainer: Verifiable Single-file Containers for Performance Intensive Computing
Apptainer is designed to securely execute applications with bare-metal performance while being portable and reproducible. An Apptainer container packages up whatever you need into a single, verifiable file. From small laboratory clusters all the way to massively-scalable HPC clusters, Apptainer provides:
Support
100% open source
Portable jobs and environments
Market-leading containers for HPC
Supply chain
Optimized for applications
Support from CIQ
CIQ is uniquely positioned to offer amazing support and pre-built binaries for Apptainer. Not only did our CEO Gregory Kurtzer create Singularity and Apptainer, but the CIQ team is full of seasoned Apptainer developers and contributors.
Our Apptainer gurus can help you and your HPC and enterprise Performance Intensive Computing teams make the best use of containers when performance matters.
Download the Apptainer Guide
Download the Apptainer Solution Guide and learn more about Apptainer and CIQ support!
Download Now
Frequently Asked Questions
What is a container?
Containers are a form of virtualization. They allow you to package applications along with all dependencies (including a new root file system, environment, specialized run commands, etc.) within a portable, reproducible unit.
Container technology takes advantage of kernel and file system features unique to Linux, and containers therefore do not run natively on Windows or macOS. But other types of virtualization (e.g., WSL on Windows or a slim virtual machine environment on macOS) can give the illusion of containers running natively on these operating systems.
What is the difference between containers and virtual machines (VMs)?
VMs run at a lower level than containers. The VM software stack begins with a hypervisor. This might run on top of an existing operating system, or function as a lightweight operating system itself. The hypervisor runs a new OS (kernel and all) on top of a set of virtualized hardware. The VM file system(s) sit within this new virtualized environment and processes run at the top of the software stack. All of this low-level hardware and OS virtualization means that VMs are extremely flexible (you can run Linux on Windows or Windows on macOS) but are slow and resource intensive.
Containers run at a higher level in the software stack. They don't virtualize any hardware (though they may partition off some hardware resources through cgroups) and they just share the running kernel with the host OS. The first thing that containers add to the software stack is a new root file system, and containerized processes run on top of that. Other global resources (network interfaces, process IDs, user and group IDs, etc.) may be segmented through Linux Namespaces, though this is not strictly necessary. Because containers share hardware and the kernel with the host OS, they are lightweight and lightning-fast, but they are much less flexible than VMs.
I've heard that Apptainer was built for HPC. What does that mean?
Our CEO, Greg Kurtzer, invented Apptainer to allow unprivileged users to run containers safely in production environments. Around 2015, popular container platforms could not be used securely in multi-tenant environments with lots of unprivileged users (like HPC). The idea of "rootless" containers was in its infancy, and the technology needed to make them widely available (User Namespaces) would not be available in most HPC environments for years to come. Apptainer tackled this problem with a totally new security paradigm that leverages well-known kernel and file system features to enforce existing permissions.
While Greg was creating Apptainer, he decided to rethink the whole concept of containers from an HPC perspective. This resulted in a container platform with innovative HPC-friendly architecture, sensible defaults, and a unique single-file container system called the Singularity Image Format (SIF). Many of Apptainer's innovations remain unique among container platforms.
Why "reinvent the wheel" with a new container platform for HPC?
When we first created Apptainer, some wondered why we didn't just modify an existing container platform to work within an HPC environment. Unfortunately, that's not as easy as it may seem. Container platforms like Docker have low-level architectural designs that are fundamentally incompatible with HPC environments. For instance, you must either be a privileged user or be added to a group that confers privileges to use containers. And containers are managed via a root-owned daemon process that can easily run afoul of batch scheduling systems like Slurm and PBS. We could develop patches and workarounds, but ultimately we would be attempting to shoehorn a tool into an environment beyond its intended scope.
Instead, we decided to use (or in this case create) the right tool for the job. And the near universal adoption of Apptainer within the HPC community over the years has shown that this was the right decision!
What is the relationship between Apptainer and Singularity?
TL;DR
Apptainer is Singularity. It was renamed upon joining the Linux Foundation.
In 2021 a private company forked Singularity, calling the new project SingularityCE. Shortly thereafter, the open source community asked the Linux Foundation to adopt Singularity as a hosted project. The Linux Foundation agreed upon the condition that the name of the project be changed to avoid confusion with the corporate fork.
What is the difference between OCI (Docker) images and SIF (Apptainer) images?
To understand the difference between OCI and SIF container images, it is useful to think about where these formats originated.
In 2015, Docker (the company) launched the Open Container Initiative (OCI). Over time, this group proposed several different standards, including a standard image format, a standard way for platforms to run containers, and ultimately standards for registries that host and serve container images.
The OCI image format is a bundle of layers (tarballs) along with a manifest specifying how an overlay file system should combine the layers into a container. The main advantage of the OCI format is efficient use of disk space through data deduplication. An arbitrary number of containers can share the same base layer(s). This means you can load up a bunch of containers onto a cloud VM (for instance) with very limited disk space.
As the OCI standard was being created, the Apptainer community was separately developing containers for HPC. Apptainer initially used ext3 and then squashfs as the default container image format. Over time, it began to dawn on community members and developers that the single file container format creates an opportunity for several unique features. The Singularity Image Format (SIF) was developed as a way to exploit this power to its full extent.
Because SIF containers are simply files, they can be managed (moved, renamed, distributed, etc.) without any specialized software. But the SIF file format itself enables other features. SIF files can be cryptographically signed, and the signature is added to the file itself allowing the signature(s) to be verified without using an external registry or key server. Likewise, containers in SIF files can be encrypted without the use of a separate registry and encrypted containers can be run without decrypting their contents on disk. These features come with the obvious drawback of disk space. Since Apptainer doesn't use layers, it doesn't benefit from the data deduplication that layers enable. OCI containers are easy to convert to SIF files, and Apptainer does this under the hood on your behalf. However, conversion from SIF back to OCI format is not really possible because the layers are lost in the conversion to SIF. SIF files can be stored on OCI registries through the protocol OCI Registry As Storage (ORAS), so there is no need to create OCI container images if you are using Apptainer to run them.
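As a rough illustration of the workflow described above, the sketch below pulls an OCI image (which Apptainer converts to SIF), signs and verifies the resulting file, and stores it on a registry through ORAS. The function name sif_workflow, the image file name, and the registry address are hypothetical. Signing assumes you have already generated a key pair (for example with apptainer key newpair), and the push assumes you are logged in to a registry that accepts ORAS artifacts.

```shell
# Sketch only: nothing runs until you call sif_workflow.
# Usage: sif_workflow <image.sif> <registry/path:tag>
sif_workflow() {
    image="$1"
    registry_ref="$2"   # hypothetical, e.g. registry.example.com/me/alpine:latest

    # Pull an OCI image; Apptainer converts it to a single SIF file.
    apptainer pull "$image" docker://alpine:latest || return 1

    # Embed a signature in the SIF file itself, then check it.
    apptainer sign "$image" || return 1
    apptainer verify "$image" || return 1

    # Store the SIF file as-is on an OCI registry via ORAS.
    apptainer push "$image" "oras://$registry_ref"
}
```

You might call it as sif_workflow alpine.sif registry.example.com/me/alpine:latest. Because the result is an ordinary file, it can also be copied with cp or scp instead of being pushed to a registry.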
Does orchestration (like Kubernetes or Docker Swarm) work with Apptainer?
There have been several projects to integrate Apptainer with orchestration tools or create new ones. This is difficult since most orchestration tools are made to integrate with container platforms that adhere to the OCI standard (see section above). The community has been somewhat ambivalent toward these projects, and they are not well maintained. This may be because existing tools like Docker and Podman already work well with microservice-based orchestration, so there is no need for another container platform. However, the HPC community is beginning to realize that container orchestration can be beneficial to their workflows as well. This is creating a need for job-based (as opposed to service-based) orchestration. CIQ is developing Fuzzball as a job-based container orchestration platform to fill this emerging need.
I've heard the phrase "Integration over Isolation." What does it mean?
The Apptainer architecture and defaults were designed to work well with HPC workflows. To achieve this, the developers decided to use the minimum amount of isolation necessary to run containers. When you run an Apptainer container, you are the same user inside the container as you are outside. By default, you also have access to your home directory and tmp within the container. Namespace isolation is also minimal in Apptainer. By default, only a new mount namespace is used by the container. This means that the host network is accessible within the container, interprocess communication works across the container boundary, the PID table shows both host and container processes, etc. This intelligent integration model allows users to bring new applications into HPC environments while still accessing their data and interacting with the host in an easy intuitive way.
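A quick way to see these defaults for yourself is sketched below. The helper function names are hypothetical and exist only for illustration. The --containall flag is an existing Apptainer option that is not covered in the text above; it is included to show that you can opt into more isolation when you want it.

```shell
# Sketch only: nothing runs until you call one of these helpers.

# Usage: compare_identity <image.sif>
# Prints the user name and UID on the host and inside the container.
# With default settings the two lines should show the same user.
compare_identity() {
    image="$1"
    echo "host:      $(id -un) ($(id -u))"
    echo "container: $(apptainer exec "$image" id -un) ($(apptainer exec "$image" id -u))"
}

# Usage: list_home_in_container <image.sif>
# Your home directory is available inside the container by default.
list_home_in_container() {
    apptainer exec "$1" ls "$HOME"
}

# Usage: run_isolated <image.sif> <command> [args...]
# Opt into stricter isolation (file systems, PID, IPC, environment).
run_isolated() {
    image="$1"
    shift
    apptainer exec --containall "$image" "$@"
}
```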
I just downloaded a new container and I can't find the software inside of it. How do I find the program I want to run?
It can be a little tricky to find things in a new container if the documentation is poor or non-existent. You can start by checking the PATH environment variable to see if any new directories with executable files have been added. If the PATH hasn't been changed, search for common places where user software might be installed (like /opt or /usr/local). If you know the name of the program or some executable file, you can use the find command like this. Just make sure you don't have any big file systems mounted into the container when you run the command.
$ find / -name '*program_name*' 2>/dev/null
If you still can't find the program you are looking for, it's possible that the container author installed the program in a user's home directory or in root's home directory. This can be difficult to diagnose because of the way in which Apptainer handles your identity. If all else fails, you can always convert the container to a sandbox (i.e., a directory) and search for your program there. The following command will convert the container for you.
$ apptainer build --sandbox directory_name container.sif
If directories like home, root, or tmp contain files or sub directories, that might be where your program is hiding.
Can Apptainer run nested containers? (Or other container runtimes like Podman?)
On modern systems, Apptainer runs without any elevated privileges. This makes it relatively straightforward to run nested containers. Bear in mind that Apptainer disallows suid-bit executables by default, so if your container runtime requires this capability, it will not run nested within an Apptainer container.
If you want to execute Podman within Apptainer, the following commands should work.
$ mkdir var run
$ apptainer shell --fakeroot --bind var:/var,run:/run podman.sif
Apptainer> podman --runtime crun run -v /sys:/sys --network=host \
    --cgroups=disabled -ti alpine /bin/sh
For a detailed explanation of these commands check out this blog post.
How can I submit a containerized job through a workload manager like Slurm or PBS?
Because of the way that Apptainer is architected, it's easy to submit containerized jobs to a workload manager. Apptainer doesn't rely on a daemon process to spawn and manage containers, so you don't have to worry about your containers "escaping" the workload manager. By default, Apptainer doesn't do things like manage cgroups or enter new PID, network, or IPC namespaces. This means that your container will not conflict with operations carried out by the batch scheduling system and that containerized processes will continue to communicate with one another as intended.
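In practice, a containerized job script can look almost identical to a regular one. The sketch below writes a minimal Slurm batch script to a file. The job name, resource requests, image name (container.sif), and program path are placeholders chosen for illustration, not values from any particular cluster.

```shell
# Write a minimal Slurm batch script that runs a program from a container.
cat > apptainer_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=apptainer-demo
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Hypothetical image and program; replace with your own.
# No daemon is involved, so the containerized process stays
# inside the resources that Slurm allocated to this job.
apptainer exec container.sif /opt/app/bin/my_program input.dat
EOF

# On a Slurm cluster, submit it with:
#   sbatch apptainer_job.sh
```

The only container-specific part is the apptainer exec prefix on the command line, so existing job scripts can usually be adapted by changing a single line.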
How can I run my MPI-enabled programs in Apptainer?
Message Passing Interface (MPI) is a primary technology underlying High Performance Computing (HPC). Apptainer is designed to work with MPI natively. Containerized MPI processes can either be launched by a matching MPI framework on the host system, or through a tool like Slurm compiled with support for one of the Process Management Interface (PMI) standards. Using PMI allows you to run fully containerized MPI-enabled programs without the need for MPI to be installed on the host. More information can be found in this detailed blog post.
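The two launch models described above might look like the following sketch. The helper function names and the example program path are hypothetical. The first model assumes the MPI on the host is compatible with the one inside the container. The second assumes your Slurm installation was built with PMI support; pmi2 is just one possible value for --mpi, and your site may use a different one such as pmix.

```shell
# Sketch only: nothing runs until you call one of these helpers.

# Usage: launch_with_host_mpi <ranks> <image.sif> <program> [args...]
# The host's mpirun starts one container per rank.
launch_with_host_mpi() {
    ranks="$1"
    image="$2"
    shift 2
    mpirun -n "$ranks" apptainer exec "$image" "$@"
}

# Usage: launch_with_slurm_pmi <ranks> <image.sif> <program> [args...]
# Slurm starts the ranks through PMI; no MPI is needed on the host.
launch_with_slurm_pmi() {
    ranks="$1"
    image="$2"
    shift 2
    srun --mpi=pmi2 -n "$ranks" apptainer exec "$image" "$@"
}
```

For example, launch_with_host_mpi 4 mpi.sif /opt/mpi_app would start four ranks of a program installed at /opt/mpi_app inside mpi.sif.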
How do I access ports for running services in my container?
By default, Apptainer does not enter a new network namespace or attempt to virtualize network interfaces in any way. This is one of the aspects of "integration over isolation." As a result, processes have access to all of the ports within the container as though they were running on the host. This is generally desirable behavior for HPC users who want to access their containerized processes without any additional configuration. But it may be confusing to users of other container technologies.
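As a small illustration, the sketch below starts a simple web server from inside a container. It assumes the image contains python3, and the function name and default port are hypothetical. Because the container shares the host network by default, no port mapping option is needed.

```shell
# Sketch only: nothing runs until you call serve_from_container.
# Usage: serve_from_container <image.sif> [port]
serve_from_container() {
    image="$1"
    port="${2:-8000}"   # hypothetical default port
    # The server binds directly to a host port; there is no
    # separate container network to map.
    apptainer exec "$image" python3 -m http.server "$port"
}

# From another terminal on the same host you could then run:
#   curl http://localhost:8000/
```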
How do I view graphics produced by containerized programs?
Typically, graphics produced by containerized programs can be viewed without any additional configuration. It may be necessary to install tools like X11 or mesa within the container to support graphics since these will not be available otherwise.
In the case of GPU accelerated graphics, the situation may be more complicated. Typically, these programs work properly simply by adding the --nv option to the apptainer command. This causes libraries from the host system to be passed into the container to operate the GPU. However, the --nv option was designed with compute workflows in mind and has not been extensively tested with graphical processes. Therefore, it may be necessary to edit the configuration file nvliblist.conf or rocmliblist.conf (for NVIDIA and AMD respectively) to enable graphical workflows.
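A simple first check that the GPU is visible inside a container is sketched below. The function name is hypothetical. The --rocm flag is the AMD counterpart of --nv; the sketch assumes that the rocm-smi tool is available inside the container or is passed in from the host, which may not be true on your system.

```shell
# Sketch only: nothing runs until you call gpu_check.
# Usage: gpu_check <image.sif> [nvidia|amd]
gpu_check() {
    image="$1"
    vendor="${2:-nvidia}"
    case "$vendor" in
        nvidia)
            # --nv passes the host's NVIDIA libraries into the container.
            apptainer exec --nv "$image" nvidia-smi
            ;;
        amd)
            # Assumption: rocm-smi is reachable inside the container.
            apptainer exec --rocm "$image" rocm-smi
            ;;
        *)
            echo "unknown GPU vendor: $vendor" >&2
            return 2
            ;;
    esac
}
```

If the check succeeds but a graphical program still fails, that is the point at which editing nvliblist.conf or rocmliblist.conf may be worth trying.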
What is the best strategy for importing libraries and executables from the host or from other containers into my container?
With the exception of special cases (like the GPU driver libraries), the best strategy is usually to avoid bind-mounting libraries and executables from the host or from other containers into your container. These binaries have been compiled against a specific version of the GNU C Library (glibc) that is likely to differ from the one present in your container. Even if the glibc in the container matches that needed by the binaries that have been bind-mounted into it, this limits your container and renders it non-portable.
In the case of multiple different applications that must all share the same set of libraries, it is usually a better option to install everything in the same container, or to duplicate the libraries in multiple containers. This has the disadvantage of producing a larger container or using more disk space since the same data is duplicated in multiple containers. But these drawbacks are usually outweighed by the advantages of simplicity and portability.
In extreme cases where container size and disk space constraints prevent this strategy, it may be possible to use nested containers to bind-mount programs from one or more containers into other containers. In this use case, an entire set of containers would have programs compiled against the same set of libraries, and the appropriate containers would be bind-mounted into the container with the base set of libraries at runtime. However, while this strategy is theoretically plausible, it has not been extensively tested.
How do I containerize applications transparently on behalf of users I support?
It is often the case that a staff scientist or administrator must install software on behalf of an end user. The administrator may find it useful or necessary to install the desired software in a container, but the end user may not have sufficient expertise to use the software through a container platform. Ideally, the container becomes transparent to the end user and the application runs just as it would if it were installed on the host.
One way to achieve this is a small wrapper script that sets up the environment for containerized commands and then translates specific commands into their containerized counterparts. A useful approach is to create a self-referential bash script and make symlinks, named after specific commands, that point to it. By arranging the script, container, and symlinks carefully within a nested directory structure, applications containerized in this manner can be managed by a module system, and users can run containerized commands without knowing anything about Linux containers themselves. Examples of this strategy can be found at this GitHub repo created by the NIH HPC.
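The wrapper-and-symlink idea can be sketched as follows. This is a minimal illustration, not the NIH HPC implementation. The directory layout (myapp/bin, myapp/libexec), the image name myapp.sif, the command names samtools and bcftools, and the WRAPPER_DRY_RUN variable are all hypothetical.

```shell
# Lay out a small directory tree: symlinks in bin/, the wrapper and
# the container image side by side in libexec/.
mkdir -p myapp/bin myapp/libexec

cat > myapp/libexec/wrapper.sh <<'EOF'
#!/bin/sh
# Find the wrapper's real location by following the symlink
# that was used to invoke it.
self=$(readlink -f "$0")
libexec=$(dirname "$self")

# The name of the symlink is the command to run in the container.
cmd=$(basename "$0")
image="$libexec/myapp.sif"

# Hypothetical switch: print the command instead of running it.
if [ -n "${WRAPPER_DRY_RUN:-}" ]; then
    echo apptainer exec "$image" "$cmd" "$@"
else
    exec apptainer exec "$image" "$cmd" "$@"
fi
EOF
chmod +x myapp/libexec/wrapper.sh

# One symlink per command that should be exposed to users.
ln -sf ../libexec/wrapper.sh myapp/bin/samtools
ln -sf ../libexec/wrapper.sh myapp/bin/bcftools
```

With myapp/bin added to PATH (for instance by a module file), a user who types samtools --version ends up running apptainer exec on the image without needing to know a container is involved. Adding another command is a matter of adding another symlink.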
Get in Touch
Reach out
Ask a question, request a demo, or say hi by filling out the form.
Mailing Address:
1050 N Hills Blvd., Suite 61180, Reno, NV 89506
Contact:
Phone: (800) 220-5243
Email: info@ciq.com