Apptainer: Deep Dive, Use Cases, and Examples

This webinar will focus on the key features that differentiate Apptainer (formerly Singularity) from other runtimes with live demonstrations.

Webinar Synopsis:

Speakers:

  • Zane Hamilton, Vice President of Sales Engineering, CIQ
  • Dave Godlove, Solutions Architect, CIQ

Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.

Full Webinar Transcript:

Zane Hamilton:

Good morning, good afternoon, and good evening, wherever you are. Thank you for joining us. My name is Zane Hamilton, and I'm the Director of Sales Engineering here at CIQ. For those of you who are unfamiliar with CIQ, we're a company that's focused on powering the next generation of software infrastructure, leveraging the capabilities of cloud, hyperscale and HPC from researchers to the enterprise. Our customers rely on us for the ultimate Rocky Linux, Warewulf, and Apptainer support escalation. We provide deep development capabilities and solutions, all delivered in the collaborative spirit of open source. In today's webinar, we're going to be deep diving into Apptainer and use case examples with Dave Godlove.

Dave Godlove:

Hey everybody.

Zane Hamilton:

Hey Dave. Welcome back.

Dave Godlove:

Thanks.

Zane Hamilton:

If you don't mind introducing yourself, and tell us who you are, how you got involved in Apptainer, just a little bit of a background, if you don't mind.

Dave Godlove:

Sure, absolutely. I used to be a neuroscientist working at the National Institutes of Health. While I was there, I switched over and became a staff scientist working in the High Performance Computing Center, the intramural High Performance Computing Center, which is known as Biowulf. This is where I first came across Greg when he was first developing Singularity, and the two of us have been working fairly closely together on Singularity, the project that ultimately became Apptainer, for several years now. For a while, I was the release manager and the community manager for Singularity, and I'm excited now to come and talk to you about Apptainer today.

Zane Hamilton:

It's fantastic. Not just a user, but actually a contributor; you were involved at a project level in the community. That is fantastic. What do you have for us today, Dave?

Dave Godlove:

Yeah, I've seen it from a lot of different angles.

Zane Hamilton:

That's awesome. What are you going to show us?

Linux Containers [02:01]

Dave Godlove:

Today, I'd like to show you a really broad overview of a lot of the different things that Apptainer can do. To give you a little bit of a look and feel. I really want to focus on the things that distinguish Apptainer from other container platforms, which are also here in this space. Let me just go ahead and share my screen. Okay. We can get started. I just wanted to do a quick little level set to begin with. When I started giving talks like this several years ago, I would usually have to start by talking a little bit about, or maybe a lot bit about what Linux containers are. These days, I feel like most people have a pretty good handle on what a Linux container is.

If you don't, I think I can summarize it really quickly by just saying, you can conceptualize a Linux container as a lightweight virtual machine. It's a little bit different from a virtual machine in that it shares the kernel with the underlying host system. That makes it a little bit more lightweight and a lot more performant. Linux containers utilize some features of the Linux kernel and some other Linux features, of the file system, for instance. They basically take these features, wrap them up together, and present them to the user in a nice package, which makes containerization easy. You really see them everywhere these days. They're ubiquitous within computing. What is Apptainer now? If you're a regular watcher of this webinar, then I don't have to tell you what Apptainer is, but maybe you're not.

It's a container solution, which was originally built for high performance computing by Greg Kurtzer when he was working at Lawrence Berkeley National Lab several years ago. It has a focus on security and ease of use for HPC users. This information is a little old. It has a really large and active community, lots of installations. Really, it's the Linux Foundation successor to the Singularity container platform. If you're familiar with Singularity, it's the Linux Foundation successor to that, and you can learn more here, obviously.

Zane Hamilton:

Have you seen a growth in the community since it went into the Linux Foundation? Have you had more people join in?

Dave Godlove:

I think it's actually kind of hard to say because the community had already gotten so big. Singularity became a ubiquitous tool, which is used everywhere before it was pulled into the Linux Foundation umbrella. At this point, if the community grew, it's already so huge. It's a little bit difficult to track.

Zane Hamilton:

Excellent. Thank you.

Dave Godlove:

It's like saying, have you seen more people use the copy command in Linux? It's pretty widely used.

Zane Hamilton:

Makes sense.

What Makes Apptainer Different [05:25]

Dave Godlove:

So what makes Apptainer different? There are a lot of different container runtimes. What differentiates Apptainer in this space? I'm going to focus on three main things that I think sum it up. The first is that it has a simple, straightforward security model, which is based on the idea that you're the same UID and GID inside the container as you are outside, and that whatever privileges you have when you enter the container, those are the same privileges you have inside the container. You can't escalate your privileges.

The second thing that really differentiates Apptainer is it has a focus on integration over isolation. It intelligently integrates with the host operating system. This is a little unique within the container space, I think. I'm going to go over some examples of that as I do the demo. Finally, the Singularity Image Format, or SIF: this is a huge differentiator for Apptainer. I'm going to show demos of this too, but essentially, within the Apptainer/Singularity ecosystem, a container is a single file, which allows you to do all kinds of things, from managing your containers easily, to cryptographically signing your containers, to encrypting them, and so on. I'm going to go through some demos of that as well. Once again, I'm here representing CIQ. It's not just Apptainer that CIQ develops; it is also Rocky Linux and Warewulf. We have some things cooking with something called Fuzzball, which we are going to be releasing pretty soon. I just wanted to make sure everybody is aware of that. All right, let me go ahead and just jump into a terminal and start playing around.

Apptainer Help Section [07:22]

Okay. The first thing I wanted to show everybody is just a quick look and feel of what it's like to use Apptainer. If you run "help" you get a description of what Apptainer is; I've already given that to you. You get some options here, which are mostly there to just control output to the screen. Here is really the meat of the help section. Here are all the different subcommands that you can run with Apptainer. Obviously, there are lots of different things you can do. Apptainer is a very big, full-featured program with a lot of different options, and that is its philosophy. That's what it's supposed to do, is be a very full-featured place to use containers. I am going to focus today on things like shell, run, exec. I'm going to do a little building, and then toward the end, hopefully if there's time, I'm going to show you some examples that use the sign and verify commands. I'm not going to cover all the different things that Apptainer can do. Just to make sure that everybody's level set, I'm going to show you the most trivial, most easy example with Linux containers.

Swapping Out Operating Systems [08:42]

First I'm just going to show you, I'm using Rocky Linux here because Rocky Linux is a really great, really stable operating system. But, let's say for some reason, I wanted to swap out that operating system with something different.

This is the command in Apptainer to shell to open a new shell in a container. I'm going to grab that container from Docker Hub. Apptainer is perfectly happy to use OCI containers from Docker Hub or from any other OCI registry that you want to use. I'm going to get just the standard Alpine, the official Alpine container. I'm going to get a new shell. Great. Then once I've got that new shell, I can do the same command that I just showed you. You can see that I'm now in Alpine Linux instead of being on Rocky Linux. I just swapped out my operating system with a single command, basically. That is the essence of containers. And now, once again, I'm just giving you a look and feel of Apptainer right now, if I wanted to do the same thing, but I didn't want to bother with actually shelling into the container, I can execute individual commands like so with this exec keyword.

Now, instead of shelling into the container and doing a cat /etc/os-release, Apptainer will enter the container, execute this command, and then pop back out and give me the result. That is a quick and easy way to do the same thing. Now that's neat, but most people are not really interested in just swapping out their operating system with a new one for no good reason. Most people want to use containers because they have applications which they don't have installed on the host system, or don't want to install on the host system for some reason, and they want to have quick and easy access to them. I'm going to give you a trivial example, but you can see how this can scale up to more difficult examples.
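The shell and exec invocations described above can be sketched roughly as follows. This is a minimal illustration rather than a capture of the demo: the helper names `apptainer_cmd` and `parse_os_release` are hypothetical, the commands are only printed (not executed), and the sample os-release text is made up for the example.

```python
import shlex


def apptainer_cmd(subcommand, image, *args):
    """Build the argv for an Apptainer subcommand run against an image."""
    return ["apptainer", subcommand, image, *args]


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


# An interactive shell versus a single command in the same image.
shell_cmd = apptainer_cmd("shell", "docker://alpine")
exec_cmd = apptainer_cmd("exec", "docker://alpine", "cat", "/etc/os-release")
print(shlex.join(shell_cmd))
print(shlex.join(exec_cmd))

# Made-up sample of what the file might contain inside the container.
sample = 'NAME="Alpine Linux"\nID=alpine\n'
print(parse_os_release(sample)["ID"])
```

Running the printed exec command on a host with Apptainer installed should show the container's os-release contents rather than the host's.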

I'm going to run the Python command on my host system, and it's going to tell me that Python's not found on the host system. Now, this example is trivial because it would be very easy for me to install this command if I wanted to, but you can imagine, maybe I want to run some complicated artificial intelligence workflow. Maybe I want to run something that has a lot of dependencies or a complicated build toolchain, which I do not have installed on my system. Or maybe it's some software that was written by a scientist in a lab that doesn't have any packaging behind it. If there is something like that that I need to run, and there happens to be a container available in a place like Docker Hub or some other registry, I can just go ahead and grab it and run it. Here is how. I'm going to get the default Python container from Docker Hub and shell into it with a single command. Now I'm in the container, and all of a sudden, if I run Python, I have Python. Simple as that.

Now, I have to exit twice: once to get out of the Python environment and another time to get out of the container. I could do the same thing a little bit more easily with the exec command, and then I'd only have to exit once. You have already seen that. I'm not going to show that to you again.

Pull Command With Apptainer [12:22]

Those are a few little basics about containers. For the remainder of the demo, I'm going to cd into this demo directory. I'm going to be showing you these same files over and over again. I've got a directory here called PEM, which I'll talk about later, and these other two files. Let me show you something that's a little bit different than what you might have seen, if you're familiar with other container runtimes. Apptainer has this pull command. I'm going to go ahead and pull once again that Alpine container from Docker Hub, but when I pull it, I actually end up with this new file. It might be a little bit different from what you're used to seeing with other container runtimes. This new file is a SIF file, a Singularity Image Format file. Now, that file is the container; it encapsulates the container. I can do things like shell into that file, or use that file as an argument to shell. Now, I am in Alpine using that SIF file. If I decide that I don't like the name of that file, for instance, I can move it to another name. I'm going to go ahead and do that, and it still works.

But I have to give it the correct name. This is great because now if you can manage files, which I hope you can do, you can manage your containers. I could copy this container up to an S3 bucket. I could put it on a website and have people download it. I could scp it to another system. I could email it to somebody since it's so small. This just takes that layer of having to rely on some program to manage your containers away and lets you manage your containers by yourself. That is just the tip of the iceberg of what SIF allows you to do. I'm going to go through some of the other use cases of SIF towards the end of this talk. Now that I have covered the real basic look and feel of Apptainer, I want to start talking about those three things that I mentioned earlier in the talk that differentiate Apptainer from other container runtimes in this space.
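The idea that a container is just a file can be sketched with ordinary file operations. This is an illustration under stated assumptions: the output name `alpine.sif` and the names `my_alpine.sif` and `backup.sif` are hypothetical, the pull command is only printed, and a small stand-in file takes the place of a real SIF image so the file handling can run anywhere.

```python
import shlex
import shutil
import tempfile
from pathlib import Path

# Pull with an explicit output file name (hypothetical name for this sketch).
pull_cmd = ["apptainer", "pull", "alpine.sif", "docker://alpine"]
print(shlex.join(pull_cmd))

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in bytes; a real SIF file would come from the pull command above.
    image = Path(tmp) / "alpine.sif"
    image.write_bytes(b"stand-in for real SIF contents")

    # Renaming and copying use ordinary file tools, no container daemon.
    renamed = image.rename(Path(tmp) / "my_alpine.sif")
    backup = Path(shutil.copy2(renamed, Path(tmp) / "backup.sif"))
    same = renamed.read_bytes() == backup.read_bytes()
    print(renamed.name, backup.name, same)
```

The same approach extends to copying the file to another machine or object store with whatever file transfer tool is already in use.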

Apptainer's Simple And Sensible Privilege Escalation Model [15:08]

The first of those is it has a simple and sensible privilege escalation model. I want to start getting into that a little bit.

Let's look and see some details about who I am on this system. I am Apptaineruser; that is my username on this system. My UID is 1234, my GID is the same, and I belong to the groups Apptaineruser and wheel. I'm going to go ahead and shell into that Alpine SIF container again. I am going to repeat and run id again. This is a major difference between Apptainer and other container runtimes. I'm the same user inside the container with the same UID, same GID, and I belong to the same groups. Let's look and see if I can elevate my privileges. I'm going to try to run a sudo command. I'm going to do sudo whoami. It doesn't have sudo installed inside of it, but that's fine. I'm a little bit tricky, and I have a container up on Docker Hub in which I've installed the sudo program. Let me go ahead and run that instead.
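The identity check in this part of the demo can be approximated with a small script. This is a sketch, not what was typed on screen: the helper names `identity` and `same_identity` are hypothetical, and the intent is to run the script once on the host and once inside the container and compare the two results.

```python
import os


def identity():
    """Collect the numeric user, group, and supplementary group IDs."""
    return {
        "uid": os.getuid(),
        "gid": os.getgid(),
        "groups": sorted(os.getgroups()),
    }


def same_identity(host, container):
    """True when both snapshots describe the same user and groups."""
    return host == container


# Run on the host, then again inside the container, and compare the output.
print(identity())
```

With the security model described here, the two snapshots would be expected to match.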

I'm going to jump into this container, and now I'm going to go ahead and try to elevate my privileges. Now, you might have anticipated that that wasn't going to work because I already told you that whatever privileges you have on the host system, those are the same privileges that you have inside the container. You can't elevate privileges once you enter the container. But what's cool about this is that it doesn't give you the standard "this incident will be reported" error message. It gives you something different. It says, "Hey, the no new privileges flag is set, which prevents sudo from running as root. If you're running in a container, you might need to adjust the container configuration to disable the flag." What is happening under the hood here is Apptainer starts a process which is responsible for spawning your container. When it starts that process, it starts it with the no_new_privs flag, which is a kernel flag that allows the kernel to say, I'm not going to allow you to elevate your privileges in this process or any of its child processes.
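On reasonably recent Linux kernels, the flag mentioned above is visible as the NoNewPrivs field in /proc/self/status. The following is a minimal sketch for inspecting it; the helper names `no_new_privs` and `read_own_status` are hypothetical, and the script only reads the flag, it does not set it.

```python
def no_new_privs(status_text):
    """Return True or False from the NoNewPrivs field, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("NoNewPrivs:"):
            return line.split(":", 1)[1].strip() == "1"
    return None


def read_own_status(path="/proc/self/status"):
    """Read this process's status file; return '' where it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


# Inside a container started as described above, this would be expected
# to report True; on a plain host shell it is typically False.
print(no_new_privs(read_own_status()))
```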

Another guardrail that Apptainer has in place relates to one of the most basic things about containers: when you enter a container, you have a new root file system. The way that works is that a new file system is mounted onto your host file system, and then you pivot into a new mount namespace, which we'll talk about more later, and which allows the kernel to present that file system to you as though it were the root file system. When Apptainer does that operation, it passes the nosuid option to the mount command, which prevents you from being able to use SUID programs like sudo within the container. The point of me telling you all this is that it's not like there's some complicated security privilege model, which is tacked on top of Apptainer to make everything secure. Instead, Apptainer just uses the kernel and the file system features, which have existed in Linux forever, and which are really well known and depended upon, to make sure that you can't escalate your privileges inside the container. It is really simple and straightforward.
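Mount options such as nosuid are listed in /proc/self/mounts, so the guardrail described above can be checked from inside a container. This is a sketch under assumptions: the helper name `mount_options` is hypothetical, and the sample line is made up to show the file's layout (device, mount point, type, options), not copied from a real Apptainer session.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set of the last mount at mount_point, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))
    return found


# Made-up sample in the /proc/self/mounts format.
sample = "overlay / overlay rw,nosuid,nodev,relatime 0 0\n"
options = mount_options(sample, "/")
print("nosuid" in options)
```

Against a real system, the text would come from reading /proc/self/mounts inside the container instead of the sample string.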

All right, so let me show you one of the interesting side effects of this whole thing: you are the same user inside and outside of the container. I'm going to go ahead and I'm going to print my working directory. This is a little surprising, and I'm going to talk more about this in a minute. For now, just sort of accept that even though I'm inside the container, I have the same directory as I had outside of the container. I have the same files too. Once again, that might be a little bit of a mystery right now, but I'm going to go ahead and touch, actually, I'm going to put some text into a file.

I just put the text bar into this file foo, and I catted it. Notice that all these files, even though I'm inside the container, are owned by Apptaineruser. That's me. I'm going to exit the container now. If I list the contents, you can see that I just created that file foo and it's owned by me. It's not owned by root or nobody, or some other weird entity, so that I have to chown it to try to get this file back owned by me. This is a really convenient and nice side effect of having you be the same user inside the container as you are outside of the container. That is a really quick overview of what the privilege escalation model and what the UID model looks like with Apptainer.

Integration Over Isolation [20:46]

What about that thing that I just showed you that I said was a mystery? I showed you that I had the same directory inside the container as I have outside, and I was able to view the same files. Well this moves into the second aspect of Apptainer, which differentiates it from other container platforms in the space. The second aspect is intelligent integration with the host operating system or integration over isolation. Apptainer was designed to be used by scientists or others who want to run workflows. It is really basic. If you want to run a workflow, usually you want to do something with data. You usually want to read some data into the container or write some data out. It is an intelligent default to have specific directories bind mounted from the host operating system into the container by default and to make that the intelligent default and to make it so that the user has to opt out of that if they don't want that.

By default, the home directory is bind mounted into the container at runtime, along with the current working directory, /tmp, /var/tmp, and then some others like /proc, /sys, and /dev, which allow the container to do things like access devices on the host system. You can turn this behavior off if you want to. If you use the contain or the containall options, you can say, I don't want all those directories to be bind mounted into the container at runtime. Or you could use the bind option to bind mount different directories into the container at runtime. That is all configurable, but it's just about having an intelligent default.
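The options mentioned here correspond to the `--contain`, `--containall`, and `--bind` flags. The sketch below only assembles and prints command lines; the helper name `build_shell_argv`, the image name `alpine.sif`, and the paths `/data/project` and `/mnt` are hypothetical choices for illustration.

```python
import shlex


def build_shell_argv(image, binds=(), contain=False):
    """Assemble an 'apptainer shell' command with optional bind mounts."""
    argv = ["apptainer", "shell"]
    if contain:
        # Opt out of the default home, /tmp, and similar bind mounts.
        argv.append("--contain")
    for source, destination in binds:
        # Each bind is written as source:destination.
        argv += ["--bind", f"{source}:{destination}"]
    argv.append(image)
    return argv


default_argv = build_shell_argv("alpine.sif")
custom_argv = build_shell_argv(
    "alpine.sif", binds=[("/data/project", "/mnt")], contain=True
)
print(shlex.join(default_argv))
print(shlex.join(custom_argv))
```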

Zane Hamilton:

Can you edit the intelligent default?

Dave Godlove:

Yeah, you can also change the default. You can do so within the configuration files in Apptainer or, if you want to, you can set an environment variable so that you always have the same directories bind mounted. You could even add that to something like your .bashrc so that every time you jump into your environment and start a container, you always have the same files or same directories bind mounted into the container. There are several other ways in which Apptainer intelligently integrates with the host system, and one of those is through namespaces. I glossed over namespaces a little bit earlier in the talk, and I want to talk about them in a little bit more detail.
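The environment variable referred to here is, to my understanding, `APPTAINER_BIND`, which takes a comma-separated list of bind specifications. A minimal sketch of preparing such an environment follows; the helper name `env_with_binds` and the paths are hypothetical, and nothing is launched.

```python
import os


def env_with_binds(bind_specs, base_env=None):
    """Return a copy of the environment with APPTAINER_BIND set."""
    env = dict(os.environ if base_env is None else base_env)
    # Multiple bind specifications are joined with commas.
    env["APPTAINER_BIND"] = ",".join(bind_specs)
    return env


# An empty base environment keeps the example output predictable.
env = env_with_binds(["/data/project:/mnt", "/scratch"], base_env={})
print(env["APPTAINER_BIND"])
```

The shell equivalent would be an export of the same variable in a startup file such as .bashrc.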

Namespaces As A Feature Of The Linux Kernel [23:46]

Namespaces are a feature of the Linux kernel. Essentially, there are some resources, and the Linux kernel can take and partition these resources out into different namespaces, and then it can present those to the user in such a way that the user can only see whatever partition they are in. That is pretty much namespaces in a nutshell. There is a list of all the different resources that the kernel can partition out this way. I already talked a little bit about the mount namespace, so that's how you can get a new file system. That's how you can see your file system as though it were the root file system. There are others, like the network namespace, which allows you to partition off your network from the other network on the host system, the PID namespace, which can give you a new PID table, and so on and so forth.
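On Linux, the namespaces a process belongs to are exposed as symbolic links under /proc/<pid>/ns, and two processes share a namespace when the link targets match. The following is a rough sketch for taking and comparing such snapshots; the helper names `namespace_ids` and `differing_namespaces` are hypothetical.

```python
import os

NAMESPACES = ("mnt", "net", "pid", "user", "uts", "ipc")


def namespace_ids(pid="self"):
    """Map each namespace name to its identifier link, or None if unreadable."""
    ids = {}
    for name in NAMESPACES:
        try:
            ids[name] = os.readlink(f"/proc/{pid}/ns/{name}")
        except OSError:
            ids[name] = None
    return ids


def differing_namespaces(first, second):
    """List the namespaces whose identifiers differ between two snapshots."""
    return sorted(
        name for name in NAMESPACES if first.get(name) != second.get(name)
    )


# Take one snapshot on the host and one inside a container, then compare.
print(namespace_ids())
```

Comparing a host snapshot with one taken inside a container shows which partitions the two actually share.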

By default, most container runtimes enter most or all of these namespaces because they're trying to achieve maximum isolation. This is one of the great things about Apptainer, though: the users that Apptainer is trying to hit are scientific users who want to use Apptainer to run workflows. Because of that, they might want to do things like use the network on the host system without providing some sort of specific configuration, or see the processes that are running inside their container on the host system. The only namespace that Apptainer will enter by default is the mount namespace, and all the rest of them continue to be shared with the host system, unless you specifically say otherwise. Once again, this is all configurable. Let me just show you a quick little demonstration of why that's useful to a scientific user. Many scientists use Python to program in, and many of them use Jupyter Notebooks. I'm going to go ahead and start a container from Docker Hub that uses Jupyter.

I'm in the container. Then if I want to actually run that notebook, the command to do so is in /opt. I'm going to start the notebook. I'm going to start it running on port 8888. Notice I didn't do any port mapping from the container to the host system. I just entered the container just as is. I am going to specify no browser because this container doesn't have a browser in it, so I can't connect to it with a browser inside the container.

If I do that, the Jupyter Notebook server starts, and I can go ahead and just open this link. Now I'm on the host system. I'm outside of the container, and I'm accessing Jupyter Notebook in the container through that 8888 port. I didn't have to configure any port mapping or anything like this. If you're a scientific user who just wants your Jupyter notebook or whatever your application is to work and just wants to be able to connect to it, this is really easy and this is really helpful. You can just use a container that has this in it, and you don't have to provide any special configuration. If you are a developer creating microservices, then this is actually the opposite of what you want. The good news is this is still all configurable, so you can still set up a network namespace, you can still set your port mapping up and do everything that you need to do. It's just once again about having intelligent defaults to help the types of users who are normally going to use Apptainer.
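Because the container shares the host's network namespace by default, a service listening inside it should be reachable from the host on the same port. A small reachability check can be sketched as below; the helper name `port_open` is hypothetical, and port 8888 is the one used in the demo.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Run on the host while the notebook server is up inside the container.
print(port_open("127.0.0.1", 8888))
```

The result depends on whether something is actually listening on that port at the time.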

I'm just showing you this little Python script here. Once again, I'm in that same directory that I've been in this whole time. This is the actual container that I pulled earlier. This is that file that I just created, and then I showed you this little Python script that I have. It is not really a script, it's just a single command. It just says this is executed by Python inside the container. I'm going to be talking more about that later. I just wanted to kind of tease that. I've got that script here.

While we're talking about intelligent integration with the host system, before we get away from that, I want to go back to my namespace manual page. I want to talk about one very special namespace, and that is the user namespace. The user namespace is special because it allows the kernel to map a UID on the host to a different UID within a namespace. That is kind of an interesting and different thing that you can do. Once again, Apptainer does not enter the user namespace by default, but if you wanted to enter the user namespace, you could do so like this.

However, it turns out that most people who want to use the user namespace want to use it for one specific reason, and that is to make themselves root inside the user namespace. You want to map whatever UID you have to zero, and that's usually what you want to do. Apptainer gives you a little shortcut that allows you to do that, the fakeroot option. If I run that, now my UID is zero. I'm root inside the container. I wanted to show you how this works too, because this is a little tricky actually. Okay, so now, if I print my current working directory, this is all a little bit of a fiction, and I wanted to share this with you because it could be confusing at first. If I print my current working directory, supposedly I am in root's home directory. But if I list the files, ooh, that's fishy. I have that demo directory that I showed you earlier, now magically in root's home directory. If I cd into that, I can see that same group of files that I kept showing you over and over again. Is this really root's directory? But notice that they're all owned by root. Now everything's owned by root. I'm going to make another file. I'm going to call it made by root.

I'm going to show you that that actually worked. It's owned by root. Now I'm going to exit, and there's my made by root file, which is actually made by me, Apptaineruser, and that's all handled by the kernel. That's just the user namespace, handled by the kernel. That enables you to do things inside the container which would otherwise be impossible, but it also prevents you from editing files on the host system which you shouldn't be able to edit.
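The UID mapping that the kernel applies in a user namespace is readable from /proc/self/uid_map, where each line gives a starting ID inside the namespace, the corresponding starting ID outside, and a range length. The sketch below translates an inside UID back to the host UID; the helper name `to_host_uid` is hypothetical, the image name is an assumption, and the sample mapping is made up using the UID 1234 mentioned earlier.

```python
import shlex


def to_host_uid(uid_map_text, inside_uid):
    """Translate a UID seen inside a user namespace to the host UID."""
    for line in uid_map_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        inside_start, outside_start, length = (int(field) for field in fields)
        if inside_start <= inside_uid < inside_start + length:
            return outside_start + (inside_uid - inside_start)
    return None


# The two ways of entering a user namespace discussed above (printed only).
userns_argv = ["apptainer", "shell", "--userns", "alpine.sif"]
fakeroot_argv = ["apptainer", "shell", "--fakeroot", "alpine.sif"]
print(shlex.join(userns_argv))
print(shlex.join(fakeroot_argv))

# Made-up mapping: root inside the namespace is UID 1234 on the host.
sample_map = "0 1234 1\n"
print(to_host_uid(sample_map, 0))
```

This is why a file that appears to be owned by root inside the container shows up as owned by the ordinary user on the host.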

Okay, good. This is good. I'm going through this fairly quickly, and so I hope to have some time for questions towards the end.

Zane Hamilton:

We do have one, but I think you're going to cover it anyway, so I haven't had it brought up yet.

Dave Godlove:

Well, what is the question? I want to make sure I'm going to cover it.

Zane Hamilton:

It's actually asking about Checkpoint Restore. Is that supported in Apptainer?

Dave Godlove:

Oh, I'm not going to cover that.

Zane Hamilton:

Actually. Okay, well then I'm glad we brought it up now.

Apptainer Future Education Series [31:54]

Dave Godlove:

That is not part of something that I was going to cover in this demo. I don't know if I'm allowed to say it. So you might have to redact this from the live video. Yeah. But is this on a delay? I think that there's a rumor that we're going to start pretty soon with an Apptainer education series. And if we do that, which is just a rumor at this point in time, I think we're going to be delving a little bit more deeply into specific features like that. It may be that the Checkpoint and Restart workflow, which was recently added to Apptainer, there's a rumor that maybe that might be one of the first topics that we go through in detail. But for now, I mean, you didn't hear that from me. I'm sure you better get this information out there before it's deleted off the internet.

Zane Hamilton:

That can neither be confirmed, nor denied.

Dave Godlove:

But for now, I will say that functionality has recently been added to Apptainer. There is a checkpoint and restart workflow that can be carried out in Apptainer. It's not part of my demo to go through today, but it is pretty cool stuff.

Apptainer And Run Times [33:09]

Okay. Is there anything else before we jump into the final aspect of Apptainer, which differentiates it from others?

Zane Hamilton:

No, not yet.

Dave Godlove:

Okay. This is, in my opinion, the final and the biggest thing that differentiates Apptainer from other container runtimes, and that is SIF, the Singularity Image Format. All right, so to start this part of the demo off, I want to start by building a container. I've been showing you these files over and over again, but I haven't actually shown you what they do or anything about them. Let's look a little bit at some of these. I showed you the pyscript.py file earlier; let's look at this Python definition file.

A definition file is to Singularity what a Dockerfile is to Docker or Podman or whatever, I guess Buildah technically. This is the way in which you instruct Apptainer or Singularity to build your container. In this Apptainer definition file, I am using a bootstrap agent called library. A library is a registry that stores SIF files natively, basically, instead of storing OCI. This is like the SIF version of Docker Hub. You can have different libraries. I have one preconfigured, which I'm going to be using here. You can point to different libraries. I'm going to be pulling this container called godlove/secure/ubuntu-bionic. That is an old version of Ubuntu. I'm going to be using this Fingerprints keyword that I'm just going to ignore for now, and I'll tell you more about that later.

Just forget about that for now. The post section is a scriptlet: once you download that container, the build system is going to pop into the container and execute whatever you have here. I'm just going to do an update, and then I'm going to install Python. Then the runscript: this is that piece of metadata that tells the container what to do when you run it. I didn't run a container earlier, but I'm going to talk a little bit more about that here in a minute. We'll see what that looks like. Basically, whenever you run the container, and run is a command in Apptainer, this is what it does. In this case, it's just going to call Python. With this funny little syntax, whatever the command line is that I give to Apptainer, it's going to go ahead and hand that right on to Python. Let's see what that looks like. Let's go ahead and build it. First I'm going to do apptainer build python.sif from python.def. It says, whoa, you can't do that, you're not root. So, I could become root and do that, but that's not a great thing to do for various reasons. Instead, I'm going to use this nice little fakeroot option that I showed you earlier to pivot into the container and become root in a new user namespace inside the container.
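A definition file along the lines described above might look roughly like the template in this sketch. This is a reconstruction from the spoken description, not the file shown on screen: the exact package commands in the post section are assumptions, the fingerprint is deliberately left as a placeholder because it was not read out, and the helper name `render_definition` is hypothetical. The build command is only printed.

```python
import shlex

# Assumed layout; only the bootstrap agent and image path come from the talk.
DEFINITION_TEMPLATE = """\
Bootstrap: library
From: godlove/secure/ubuntu-bionic
Fingerprints: {fingerprint}

%post
    apt-get update
    apt-get install -y python

%runscript
    python "$@"
"""


def render_definition(fingerprint):
    """Fill in the signing-key fingerprint expected for the base image."""
    return DEFINITION_TEMPLATE.format(fingerprint=fingerprint)


# Build as an unprivileged user by becoming root in a user namespace.
build_argv = ["apptainer", "build", "--fakeroot", "python.sif", "python.def"]

print(render_definition("<KEY FINGERPRINT>"))
print(shlex.join(build_argv))
```

The `"$@"` in the runscript is the shell syntax that forwards every command-line argument given to the container on to Python.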

Zane Hamilton:

We do have one question, if you don't mind, while that is downloading. Does fakeroot cover the user namespace and secure the host root namespace?

Dave Godlove:

Not sure if I understand the question. What fakeroot does is it leverages the user namespace, which is a kernel feature that allows the kernel to map the UID and GID on the host system to a new one in the new namespace. Because the kernel is involved here, the kernel is able to keep straight what the context is. If you try to access files which are really owned by root on the host system, even if you're in the user namespace, you're still not able to access them. You can only access files which exist within the context of the container and not on the host system. Does that answer the question? It still does secure the host. It doesn't allow you to access the root file system on the host, but it allows you to access the root file system within the container and sort of pretend to be root, essentially.

Zane Hamilton:

I think it does, Dave, if it doesn't, if you want to add more context and ask it a little bit deeper maybe, but if that answers your question, great. Thank you for the question. Thanks Dave.

Dave Godlove:

Yep. Cool. Now I've built this container, python.sif, and so I can run it. I didn't show you this before, but I can do this run command, Apptainer run, and if I run it, it's going to do whatever was there in the run script. What was there in the run script was to run Python, basically.

But it's cumbersome to use that entire command. If you've been paying close attention, you might have seen that these SIF files are highlighted green, and that's because they have the executable bit set, so you can actually execute these guys. This is one of the cool things that you can do with SIF. If I just run it like so, it's going to give me Python, and I can pretend now that this SIF file is just an executable sitting on my host system like any other executable, because I told it, remember that awkward syntax, to just pass the command line from Apptainer directly into it. I can also do things like that and they work, which is pretty cool.

Furthermore, I can give it the script that I teased a little bit earlier. I can just pass the script directly to that invocation of my Python SIF, and it's going to go ahead and work just as it should. That is a very powerful thing that you can do just by encapsulating the container in a single file. You can see how you could leverage this to install programs on behalf of users, for instance, and then let your users run the containerized programs as if they were just programs right on the host system. That is something I think that a lot of people actually do.

Zane Hamilton:

That's very cool.

Dave Godlove:

Okay, now there are two more things I want to cover, but are there any questions about that before I go further on?

Zane Hamilton:

Nope, continue on.

How Do You Trust Containers Downloaded From Untrusted Sources [41:02]

Dave Godlove:

Okay, two more things I want to cover, which make the SIF format just really great. There are two questions that we're going to answer with the SIF format. The first of those is, how do you trust and run containers that you've downloaded from potentially untrusted sources? Then the second question is, how do you run containers in untrusted environments? The first question: how do you trust containers that you've downloaded from untrusted sources? You do that by cryptographically signing and verifying your containers. I'm going to show you a little workflow to get you acclimated to that. Instead of explaining this one first, I'm just going to throw it at you, and hopefully you can see what I'm doing and get an idea of what's going on here. I'm going to create a new key with Apptainer, this is part of Apptainer, and enter my name and my email address, my real CIQ email address if you want to talk about Apptainer or whatever. Let's go for it. This is some demo key. Enter a passphrase. I just generated a key pair, and as a convenience that workflow is part of Apptainer, as you can see. It is one of the commands within Apptainer. I'm going to go ahead and list my keys.

I actually have two keys on this system. There's one I created a while ago when I was first putting this together, and then the one that I just created a second ago. I'm going to go ahead and sign, let's say, the Python container that I just created. In actuality, if this was production, I wouldn't sign this container because, well, actually this one would be okay to sign because I started this container from something I trust. I'll show you why later. I'm going to go ahead and sign this Python container. It's going to ask me, which key do you want to use? I'm going to tell it the one that I just created. It's going to ask me for the passphrase. Now it's signed. Now, if I want to verify that the container has actually been signed, I can do so with the verify command, and that's going to use the public key material along with the container itself to tell me basically that this container is a bit-for-bit reproduction of the original one that was signed by the entity that had this fingerprint. And that fingerprint is very important.
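The "bit-for-bit" guarantee can be illustrated with a digest comparison. This sketch covers only the integrity half of signing: a real signature also binds the digest to a private key, which is left out here. SHA-256 and the function names are illustrative assumptions, not a description of what Apptainer does internally.

```python
import hashlib


def digest(image_bytes: bytes) -> str:
    """Return a hex digest that changes if any bit of the input changes."""
    return hashlib.sha256(image_bytes).hexdigest()


def is_unchanged(image_bytes: bytes, recorded_digest: str) -> bool:
    """Compare the current contents against the digest recorded earlier."""
    return digest(image_bytes) == recorded_digest


# Stand-in for the contents of a container image (hypothetical data).
original = b"pretend these bytes are a container image"
recorded = digest(original)

# Flip a single bit to simulate tampering.
tampered = bytearray(original)
tampered[0] ^= 0x01

original_ok = is_unchanged(original, recorded)
tampered_ok = is_unchanged(bytes(tampered), recorded)
```

A single flipped bit is enough to make the comparison fail, which is the property verification relies on.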

That fingerprint is really the way in which you know that this container was signed not just by anybody, but by the same entity that originally signed it, the one that you are interested in knowing about. Okay. Who cares? I just signed and verified this container on this same system. Of course it didn't change, of course it's the same. Well, why is this important? I used to work at the National Institutes of Health in the high performance computing center. One of the things that I did while I was there is I created a whole library of containers that I vouched for. I said, NIH staff can use these containers as starting points and can be sure that there's no malicious code in them. And NIH users can also use these containers and be sure there's no malicious code in them.

Then, I went and I stuck them up on a public library that NIH doesn't control. How can I be sure, and how can anybody be sure, when they download those containers that they haven't been tampered with? That that library hasn't been hacked, and somebody hasn't come in and replaced all those containers with their own containers that want to buy Bitcoin or do whatever it is that they want to do? Let me go ahead and download one of those containers. This is, once again, in my secure collection. It is an Alpine container. I'm going to call this container secure_alpine. I'm downloading this container, and now I've got this container. How the heck do I know that this is an okay container to use? The way that I know is I can verify it.

It says that this thing was signed by David Godlove using a production key a million years ago. This is my Gmail address. This fingerprint is very, very important. I happen to know that this is the fingerprint that I originally used to sign this. Now, I know that this container, even though it's been sitting out in the wild somewhere and I downloaded it, and I don't know who's messed with it since then, is a bit-for-bit exact reproduction of the original one that I made some odd years ago. Now, what else can I do? I can go ahead and re-sign this same container. It is going to ask me, which key do you want to use? I'm going to use the one that I just created again. Oops. I didn't enter the password correctly. Good to know that works. And now if I verify the container again, I've got two signatures on it. You can stack signatures one on top of another like this. This is really cool because now you can create a container and you can say, okay, my software development team signs off on this container. My QA team signs off, my security team signs off, and I've signed off. I'm not going to run this container unless I have all four of those signatures or however many signatures I want.
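The "all four signatures or I don't run it" rule boils down to a set comparison. Here is a minimal sketch, assuming the fingerprints found on a container have already been extracted into a list. The fingerprint strings and function names are hypothetical placeholders, and this is not Apptainer's own policy mechanism.

```python
def missing_signers(required, present):
    """Return the required fingerprints that are not on the container."""
    return set(required) - set(present)


def may_run(required, present):
    """Allow the container only if every required signer has signed."""
    return not missing_signers(required, present)


# Hypothetical placeholder fingerprints, one per team.
REQUIRED = {"DEV-FINGERPRINT", "QA-FINGERPRINT", "SEC-FINGERPRINT", "OWNER-FINGERPRINT"}

fully_signed = ["OWNER-FINGERPRINT", "DEV-FINGERPRINT", "QA-FINGERPRINT", "SEC-FINGERPRINT"]
partly_signed = ["OWNER-FINGERPRINT", "DEV-FINGERPRINT"]

allowed = may_run(REQUIRED, fully_signed)
blocked = may_run(REQUIRED, partly_signed)
still_needed = missing_signers(REQUIRED, partly_signed)
```

Extra signatures beyond the required set do not matter to this check; only a missing one blocks the run.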

I told you earlier that I was going to gloss over this fingerprints command in the Python definition file that I made earlier. Well, that's what that fingerprints command is doing. You can put that fingerprints word in your header of your definition files. Now, what this will do is if it downloads this container and it finds that this container's been tampered with since it was first created, it's going to bomb out and it's not going to do any of the rest of this definition file. It's not going to use this container. You can totally just throw this container away and not use it if it's been tampered with. There's also configuration files on Apptainer that you can use to blacklist or whitelist, which whitelisting is probably more useful in this particular context. But you can basically say, I'm not going to run any containers on this entire system unless they've been signed by the following 10 entities. If I know that those fingerprints are all there, everybody's signed off, all the teams are good, then I'll run those containers. It's a pretty powerful thing you can do. And so that's how you answer the question of how do you trust containers downloaded from potentially untrusted places.

How To Run Containers In Untrusted Environments [48:27]

Now, how do you run containers in untrusted environments? And the answer to that question is you encrypt your containers so that nobody can look at them and see what's running on them. I'm going to show you that workflow really quickly.

For this, you actually do have to be root. This is something that we're working on changing right now, but for this currently, you still have to be root. Let me show you these files again. They are in this mysterious PEM directory that I've been showing you over and over again, but now I'm going to actually use it. What that has is a couple of RSA keys, a public and a private RSA key, which are stored in the PEM format, and I'm going to use those. You can actually just use a password if you want, but it's not very secure. I'm going to use those RSA keys to encrypt and then decrypt my container. I'm supplying a PEM path. I'm going to use, let's see, the public key material to encrypt my container, and then I'll use the private key material to decrypt it later. I'm going to call this enc.sif and I'm going to build it from the alpine.sif. That's another thing that's pretty cool that you can do. Since your containers are just files, you can actually build from the files which are already on your system. When you do that, you're essentially just converting the container from one format to another.

It's going to take a few seconds to do, even though it's a very small file system, because it has to actually encrypt the file system. Now we're encrypted, and so now I'm going to do Apptainer, let's do enc.sif, and I'm going to run true. And it's going to say, whoa, you can't do that because that's encrypted. Okay, so let me supply the PEM path.

It's going to be the private key material because it's the secret. It's actually going to let me run it. I do that. It ran silently because the command that I ran was just true. Now, just to make sure that that actually did what it was supposed to do, I can echo the environment variable which tells me what the exit code of the last process was, and it was zero. It worked out and did what it was supposed to do. That answers the question of how you can run a container in a potentially untrusted environment. It's worth noting that, obviously, this is just a file, so when you encrypt it the file system is encrypted sitting on disk. It's also obviously encrypted in transit, because if you move it from one place to another, you're just moving a file from one place to another. There's no need to decrypt it. But the file itself is actually still encrypted even when you run it, which is pretty cool. There's no need to decrypt it on disk and then, you know, pop into the file system or whatever, where somebody now can go look on disk and see what the file system is. It's all decrypted in memory. That's a pretty cool feature.
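The exit-code check from the demo, where the shell reports the status of the last process, can be reproduced from Python. This sketch assumes a POSIX system where the true and false utilities are on the PATH; it does not involve Apptainer, and exit_status is a hypothetical helper name.

```python
import subprocess


def exit_status(command):
    """Run a command and return its exit status (0 means success)."""
    return subprocess.run(command).returncode


# "true" does nothing and succeeds; "false" does nothing and fails.
status_true = exit_status(["true"])
status_false = exit_status(["false"])
```

A command that prints nothing gives no visible sign that it ran, so the exit status is the only evidence that the encrypted container actually started and finished cleanly.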

This was just a really high-level teaser of the look and feel and some of the key features within Apptainer. Once again, there's a rumor that we might be going through some of the features of Apptainer in more depth in some of the future webinars, but I can't confirm or deny that. There are a few minutes left. If anybody has any more questions, I'll do my best to try to answer them.

Zane Hamilton:

We don't have any at the moment. I think we can wait a minute. Maybe I missed it when you said it, Dave, but you can have a container signed and encrypted, I think you said it. 

Dave Godlove:

Absolutely. Yeah. I've just encrypted this enc.sif, so let me just go ahead and sign that as well. Now, I've got a signed and encrypted container. Cool.

Zane Hamilton:

Very cool.

Encryption Versus Cryptographically Signing [53:09]

Dave Godlove:

It's worth noting too that there's sometimes some confusion about encryption versus cryptographically signing. Sometimes people sort of conflate those two. I think that a good way to tell those two apart is to think about them as doing opposite things. Both of these are asymmetric. When you sign your container, you're signing it with your private secret, which is private to you. Then, when you verify it, you use your public key material, which anybody can have. That means that anybody can verify your container, because all they need is the public key material to verify it. Encryption works in the opposite direction. When you encrypt your container, you do that with the public key material. But when you decrypt it, you do it with the private key, your secret that only you have, because only you want to be able to decrypt the container afterwards. They are similar technologies, but they do basically the opposite things from each other. That's the way I keep it straight.
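The "opposite directions" idea can be shown with textbook RSA on deliberately tiny numbers. This is a teaching sketch only: the primes are far too small to be secure, there is no padding, and it makes no claim about how Apptainer implements signing or encryption. All names and values are illustrative.

```python
import hashlib

# Textbook RSA with tiny primes (insecure, for illustration only).
P, Q = 61, 53
N = P * Q   # public modulus
E = 17      # public exponent: anybody may have this
D = 2753    # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def encrypt(message: int) -> int:
    """Encrypt with the PUBLIC key; message must be smaller than N."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Decrypt with the PRIVATE key."""
    return pow(ciphertext, D, N)


def _digest(data: bytes) -> int:
    # Reduce a SHA-256 digest into the toy key's range.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Sign with the PRIVATE key."""
    return pow(_digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Verify with the PUBLIC key."""
    return pow(signature, E, N) == _digest(data)


secret = 65
round_trip = decrypt(encrypt(secret))

payload = b"container contents"
signature = sign(payload)
signature_ok = verify(payload, signature)
```

Encryption uses the public exponent first and the private one second; signing uses them in the reverse order. That ordering is the whole difference the speaker is describing.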

Zane Hamilton:

No, thank you very much. Thank you. I don't see that we have any more questions. I will look again. Nope.

Dave Godlove:

Real quick. If we don't have any more questions, I don't know if I've ever completed a demo without running this container. I just want to make sure I run it once. There we go. 

Zane Hamilton:

There you go.

Dave Godlove:

Perfect. 

Zane Hamilton:

It's great. Dave, I really appreciate you going on doing this. That's fantastic. Guys, we really appreciate you joining us today and we look forward to talking to you again at some point in the future if we have a series that is rumored to be or not to be. Really appreciate it guys. Go like and subscribe and we will see you again next week. Thank you.

Dave Godlove:

Thanks.