Power to the Users! Unprivileged Operations in Apptainer
From its inception as the Singularity project, Apptainer has always been developed with the philosophy of giving users more power. Advances in the Linux kernel have allowed Apptainer developers to take this philosophy to the next level, even permitting unprivileged and relocatable installations. In today’s webinar we will discuss the ways that Apptainer gives users the power to get their work done with as little friction as possible, demo some of the new advances in unprivileged operations, and discuss the security implications of the unprivileged model.
Webinar Synopsis:
- Apptainer’s Ability to Empower Users
- Apptainer Versus Other Platforms
- Apptainer Applications In Labs
- Privileged And Unprivileged Users
- Unprivileged Build Features With Fakeroot
- Technical Difficulties During Demo
- What The Script Does
- The Future Of Apptainer
- Apptainer Users Opportunities To Test Code
- What Does Not Work With Unprivileged Installation
- Apptainer And Running Rootless
- Is Apptainer A Hardcoded Central Path
- Using Shared Files With Apptainer
- Will It Run On A Raspberry Pi Instance?
- Apptainer Install Of Dependencies
- Working Demo Of Apptainer Unprivileged
- Examples Of Apptainer Unprivileged Being Useful In Current Projects
Speakers:
- Zane Hamilton, Vice President of Sales Engineering, CIQ
- Forrest Burt, HPC Systems Engineer, CIQ
- Brian Phan, Solutions Architect, CIQ
- Dave Dykstra, Computer Professional, Fermilab
- Dave Godlove, Solutions Architect, CIQ
Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.
Full Webinar Transcript:
Zane Hamilton:
Good morning, good afternoon, good evening, wherever you are. Thank you for joining us for another CIQ webinar. My name is Zane Hamilton. I'm the Vice President of Solutions Engineering here at CIQ. At CIQ, we're focused on powering the next generation of software infrastructure, leveraging the capabilities of cloud, hyperscale and HPC. From research to the enterprise, our customers rely on us for the ultimate Rocky Linux, Warewulf, and Apptainer support escalation. We provide deep development capabilities and solutions, all delivered in the collaborative spirit of open source. This week we're going to be talking more about Apptainer, continuing that Apptainer and containerization discussion that we've been having over the last several months. But this one is going to be a little bit different because there are some new exciting things that are taking place in the community. We're going to talk about power to the users, the unprivileged operations in Apptainer. We can go ahead and bring in our panel. Hello everyone. It's good to see everybody again. I'll go around and do a quick round of introductions. Forrest, you're up first because you're right next to me today.
Forrest Burt:
Hey everyone. Good morning everyone. It is great to be on the webinar. My name is Forrest Burt. I'm an HPC systems engineer here at CIQ. I am very happy to be on the webinar and excited to see the presentation that we have here today.
Zane Hamilton:
Thank you Forrest. Brian.
Brian Phan:
Hi, everyone. Brian Phan here. I'm a solutions architect here at CIQ. My background is in HPC administration and architecture, and some workflow expertise in the areas of CFD and genomics. Great to be back on the webinar.
Zane Hamilton:
Thank you, Brian. Dr. Dave, welcome. It's good to see you.
Dave Dykstra:
Hello. I'm not a CIQ person. I work at Fermilab, Fermi National Accelerator Laboratory.
Zane Hamilton:
You were involved with Singularity Apptainer, and I think it's important to tell people how.
Dave Dykstra:
Yeah. I'm on the technical steering committee and also did a lot of the development of some of the new features that we're going to be talking about today.
Zane Hamilton:
That's fantastic. Thank you Dave. And our other Dave. We always have multiple Daves, as you know if you've watched any of these in the past. We're going to the last Dave here, Dr. Godlove.
Dave Godlove:
Yeah. I'm Dave Godlove. I am a solutions architect here at CIQ, or at CIQ. CIQ is not here, but I'm a solutions architect at CIQ. I have a long history with Singularity and Apptainer as well. I'm also really happy to be demoing some of the features that, primarily, Dr. Dave next to me has worked really hard on, championed, and developed.
Zane Hamilton:
Thank you. I'm going to start off with you Mr. Godlove go ahead and then go back. For those of you who are maybe new to Singularity Apptainer, let's talk about how Apptainer has really been giving users more power to do things over the past seven years.
Apptainer's Ability To Empower Users [8:21]
Dave Godlove:
Cool. Yeah, this is a topic that I really, really like talking about because it's something that I've been personally involved in quite a bit. It's really the legacy of Singularity and Apptainer as a project. I would say that Apptainer really was started with the idea of giving more power to users. If you think about just the technology itself, the technology allows a user on a high performance computing system, basically to install anything they want. Thank you, Rose. Basically to install anything they want on an HPC system, basically to install it in a container and then grab it and pull it over to the HPC system and run anything they want. This is great because not only can they just install anything they want, I mean, I guess maybe they would've been able to do so anyway.
It allows you to install things using tools like yum and dnf and apt-get, just package managers. A lot of times when there's new scientific software that you want to run, it'll be hosted on GitHub or something, you'll go to GitHub and there will be some instructions for installing it. The first instructions will be to install all these dependencies, and it'll have a list of apt-get or yum commands, assuming that you're on some particular distribution. Then it'll say, I don't know, pip install the software after you install all these dependencies, or something like that. You'd be like, oh, well that's great, but I can't do any of that because I'm not a system administrator in my HPC environment. It used to be that you would have to contact a system administrator and say, do we have any of these dependencies?
How can I talk to you about maybe trying to install them in a non-standard location? Now you just say, oh, I'll just spin up Apptainer and it doesn't matter that we're running RHEL on the cluster. I'll just spin up an Ubuntu container and do all this apt-get stuff that you want me to do, install all these dependencies, pull in the program, drag it over to the system, and boom, I'm done. It's all about giving users power and handing them the ability to do stuff. I'll talk a little bit more too. I'm excited about this, so I'm going to talk a bunch, but I'll try to cut it short. I will say that there have been lots of times in Apptainer's development too, where the developers have gotten together and said, we could let users do this, but we're not sure if we want to, because users might end up shooting themselves in the foot.
Those discussions almost invariably have come down to let's go ahead and give the users the power to do whatever feature it is that we're thinking about letting them do. Let's go ahead and trust the users and let's go ahead and give them the ability to do whatever. This extends to the ability, I'm going to call it, to be weird, to do things that are not necessarily best practices or what we expect or how we think users are going to use the software, but actually to get into the internals, into the guts of the container and change things in there. To do things, which allow their containers to work, even if they're not things that we necessarily expected users to do. This is really what Apptainer has always been about, is giving more power to users I think.
Zane Hamilton:
That's great. I know Forrest and Brian, you guys have come from outside environments where maybe you were using other containerized platforms. How has this given you guys a little bit more power or the ability to do things that you couldn't do with the other systems or other platforms, I should say? Forrest, let's start with you.
Apptainer Versus Other Platforms [12:04]
Forrest Burt:
As far as containers go, my background has primarily been mostly with the Singularity Apptainer Project. When I was working at my previous position as a system administrator in the academic sphere of HPC, I needed a solution that allowed me to install software in a more flexible way and in a more standardized way. I turned to looking into containers as a solution for that. I found out that some of the other containerization solutions out there were unsuitable for HPC. Singularity, at the point at which I started using it, was basically the de facto standard. While I've tinkered with other solutions around, my background was primarily with Singularity Apptainer, because I've mostly been in HPC, dealing with HPC focused containerization problems.
Zane Hamilton:
Thank you, Forrest. Brian?
Brian Phan:
In a previous role, Apptainer has enabled me to deliver software solutions very quickly to a customer. We had this inbound customer who needed to get a workflow running urgently. Usually, we would have to wait for the entire release cycle before we deploy our updates. Since they needed it really urgently, I was able to use Apptainer to quickly whip up a container for them. They got their machine learning workflow running the next day. They met their deadline and they were super happy with the solution I was able to deliver them.
Zane Hamilton:
Great. Thank you. Dr. Dave, I know you've been, I mean, for me now, you've been around labs for a while and you've seen a lot of different things take place over time and moving towards that containerization, what has this allowed you to do?
Apptainer Applications In Labs [13:49]
Dave Dykstra:
I'm the support person for the high energy physics community. We do some HPC, but we also do what we call high throughput computing. Instead of big computers, we run on just lots and lots of them, hundreds of thousands of computers all over. Our community got involved in using Singularity early on because we wanted to be able to start jobs at all the sites first from the experiment that's running. To allocate a space first, they submit a job. Then it goes and runs the user's code. We wanted one unprivileged user to start a container from another unprivileged user, and to keep them isolated from each other.
They could do it even on the same machine, running jobs from different users. We were very interested in the isolation features and we were very much interested in unprivileged. We've been using unprivileged for quite a few years because our code distribution is not through SIF files. We use the CernVM File System to distribute it already unpacked. So we could use the unprivileged features earlier, as they became available in Singularity, and didn't need unprivileged mounting of SIF files. That's my background.
Zane Hamilton:
That's great. Thank you, Dave. We talk about unprivileged users. I know that's something we really want to talk about, but I think there's been a lot that has come out in at least the last two releases, which has really been driving toward what's going on now within the Apptainer community. Let's talk about what that means and where we're adding features and what those are. I'll leave that open. Anybody want to take it? I know both Daves probably have a lot to say on this topic. Mr. Godlove, I'll let you go first.
Privileged and Unprivileged Users [15:57]
Dave Godlove:
Maybe it'll be good to recapitulate some of the history, and some of the need for privilege and so on. Just to level set a little bit to answer this question and to get everybody. For a very long time now, Apptainer has required some elevated privilege to run. The way in which that works is that your administrator will install Apptainer, which will have what's called a SetUID bit installed with it. It used to be, I think, correct me if I'm wrong, but I think a really long time ago, the entire executable had a SetUID bit.
What that means is it sets the user identification to somebody else. When a binary that is owned by somebody else has that bit set, you run it as that user, even if you're not that user. That becomes handy because the binary can be owned by root, and then you can allow an unprivileged user to run it, and they run as though they were root. Well, I think a very, very long time ago, and I could be wrong, the entire binary was SetUID. Pretty quickly into that development, we changed that. There was one binary which was SetUID, and then Apptainer would go through and elevate privileges only for things that needed to be elevated during the startup of the container. So what do you need elevated privileges for?
Well, for instance, you need to mount a file system to your file system. That's usually a privileged operation. You can't really just do that on your own. We can't just give root privileges to everybody who wants to use Apptainer. So in that one specific circumstance, very carefully, we elevate privileges just long enough to mount the file system and then we drop privileges. SetUID binaries are notoriously difficult to write properly. We've always had very, very smart people working on the Singularity and Apptainer project, and we've had very security-minded people working on it. Because of that, we've been very careful and we've had a pretty good track record, not a perfect track record, but a pretty good one, as far as making sure that we don't have vulnerabilities in this system. But the fact remains it's a really difficult system to administer.
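To make the SetUID bit concrete, here is a minimal Python sketch; it was not shown in the webinar. The helper names `has_setuid_bit` and `describe` are hypothetical, and `/usr/bin/passwd` is only an example of a program that is commonly setuid-root, so it may differ or be absent on a given system.

```python
import os
import stat


def has_setuid_bit(mode: int) -> bool:
    """Return True if a st_mode value has the setuid bit set."""
    return bool(mode & stat.S_ISUID)


def describe(path: str) -> str:
    """Summarize who owns a file and whether it is setuid."""
    st = os.stat(path)
    kind = "setuid" if has_setuid_bit(st.st_mode) else "not setuid"
    return f"{path}: owner uid {st.st_uid}, {kind}"


if __name__ == "__main__":
    # Example path only (an assumption): a classic setuid-root program on
    # many distributions. It is skipped if it does not exist here.
    example = "/usr/bin/passwd"
    if os.path.exists(example):
        print(describe(example))
```

A file that reports owner uid 0 and "setuid" runs with root's effective user ID no matter who launches it, which is why such binaries have to be written so carefully.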
Along come user namespaces in the Linux kernel several years ago. User namespaces have actually been around in Linux for a while. What a user namespace allows you to do is enter a new walled-off area, a new namespace, where the kernel maps your UID on the host system to a different UID inside that namespace. So we can use that to map whatever UID you are to zero inside the user namespace. That gives you the ability to have privilege inside that namespace even though you're not privileged on the host system. Several years ago, we started to play around with using this to give you the ability to be root inside the container.
Then user namespaces started to become adopted more and more. They started to become more standard in Linux kernels, to the point where now they come standard and are on by default. Now it's quite easy to use that to be root inside the container. This also allows you to build things inside the container that would normally require root privileges. I think the clincher that now allows us to do everything that we want to do is being able to mount. Dave, you can jump in and correct me if I get this wrong, but there's now new technology using FUSE, which is an unprivileged, general way of implementing lots of different file systems. Using FUSE, there's now a squashfuse file system, which allows an unprivileged user to take a SquashFS file system and mount it onto your file system. That's cool because, guess what, Apptainer containers are saved as SquashFS files. Now we can leverage that technology to mount the file system to your file system unprivileged. For about 95% of the operations that you would normally want to carry out with Apptainer, that pretty much removes any need to have elevated privileges. That is where we came from to where we're at today.
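The UID mapping described here can be sketched in a few lines of Python. This is an illustration, not Apptainer's code: it assumes the three-column format the kernel exposes in `/proc/<pid>/uid_map` (ID inside the namespace, ID outside, length) and the usual overflow ID of 65534 ("nobody") for anything unmapped. The function names are hypothetical.

```python
OVERFLOW_UID = 65534  # "nobody": how unmapped IDs typically appear


def parse_id_map(text):
    """Parse uid_map-style text into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(f) for f in fields)
            entries.append((inside, outside, count))
    return entries


def host_to_namespace(host_uid, id_map):
    """Translate a host UID to the UID seen inside the namespace."""
    for inside, outside, count in id_map:
        if outside <= host_uid < outside + count:
            return inside + (host_uid - outside)
    return OVERFLOW_UID


if __name__ == "__main__":
    # Hypothetical mapping: host UID 1000 becomes root (0) in the namespace.
    id_map = parse_id_map("0 1000 1\n")
    print(host_to_namespace(1000, id_map))  # the mapped user
    print(host_to_namespace(0, id_map))     # real root is not mapped
```

With a single-entry map like this one, only two identities are visible inside the namespace: the mapped user and "nobody".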
Dave Dykstra:
I'll add to that. That is the most important thing. Also, at the same time, we added the ability to mount ext2/3/4 file systems with fuse2fs, which is used for overlays. Sometimes we use those for overlays when we want to make the container appear writable. Also fuse-overlayfs, when we want to have different layers. It will use that, or on even newer kernels you can actually run the kernel OverlayFS unprivileged inside a user namespace. That's an even more recent feature than the mounting of FUSE. So those things. Then of course, the unprivileged build features with fakeroot that were added in this release, which you were probably going to talk some more about.
Zane Hamilton:
Yeah, I was just about to ask that, David, I mean, if you want to continue talking about that and to tell us why that is important. How does that help a user, an end user?
Unprivileged Build Features With Fakeroot [22:17]
Dave Dykstra:
Dave did mention that we could do some building even before the recent Apptainer releases. It was possible to do building. I'll first back up a little bit. When you're building a new container from scratch using a definition file, you're doing these privileged operations of apt-get or yum. The build has to appear privileged inside. Not only that, it normally wants to have multiple user IDs. There are root's files, but there are also other things like bin or other per-package user IDs. It wants to have multiple user IDs, and the totally unprivileged user namespace by itself really only gives you one user ID you're allowed to map. We have had for quite some time this --fakeroot option in Singularity and Apptainer.
That would allow you to do a build as fakeroot. It still requires system administrator intervention because it requires setting up this mapping. It's not a totally unprivileged operation. It requires the system administrator to set up files called /etc/subuid and /etc/subgid to say which user IDs you are allowed to map into in your namespace. That actually uses privilege escalation programs to help, newuidmap and newgidmap. That's what podman and other rootless container systems are using. Singularity had support for that. You had to add the option when you did the build to say --fakeroot, and you also had to make sure the system administrator set it up.
What's new in Apptainer 1.1 is that if those are set up, it'll just use them by default. You don't even have to say --fakeroot when you do a build. But if they're not set up, it instead runs a fakeroot command. That is different from the --fakeroot option: it's a fakeroot command, which makes it appear as if you have multiple user IDs, but you really don't. It traps the system calls. When something tries to do a chown to another user ID, it says, okay, I changed it, even though it really didn't on the host. It also keeps track in memory of which files it has changed, so if you later ask, who's this file owned by? It remembers that other user ID. Through that we're able to do builds as well, even without any system administrator setup at all. This can be very useful, especially on HPC systems. People want to use the software locally optimized for that HPC system, so being able to compile right on that system is a really big help.
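As a rough illustration of what the administrator sets up, the sketch below parses text in the `name:start:count` format used by `/etc/subuid` and `/etc/subgid`. It is not taken from Apptainer; the function names and the user `alice` are hypothetical.

```python
def parse_subid(text):
    """Parse subuid/subgid-style text into {name: [(start, count), ...]}."""
    ranges = {}
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, start, count = fields
        ranges.setdefault(name, []).append((int(start), int(count)))
    return ranges


def has_subordinate_ids(user, ranges):
    """True if the administrator has delegated any ID range to `user`."""
    return bool(ranges.get(user))


if __name__ == "__main__":
    # Hypothetical file content: alice may map 65536 IDs from 100000 up.
    sample = "alice:100000:65536\n"
    ranges = parse_subid(sample)
    print(ranges)
    print(has_subordinate_ids("alice", ranges), has_subordinate_ids("bob", ranges))
```

A user with no entry, like `bob` here, is the case where a namespace-based fakeroot build cannot map multiple IDs without an administrator's help.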
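The bookkeeping idea behind the fakeroot command can be modeled with a toy Python class. This is only a sketch of the concept described above: the real fakeroot intercepts calls made by other programs, which this model does not attempt. The class name, method names, and the choice to report untouched files as owned by UID 0 are assumptions of the sketch.

```python
class FakeOwnership:
    """Toy model: remember pretended file owners without touching the host."""

    def __init__(self, real_uid):
        self.real_uid = real_uid  # the one UID that actually owns everything
        self._pretended = {}      # path -> UID that a chown "set"

    def chown(self, path, uid):
        # Report success and remember the request; nothing changes on disk.
        self._pretended[path] = uid

    def stat_uid(self, path):
        # Answer from memory; untouched files look root-owned (assumption).
        return self._pretended.get(path, 0)

    def real_owner(self, path):
        # On the host, every file still belongs to the unprivileged user.
        return self.real_uid


if __name__ == "__main__":
    fake = FakeOwnership(real_uid=1000)
    fake.chown("/var/www/index.html", 33)    # hypothetical path and UID
    print(fake.stat_uid("/var/www/index.html"))
    print(fake.real_owner("/var/www/index.html"))
```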
Zane Hamilton:
That's great. Thank you Dave. Mr. Godlove, I know earlier in the week we were on a call and you started talking about the ability to install Apptainer without having to have root. That was news to me. I think that's very cool. Why don't you tell me a little bit about that?
Technical Difficulties During Demo [25:37]
Dave Godlove:
Yeah. That is one of the new things that Dr. Dave over here has introduced. I'd actually just like to show that to you.
Zane Hamilton:
Absolutely.
Dave Godlove:
Okay. I'd like to share my screen. Okay. And I'm going to share, it helps if I speak while I do it.
Zane Hamilton:
Conscious train of thought.
Dave Godlove:
Yeah, sorry, I'm not seeing the window that I want to share. I should have tried to do this before. All right. Let me see here.
Zane Hamilton:
It's a live demo. It's what happens.
Dave Godlove:
Yeah. All right. Let's see here. Maybe we could talk a little bit about something else while I try to figure out why that's not working.
What The Script Does [26:34]
Dave Dykstra:
Alright, I'll talk a little bit about what the script does. It actually installs the program itself without having to compile it from source. You could always compile it from source, but this takes existing binaries that a system administrator would install. But instead of requiring a system administrator to install it with yum, it just reads the packages from the servers where they come from and unpacks all the files into your own local directory, wherever you want them. Also, Apptainer has been updated to be able to run that way, to be able to detect where it's running from and then relocate itself that way. Wherever it is, as long as it's not a SetUID installation, where relocating is not allowed because that can be dangerous. A SetUID installation does not relocate, but if the SetUID starter isn't there, then it will just use the files wherever they are.
Zane Hamilton:
Dave, is this like Python virtual environments where you, at some point as Apptainer moves on, you could actually have a user have a specific version they want to be running and have multiple users on a system using different versions installed locally for them?
Dave Dykstra:
Well, they certainly can do that. I'm not sure exactly how Python virtual environments work, but yes they can. You can install however many different versions you want in your own directory there. Something else is being shared?
Zane Hamilton:
Nope. Not yet.
Dave Dykstra:
We're not seeing, not seeing anything being shared yet. Dave,
Zane Hamilton:
There we go.
Dave Dykstra:
There is something coming up.
Zane Hamilton:
Floating.
Dave Godlove:
It looks like it's just a gray screen. We're having a little technical difficulty right now. It might be because I have got multiple screens running. Let me see if I can fix that.
Zane Hamilton:
Sure. Let's talk to Brian and Forrest. Have you guys had a chance to play with this yet, from an unprivileged perspective?
Forrest Burt:
I haven't had a chance to play directly with the features that Dave is going to demo, but I've definitely been enjoying just being able to type apptainer build with a .sif and a .def file and having it build without --fakeroot or anything like that. I'm definitely taking advantage of that type of capability.
Brian Phan:
Awesome. I have not been able to play with these features yet, but I am really looking forward to Dave's demo.
Dave Dykstra:
If he can get it to work. Me too.
Zane Hamilton:
If not, we're going to have him read it line by line as he goes, maybe.
Dave Godlove:
I'm really sorry guys. I have the option either to share Windows and it's not showing me the window that I want to share, or I have the option to share screens and it's just showing me black screens. I am having a hard time trying to figure this out. I apologize. Let me continue to work on it.
Zane Hamilton:
Drop out and come back. Maybe you can do that too. While we're waiting, Dr. Dave, what other things are you currently working on in Apptainer? What's coming?
The Future Of Apptainer [29:59]
Dave Dykstra:
Yeah, the main thing that we're still trying to do is to support encrypted file systems unprivileged. That's the one significant feature that still requires SetUID root. I didn't get around to that yet. We're trying to work with an existing FUSE file system that does encryption. We're getting someone else to work on that. I'm just consulting with them on it. That's the main thing that's still missing without SetUID in the current unprivileged release.
Zane Hamilton:
Very nice. I guess the other question I have for you, Dave, since we have some time: with Apptainer now being in the Linux Foundation, have you found more people to help test and write code? How has that been going?
Apptainer Users Opportunities To Test Code [31:05]
Dave Dykstra:
Well, that has still been an issue that we certainly could use more development help. The Linux Foundation doesn't really bring development resources. They bring legal help to help you do things and advice on how to keep things open. That type of thing. We could use some more help. We do have some more people now who have recently started to step up to do volunteer work for some more pieces. So that's good.
Zane Hamilton:
Well, that's great. I guess if we could go ahead and post the links to those communities along with this so we can advertise and start asking for some help. Are there specific areas that you need help in? Is testing always going to be one that seems to be a place where people could contribute? What else do you need?
Dave Dykstra:
It's really whatever features that people find important. What they would be interested in. Certainly, the unprivileged was my big thing. It is why I did a big push. I have dedicated a lot of time paid by my employer because I felt that this was a really important feature. I don't expect to be quite so active anymore. Because, I do have like four other jobs I'm also trying to do.
Zane Hamilton:
Of course.
Dave Dykstra:
Whatever people feel is important for their use cases, please jump in and help.
Zane Hamilton:
I think going back and looking at the, oh, somebody's got a question. Good. Is there anything that does not work with an unprivileged installation?
What Does Not Work With Unprivileged Installation [32:54]
Dave Dykstra:
Yeah, and that's the decrypting of the file systems. That's the one that does require root. There are a few other minor things that were hardly ever used. Things like going with a different user ID inside. I think that might be available soon. I forgot how that ended up, but mostly it's the decrypting of file systems on the fly.
Zane Hamilton:
Thank you Dave. Thank you Sylvie.
Dave Dykstra:
There's a slight difference: if you start up a SIF file and look with ls -l /, you'll see a different set of user IDs. When Apptainer runs in SetUID mode, you see all the user IDs. They get mapped straight through. You would see something owned by root. When you're running in unprivileged mode, it is owned by you. The files that were owned by root look like they're owned by you inside. A lot of other files are just owned by nobody, because you have two IDs. There's you and there's nobody. That's what you get with unprivileged user namespaces, unless you're using fakeroot. If you're using fakeroot, then you get them all again. That's a difference. In general, it doesn't make much difference. Sometimes programs are very picky and then it does make a difference. That's another thing, but in general, hardly anyone ever notices that.
Zane Hamilton:
Thank you. Dave, How's it going?
Dave Godlove:
Still not great. I'm going to try to move everything to my first screen and then just share the entire screen and see if I can do that.
Zane Hamilton:
Okay. It's one of the other questions that I have that maybe people are wondering about. If you are used to some of the other container platforms, I know we've talked about podman being able to run rootless to some degree. Why is that important for systems administrators and really for HPC, why is it important to not have to have root other than you don't have to go get somebody to install something, why else is that an important thing?
Apptainer And Running Rootless [35:28]
Dave Dykstra:
Right. Basically we're talking about security considerations here. I noticed that it was in the abstract about today's talk. We haven't got to it yet, so I'll talk about it while Dave is busy. As Dave said, SetUID root is notoriously difficult to secure. We've tried to do a good job of it. There is in fact one known serious risk that's always been the case with Singularity and Apptainer in SetUID mode, which is allowing the user write access to the container. The container is basically a file system.
It has a file system inside it. This gets mounted directly in the kernel. The kernel developers have said this is a risk, because allowing users write access to the underlying bits while the kernel is interpreting them is a risk. The kernel's not designed to protect against that case. Let's say the user is twiddling with those bits and then right in the middle, at the same time, the kernel is interpreting them. They say the kernel file system drivers are not designed to support that. So, it's always been a risk. No one has published a known attack using this, but it's always been a risk that we've lived with because it's been so convenient to use.
When you run unprivileged, the file system drivers are not running in the kernel at all. This is the reason why the kernel owners have only allowed FUSE file systems. They're worried that any other file system mounted inside this unprivileged user namespace, where the user is running as a pseudo root and has access to the underlying bits, is too much of a risk. They don't allow it. That's why they only allow FUSE. It's the known risk that we've been living with all these years with Singularity's and Apptainer's SetUID mode.
So on the other hand, there are trade-offs. When you are using unprivileged user namespaces, that exposes the system administrator to the fact that anytime the kernel comes out with a CVE, something might be exploitable through unprivileged user namespaces, which means they need to upgrade their kernel and reboot. It can often be a big pain for system administrators. There is also a risk that some of these holes are undiscovered. Maybe people with a lot of resources have discovered these other holes and they haven't given them out to the kernel developers. There is that disadvantage. This is what we have put in our advice: it turns out that almost every CVE for the last couple of years related to being exploitable with unprivileged user namespaces has been in combination with network namespaces. That exposes, I guess, another whole big piece of the kernel code, the network handling.
We recommend turning off network namespaces if you can live with it. It means you don't mind sharing the host network with all the containers, which is what Apptainer does by default. It's not what Docker and podman do by default. They by default give you a separate network namespace. You can give them an option to say use the host networking. Also, there are a few systemd services, just a couple in EL7 and more in EL8, that do use network namespaces by default, but we give instructions on how to turn those off so that these services aren't using a network namespace, if you want to do that. It is the way that we in the high energy physics community have been running Singularity unprivileged for the past couple of years, not using SIF files. We didn't need squashfuse. We've been telling people to enable unprivileged user namespaces. Dave mentioned that they're enabled by default, and that's true in EL8 and later, and I think most of (inaudible), but not in EL7 yet. What we say is: those on EL7, enable your user namespaces, and everybody, disable network namespaces if you can. That way you're going to be exposed to a lot fewer risks in the kernel.
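One way to check how a host is configured is to read the per-namespace limits that Linux exposes under `/proc/sys/user`. The Python sketch below was not part of the webinar. It assumes the `max_user_namespaces` and `max_net_namespaces` entries found on reasonably recent kernels, where a limit of 0 means no namespace of that kind can be created; the function names are hypothetical.

```python
from pathlib import Path


def read_limit(name, base="/proc/sys/user"):
    """Return the integer limit stored in base/name, or None if unreadable."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def interpret_limits(max_user, max_net):
    """Turn the two raw limits into the two questions raised above."""
    return {
        "user_namespaces_enabled": max_user is not None and max_user > 0,
        "network_namespaces_disabled": max_net == 0,
    }


if __name__ == "__main__":
    report = interpret_limits(
        read_limit("max_user_namespaces"),
        read_limit("max_net_namespaces"),
    )
    for question, answer in report.items():
        print(f"{question}: {answer}")
```

Following the advice given here, a site would want both answers to be true: unprivileged user namespaces available, network namespaces turned off.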
Zane Hamilton:
Thank you for that, Dave. We do have some questions coming in, actually. If we can throw up Lev's question first: Back in November, I tried to relocate an unpacked Apptainer RPM for some operations. It still looks for a hardcoded central path.
Is Apptainer A Hardcoded Central Path [41:22]
Dave Dykstra:
1.1.4 changed that. If you were on the list, that was, I think, since November. We can look it up, but you probably weren't yet using 1.1.4. At the same time we introduced the install-unprivileged.sh script. It had more relocatability. If there are still any hard-coded paths, please make a GitHub issue and we'll look at it.
Zane Hamilton:
Great. Thank you. From Peter: is there a problem using a shared file system like Lustre with rootless podman? I guess there is an issue with podman. Is the same thing true with Apptainer?
Using Shared Files With Apptainer [42:04]
Dave Dykstra:
Not with SIF files. See, that's the difference. If you try to use sandbox mode on Lustre, you'll have the same issue. Podman basically is only sandbox mode. In their latest release, Podman v4.2.0, they added a feature that can read a SIF file, but all it does is unpack it into a sandbox before it starts. It's really not helping you, unless, I guess, you unpack it into another file system, not Lustre. The other issue with having unpacked files on Lustre or any network file system is that it becomes a big performance bottleneck. We didn't have that problem in the high throughput computing, high energy physics world because we used another file system, the CernVM File System, where all the metadata operations are done on the client node in the same way.
But the problem when you use Lustre, or a network file system like that, is that all the metadata operations have to go to a metadata server. When you start up your jobs on hundreds or thousands of nodes all at once, and every one is reading thousands of files, your metadata server gets really bogged down, if not killed, right? Especially with a Python-based container, which has tens of thousands of files, I think. That is another issue with that. So as long as you're using a SIF file on Lustre, it can work. Does that answer your question?
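A rough back-of-the-envelope calculation shows why the metadata server suffers. The numbers below are illustrative assumptions, not measurements from the webinar: a job on 1,000 nodes, each touching 20,000 small files at startup, compared with each node opening a single SIF image.

```shell
#!/usr/bin/env bash
# Illustrative numbers only (assumptions, not measured values).
nodes=1000      # nodes starting the job at the same time
files=20000     # files each node touches in an unpacked Python container

# Unpacked sandbox: every file lookup goes to the Lustre metadata server.
sandbox_ops=$(( nodes * files ))

# SIF image: roughly one file open per node; lookups inside the image are
# resolved on the client by the mounted SquashFS file system.
sif_ops=$(( nodes * 1 ))

echo "sandbox on Lustre: ~${sandbox_ops} metadata lookups"
echo "SIF on Lustre:     ~${sif_ops} metadata lookups"
```

Under these assumptions the sandbox generates tens of millions of metadata requests against about a thousand for the SIF file, which is the difference Dave describes.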
Zane Hamilton:
Absolutely I think so. Thank you, Dave.
Dave Dykstra:
In fact, I did some performance testing. Let me tell a little story. We were about to release 1.1.0 and somebody finally gave us a benchmark that we could run with SquashFUSE. It turned out to be super, super slow. It's a Python-based benchmark, it was running on 64 cores, and SquashFUSE was single-threaded. It kind of killed the performance. Fortunately, someone had already written a multi-threaded SquashFUSE patch. When we applied that on this benchmark, it turned out to be almost identical to the kernel SquashFS. In fact, measurements showed it to be slightly faster, although I thought it was within the margin of error.
In that particular test, we didn't compare it against an unpacked sandbox on Lustre, but other people have done lots of measurements in the past. In fact, I think that's the whole reason why Singularity took off in popularity and became the de facto standard in the first place: system administrators found out that it actually performed better than raw files, raw software on Lustre. That's because with the packed files, the single file system that gets mounted on the client node moves all those metadata operations over to the client, so it was actually performing better. There have been a lot of studies about that in the past.
Zane Hamilton:
Yeah, that's fantastic. I see we lost Mr. Godlove again. No, he's back.
Dave Dykstra:
Maybe.
Dave Godlove:
I'm trying, guys. I'm trying, the only thing that's letting me share now is tabs. Jonathan had a good idea. He suggested, why don't I just point the webcam at the screen.
Zane Hamilton:
I love pixelated visuals here. Let's see what happens.
Dave Godlove:
I don't know if it's going to work or not. Let me mute and see if I can get something that'll sort of work.
Dave Dykstra:
I could also try to share if you want.
Zane Hamilton:
And then it just breaks everything. It's just not the day today and you're on mute. This is like traveling with kids, Dave, you never know how it's going to go. Could be great. Could be a mess.
Dave Godlove:
Well, when it's me, it's pretty much guaranteed to be a mess, right?
Where's my, there we go.
Zane Hamilton:
Hey, at least we get to see Dave's setup.
Dave Dykstra:
Can we make Dave's screen bigger if we all go backstage except for Dave?
Zane Hamilton:
Yeah, we drop all of those and we make Dave big. That's good. Good call.
Dave Godlove:
Wow. All right. I don't know if we want to try to, can you guys see anything here?
Zane Hamilton:
There we go. I don't know if you can still hear us, Dave, but go for it.
Dave Godlove:
Yeah, I can hear you. Oh my gosh, that's, nah, you're not going to be able to see that.
Zane Hamilton:
We can, we can see that. That's awesome.
Demo Of Apptainer Unprivileged Installation [47:53]
Dave Godlove:
All right, well, okay, I'm just going to show you the unprivileged installation really quick. If I do sudo whoami, you can see that I'm in trouble. This incident has been reported, because I am not a sudoer. Okay? I do have git. So, what I can do is cd to /tmp/ and do git clone https://github.com/apptainer/apptainer. I'm just going to copy the Apptainer git repo directly to my computer here. Now if I cd into apptainer and then into apptainer/tools/, this directory called tools (I don't know if you can see that there), and then list what I've got, there's this script written by Dr. Dave called install-unprivileged.sh, which is really cool.
I'm going to go ahead and get the help for it. I don't expect you to be able to see anything. The help says that you run the script, and you can change the distribution, you can change the arch, you can change the version of Apptainer that you install, and then you give it an install path. Just really quick, what I'm going to do here is make a directory in which to install this. It's in my home directory: apptainer/1.1.5, which is the version I'm going to be installing. You can change this to whatever you want, but that's what I'm going to do here. I've created that directory, and then I'm going to call that script. I'm going to say I want to install version 1.1.5 and I'm going to give it that location that I just created as the place to install it.
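The steps Dave walks through can be sketched roughly as follows. This is a reconstruction from the spoken description, not a copy of his terminal: the -v flag spelling is an assumption that should be confirmed against the script's own help output, and the run helper is a hypothetical dry-run wrapper so the snippet prints its steps unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Dry run by default: each step is printed rather than executed.
# Set RUN=1 in the environment to actually run the steps.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

version=1.1.5
prefix="$HOME/apptainer/$version"   # install location used in the demo

run cd /tmp
run git clone https://github.com/apptainer/apptainer
run cd apptainer/tools
run mkdir -p "$prefix"
# Assumed usage: -v selects the Apptainer version, and the last argument
# is the directory to install into.
run ./install-unprivileged.sh -v "$version" "$prefix"
```

Nothing here needs sudo; every path is under /tmp or the user's home directory.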
Now, what's going on here is actually really cool, because what it's doing is downloading RPM packages and it's extracting those RPM packages. So normally when you're thinking about RPMs, you're probably thinking about installing things, using what, something like DNF or Yum, which are obviously privileged operations, which you can't do unprivileged. What Dr. Dave has done, and he can correct me if I get this incorrect, is that, I think that he has basically said, I'm going to go ahead and just download those RPMs from where they are up on their repositories, extract them, and manually install them in the directory of your choice as a non-privileged user, which is really cool because you're still using RPMs. It's funny that you think this is funny Zane. Anyhow, it's going ahead and it's extracting these things.
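Extracting an RPM without installing it is a standard trick. A minimal sketch of the general technique, assuming the rpm2cpio and cpio commands are available on the host; this illustrates the idea and is not necessarily how the install script itself is written. The function name extract_rpm and the file name in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Unpack one RPM into a private directory tree without root.
# rpm2cpio converts the package payload into a cpio archive, and
# cpio -idm extracts it, creating directories and keeping timestamps.
extract_rpm() {
    local rpm dest="$2"
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    rpm=$(readlink -f "$1") || return 1
    mkdir -p "$dest" || return 1
    ( cd "$dest" && rpm2cpio "$rpm" | cpio -idm --quiet )
}

# Usage (placeholder file name):
#   extract_rpm ./some-package.rpm "$HOME/apptainer/1.1.5"
```

No package database is touched, so the files simply land under the chosen directory and can be removed by deleting it.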
Now, I don't know why, but I've noticed that sometimes this takes a little bit longer and sometimes it goes a little bit quicker. This demo has really completely gone belly up today. I may not be able to actually complete it in a reasonable amount of time. It's probably going to take forever, which, you know, is par for the course today. I haven't had a demo go this badly in a really long time, so I'm overdue. It's going to sit there and take its time. Usually this just takes a minute or two. But this time maybe it's going to take a little bit longer. Maybe, if you guys want, we could talk about a few other things while this is completing.
Zane Hamilton:
We do have another question, Dave, from David. Another Dave. I thought I heard you say this was coded in Python, if he heard that right. Will it run on a Raspberry Pi instance?
Will It Run On A Raspberry Pi Instance? [57:27]
Dave Dykstra:
It's not in Python. I talked about a container having Python. Apptainer is actually in Go; the main program is Go. What's hanging right now is a bash script. I would suggest that you just kill that and restart, Dave. Maybe there's a problem with that server. It's written in Go, so it should certainly run on a Raspberry Pi. Even this unprivileged install might work there, depending on what operating system you're running. What operating system are you on there, Dave? Is it Fedora? You're muted.
Dave Godlove:
I believe I've lost my mouse. I've lost my mouse now, so it takes me a few seconds to unmute. I think my house is going to catch fire next. That's going to be like, no!
Dave Dykstra:
No!
Dave Godlove:
Only thing that can get worse, but, no, this is Rocky9 that I'm running.
Dave Dykstra:
It is on a Rocky 9 system. Okay, so it was running on Rocky 9. The script is actually set up to install from EL7, 8, or 9 onto anything. Actually, Fedora will take the native ones, but even if you're on Ubuntu or Debian, it's still going to take the corresponding Red Hat EL or Fedora packages. Actually, it only does Red Hat EL there, and it unpacks it. I see, David, you said in your note that you're running Rocky on the Pi, so certainly it should just work. It will be ARM architecture, but that's supported there. Whatever architectures are supported by the upstream provider are supported.
Just try it. In fact, Dave showed you doing a git clone of the whole thing. You don't need that. All you need is that one script, install-unprivileged.sh. The instructions are in the apptainer.org documentation, apptainer.org/docs, in the administrator guide under installation. It shows you the full path. You just run curl to download that one file and pipe it into bash, and that's all it takes. Unless there's a glitch in your networking, right, Dave?
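The one-line install Dave describes looks roughly like this. It is a sketch: the URL assumes the script still lives at tools/install-unprivileged.sh on the main branch of the Apptainer repository, and the install directory is an arbitrary example, so check the administrator guide for the current command. The snippet only prints the command unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Assumed location of the script in the Apptainer repository.
url=https://raw.githubusercontent.com/apptainer/apptainer/main/tools/install-unprivileged.sh
prefix="$HOME/apptainer"    # example install directory (assumption)

if [ "${RUN:-0}" = 1 ]; then
    # Download the script and feed it to bash; "-s -" passes the
    # install directory through as the script's argument.
    curl -s "$url" | bash -s - "$prefix"
else
    echo "would run: curl -s $url | bash -s - $prefix"
fi
```

As with any script piped from the network into a shell, it is worth downloading and reading it first on systems you care about.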
Dave Godlove:
Right. Now we've got Apptainer running. I'm moving a bunch of cords.
Dave Dykstra:
Scroll back a little bit. We missed it. Can you scroll back a little bit? What did it do? Did it eventually finish?
Dave Godlove:
Yes it did.
Dave Dykstra:
Okay. If you didn't kill it, right?
Dave Godlove:
Cool. Okay. I added the apptainer/1.1.5 bin directory to my PATH. Then I went ahead and just ran apptainer, like so, and now we've got Apptainer running. In this demo today, I was going to go over building containers and a bunch of other stuff. I don't know how far I'll get. I've been moving everything around on my desk, and of course I've been pulling on cords and doing this and that. I've now lost my mouse and my keyboard, and I have to do everything on my laptop. That's an additional problem here that's going on. But that's cool.
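Putting the new installation on the PATH can be sketched like this, assuming the install directory from the demo, $HOME/apptainer/1.1.5, and that the script placed the apptainer launcher in its bin subdirectory.

```shell
#!/usr/bin/env bash
prefix="$HOME/apptainer/1.1.5"     # install directory used in the demo
export PATH="$prefix/bin:$PATH"    # put the unprivileged install first

# Confirm which apptainer, if any, the shell now finds.
if command -v apptainer >/dev/null 2>&1; then
    apptainer --version
else
    echo "apptainer not found; expected it under $prefix/bin" >&2
fi
```

Adding the export line to a shell startup file such as ~/.bashrc makes the setting persist across logins.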
Zane Hamilton:
I like the extended reach typing. It's making it better.
Dave Godlove:
Yeah. Yeah. So we can roll with it. I mean, whatever will do.
Apptainer Install Of Dependencies [56:28]
Dave Dykstra:
I should mention that the other thing this install script does is install all the dependencies. That has been the motivation for the whole script, because now in 1.1 there's SquashFUSE, there's fuse-overlayfs, there's fakeroot, all these extra commands which, if you're not the system administrator and they're not installed, are unhandy to get. This script gets all those dependencies as well from RPMs and installs all of them.
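To see which of these helper commands a host already provides, a quick check like the following can help. It is a sketch that assumes the usual executable names squashfuse, fuse-overlayfs, and fakeroot; a tool reported as missing is exactly the case the install script is meant to cover.

```shell
#!/usr/bin/env bash
# Count how many of the helper commands named above are missing from PATH.
missing=0
for tool in squashfuse fuse-overlayfs fakeroot; do
    if path=$(command -v "$tool"); then
        echo "$tool: $path"
    else
        echo "$tool: not on PATH"
        missing=$(( missing + 1 ))
    fi
done
echo "$missing of 3 helper commands missing from PATH"
```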
Working Demo Of Apptainer Unprivileged [57:00]
Dave Godlove:
I'm going to run the well-known hello world of Apptainer if I can. This is just going to prove to you that not only can you run help, but you can actually do stuff. I was going to say useful stuff, but of course, this particular container isn't useful at all except for just verifying that you've got a working installation of Apptainer.
Zane Hamilton:
Who built this container? Dave?
Dave Godlove:
Some jerk. I don't know.
Zane Hamilton:
It's been around for a while, hasn't it?
Dave Godlove:
It's been around for a long time.
Zane Hamilton:
It's awesome.
Dave Godlove:
Okay, this is built by me, so I'm not just calling some random person a jerk. I'm calling myself a jerk. So it works. Unprivileged installation works. I'm going to try my best to write a quick little definition file. I don't know if you guys want to riff a little bit more while I try to do that, but I'm going to try to write a definition file and just show you that you can actually build unprivileged with an unprivileged installation of Apptainer. That's awesome, because it means that even if you happen to be on some cluster where the administrators, for whatever reason, haven't heard of Apptainer, or they're suspicious of it, or they just don't want to mess around with containers, and they haven't installed it, you're still not out of luck. You can install Apptainer yourself, in your own space, on a cluster where you don't have administrative access, and actually even build your own containers in that place where you don't have administrative access. This is really, really awesome, as long as you have a computer that works, which is kinda, you know.
Dave Dykstra:
And as long as it has unprivileged user namespaces enabled.
Dave Godlove:
Yes. And that's an important thing. Thanks for bringing that up, Dave. If you're an administrator and you really, really do not want Apptainer running, don't fret, you can still turn off unprivileged user namespaces and you can still lock things down appropriately.
Zane Hamilton:
While Dave does this, if you guys have any more questions, go ahead and shoot them this way. I'm going to go back to Forrest and Brian because they're being quiet again. Where can you guys see this being useful in things that you're working on right now? Start with Brian this time.
Examples Of Apptainer Unprivileged Being Useful In Current Projects [59:24]
Brian Phan:
This is useful in the sense that it helps me prototype software solutions pretty easily. I'm currently working with OpenFOAM and trying to get that working with EFA, which is an AWS product. Basically, this has enabled me to quickly rebuild OpenFOAM so that it's compatible with that instance type. I basically can get it easily running on that. I'll pass it on to Forrest.
Forrest Burt:
Basically, just take this back to the user perspective. This is going to massively benefit high performance computing users across the sphere of high performance computing that are interacting with Apptainer and using it in their workflows. This really, just to bring it back to the theme of our webinar today, power to the Users. This gives them a huge amount more flexibility in how they're able to deploy and manage Apptainer within their own research, their own industry tasks that they're working on. This is just massively interesting from a user perspective. As someone who used to do a lot of direct user support type stuff in the sys admin role, I can see where this just gradually lessens the amount of work that I would have to do to manage this for users and just gives them ultimately, more of the capabilities that they were always looking for around reproducibility and the ability to integrate these within their workflows.
Dave Dykstra:
I want to say that what actually motivated me the most to write this script was that I distribute an installation of Apptainer in the CernVM File System, which is accessible by anybody who's running CernVM-FS. And by the way, I should put in a plug: there's something called cvmfsexec. This is another one of my tools. You can Google for it. You can run it as an unprivileged user, so you can use CernVM-FS as an unprivileged user. It also uses unprivileged user namespaces. CernVM-FS is a FUSE file system, but you can run it all unprivileged. You run the cvmfsexec command and then from there run Apptainer, because it's all available inside the CernVM-FS namespace.
I wanted to be able to support multiple architectures because I was starting to get requests for this on PowerPC. I thought, okay, well, I don't happen to have a PowerPC machine to compile on, but those things have already been built for EPEL. All I need to do is grab those binaries and install them. With this script you can actually install multiple architectures at the same path, in the same directory, and it will run whichever one matches your host architecture when you run it.
Zane Hamilton:
That's very cool. Thank you. Are you ready Dr. Godlove?
Dave Godlove:
Yeah, as ready as I'll ever be. So, I created a very simple little def file and I'm bound to have made some mistakes in it because I'm trying to just recapitulate it from memory and I'm typing behind my back, but, I'm going to go ahead and, and press enter and see what happens here. If I can highlight the screen again. Here we go.
Zane Hamilton:
Make Dave big again. There we go.
Dave Godlove:
It's running kind of slow. In addition to all the other issues that I'm having, I'm probably having some issues with my network, but it's doing a DNF update and a DNF install of a bunch of stuff right now, and it's actually installing. What this tells you is that, whoa, I am root inside the container. Once again, just bear in mind, I have no elevated privileges here. But I've been able to install Apptainer, run a container, and build a container with a bunch of DNF commands inside of it, all as an unprivileged user on this system, which is really, really groovy. I'm getting a question to show the def file.
Oh, Vim is not installed. I can't install it because I'm not going to, but vi is installed, so never fear. It's not like anything will go wrong. There we go. Very simple.
I've actually got Vim inside the container. I was going to play around with that and show that I didn't have Vim on the host system but had it in the container, but you get the idea. I installed python3 and python3-pip, and then I installed a little Python program that I like to play around with called asciinema. A lot of people are probably familiar with it by now; it's been around for a while. Basically it allows you to record ASCII text sessions in your terminal as though they were movies, and then play them back as text, which is cool because you can put those things up on websites and then pause them and just copy the commands out of them and run them at your terminal. I was going to show that a little bit, but we don't have any time right now. You can check it out on your own. Then if you wanted to run fakeroot inside the container, that works too. I don't know if you guys want me to try that real quick or if we're over time already.
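A definition file along the lines Dave describes might look like the following. This is a reconstruction from his spoken description, not the file shown on screen: the base image rockylinux:9 from Docker Hub is an assumption, as are the file names. The snippet writes the definition file and only attempts the build when RUN=1 is set and apptainer is on the PATH.

```shell
#!/usr/bin/env bash
def="${TMPDIR:-/tmp}/test.def"    # hypothetical file name

# Write the definition file. The base image is an assumption; the
# packages are the ones named in the demo.
cat > "$def" <<'EOF'
Bootstrap: docker
From: rockylinux:9

%post
    dnf -y update
    dnf -y install vim python3 python3-pip
    pip3 install asciinema
EOF

if [ "${RUN:-0}" = 1 ] && command -v apptainer >/dev/null 2>&1; then
    # Depending on the Apptainer version and configuration, an
    # unprivileged build may need the --fakeroot option added here.
    apptainer build test.sif "$def"
else
    echo "definition file written to $def"
fi
```

The %post section runs as root inside the container being built, which is why the dnf commands succeed even though the user has no privileges on the host.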
Zane Hamilton:
Yeah, one more thing, Dave. Try one more.
Dave Godlove:
All right. I mean, I was going to go a lot further in this demo and show a lot more stuff, but this is basically the meat of it, the three things I wanted to show. You can install it, you can build containers, and then let me jump in here as fakeroot really quick and just kind of show you. So apptainer, there's the command (moving my hands so you can see it), and then I'm going to jump into test.sif. I'm in the container as root. If I do a whoami, it shows me as root. My ID is zero. If I created files, they'd have UID zero here. Now I can do things like dnf update. I can change things in /usr/local/bin, /usr/bin, or whatever.
Dave Dykstra:
Except this is read only though. You have to run it with an overlay if you want to make it writable.
Dave Godlove:
Right. In the longer demo I was just going to dump this out into a sandbox and then jump into it, install some more stuff using dnf. But I mean, we're over time. I think most of the users hopefully, or most of the viewers actually get the idea at this point. So, sorry that that was rocky. That was a bit of a rocky demo. A different type of Rocky demo. Hopefully, we were still able to do all the things that we wanted to do here. We still got it done.
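The ways of getting a writable container mentioned above, an overlay on top of the read-only SIF file or a sandbox directory, can be sketched as follows. The image name test.sif comes from the demo; the directory names overlay_dir and test_sandbox are hypothetical, and the run helper only prints each command unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Dry run by default: each step is printed rather than executed.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

image=test.sif

# 1. Read-only shell as (fake) root, as in the demo.
run apptainer shell --fakeroot "$image"

# 2. Same image with a throwaway writable layer that is discarded on exit.
run apptainer shell --fakeroot --writable-tmpfs "$image"

# 3. Same image with a persistent overlay directory holding the changes.
run mkdir -p overlay_dir
run apptainer shell --fakeroot --overlay overlay_dir "$image"

# 4. Unpack the image into a sandbox directory and modify it in place.
run apptainer build --sandbox test_sandbox "$image"
run apptainer shell --fakeroot --writable test_sandbox
```

With any of the writable variants, commands such as dnf install can change files inside the container while the user remains unprivileged on the host.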
Zane Hamilton:
Absolutely. If we need to, we can come back and try it again later. We are up on time guys. Dave, I really appreciate it. Thanks for fighting through and reorganizing your office. I know you can spend the rest of your afternoon trying to get it all put back together. Dr. Dave, it's always good to see you. Thank you very much for being here and for all the information you provide and all the work that you do. We really appreciate it. Thanks to the community for being there as well. For all the off-camera help today, we really appreciate the ideas and the comments and troubleshooting live. Brian, Forrest, always good to see you. Thank you for joining. Guys, go ahead and like, and subscribe and we will see you again next week. Thank you.