CIQ

Continuous Integration and Delivery with Apptainer

December 8, 2022

In this week's webinar, we will be covering continuous integration and delivery with Apptainer, some use cases, and demo a CI pipeline building and pushing GROMACS to a registry using Bitbucket Pipelines.

GROMACS is a molecular dynamics (MD) suite commonly used in HPC that supports research across a variety of domains, including biology, pharmaceutical discovery, materials science, and engineering. GROMACS can utilize GPUs, if available, to accelerate computation, and can also take advantage of MPI software stacks for parallelization across multiple discrete compute nodes at once.

Speakers:

  • Jonathon Anderson, Sr. HPC Systems Engineer, CIQ

  • Forrest Burt, High Performance Computing Systems Engineer, CIQ

  • Brian Phan, Solutions Architect at CIQ

  • Dave Godlove, Solutions Architect, CIQ

  • Zane Hamilton, Vice President of Sales Engineering, CIQ


Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.

Full Webinar Transcript:

Zane Hamilton:

Good morning, good afternoon, good evening, wherever you are. Welcome to another CIQ webinar. My name is Zane Hamilton. I'm the Vice President of Solutions Engineering here at CIQ. At CIQ, we're focused on powering the next generation of software infrastructure, leveraging the capabilities of cloud, hyperscale, and HPC. Our customers rely on us from research to the enterprise for the ultimate Rocky Linux, Warewulf, and Apptainer support escalation. We provide deep development capabilities and solutions, all delivered in the collaborative spirit of open source.

Today we will be talking about Apptainer, and we have a few people to bring on: Brian, Dave Godlove, Jonathon, and Forrest as well. We've got the whole crew. Let's go around real quick. You guys introduce yourselves. You've all been on here several times, so I don't know if we need to do this, but let's do it anyway. Brian, I'm going to start with you.

Brian Phan:

Hi, everyone. Brian Phan here. Great to be back on the webinar. I'm a Solutions Architect at CIQ. My background is in HPC system administration and architecture, and I have experience with workflows in CFD and genomics.

Zane Hamilton:

Excellent. Thank you, Brian. Mr. Godlove?

Dave Godlove:

Hi everybody. I'm Dave Godlove. I'm also a Solutions Architect here at CIQ. If you're new to the webinars or the Apptainer community, I've been around the Singularity and Apptainer communities for quite some time.

Zane Hamilton:

Thank you, Dave. Forrest.

Forrest Burt:

Good morning everyone. My name is Forrest Burt. I'm an HPC Systems Engineer here at CIQ. My background is in the academic and national lab HPC space, where I was a sysadmin for a couple of years before joining CIQ. Happy to be on the webinar as well. Great to see everyone.

Zane Hamilton:

Thank you. Jonathon.

Jonathon Anderson:

I'll go ahead and go as I'm last. My name is Jonathon Anderson, and I'm a Solutions Architect with CIQ. Glad to be here. My background is in HPC sysadmin. I'm eager to see the work that we're going to be showcasing today.

What is Apptainer [7:15]

Zane Hamilton:

Excellent. Thank you. Let's start and give an overview of what Apptainer is very quickly. Do you want to run through that real quick, Brian or Dave?

Dave Godlove:

I'd be happy to, if it's okay with everybody. Apptainer is everybody's favorite container solution, optimized for running workflows, jobs, specific applications, and things of that nature. It's been really heavily picked up within the high performance computing space. Apptainer is the successor to the Singularity project; it's the version of Singularity that moved into the Linux Foundation. That, in a nutshell, is what it is. Apptainer integrates well with GPUs and with other types of hardware. It's great, once again, for running jobs. It integrates very well with the host file system and with other things. It makes a lot of things that are usually difficult with containers a lot easier.

CICD Working With Apptainer [8:37]

Zane Hamilton:

Brian, you wanted to talk about CICD, and that's a term we hear nowadays everywhere within IT, not just HPC. It gets overused, and I don't think everybody knows exactly what it means. It may have come out of IT. Talk to me about CICD, and then tell me why I would want to do that with Apptainer.

Brian Phan:

Let's start by formally defining what CI is. Continuous integration is the automation of integrating code changes made by multiple contributors to a software project. In the context of Apptainer and HPC, as the Solutions Architect team, and with Fuzzball, we work with a bunch of different software. We build many containers, and these containers need to be tested. Let me jump into an example. My experience has been mostly in genomics. Let's say I am a software developer working on a new alignment tool, or I'm making changes to alignment software, and these changes are being made for, let's say, performance purposes. Because this is a genomic, scientific piece of software, ideally what continuous integration does, or what it should show, is that when I make these code changes, I should get consistent results.

If I'm aligning, once I make these code changes, the result file should be the same as the file created without the code changes. And this becomes important in the context of clinical analysis, because it would be very bad if you run an analysis on a patient's genome, certain variants are called, and then after you make these code changes you're getting different variants. Someone has already made a diagnosis on the patient and already given them medication. There are many real-world consequences that can follow from not having integration tests and consistent results. I want to pass the mic to the team and hear about your experience in the environments you've worked in with containers and testing them, and whether you've automated this type of testing in the past.

Forrest Burt:

I can speak to this. When I was at my prior institution and starting to deploy Singularity, at the time, I was rolling containers out for researchers. For the most part, I didn't do a lot of things like CICD with it. I was figuring out how the technology worked, much less trying to apply it to something like this. My workflow was to build out my def file on my workstation, build that container on my workstation, send it up to the cluster, and then do testing there. Even without automated testing, as you elaborated on, Brian, that was still a fairly effective workflow to get stuff out to users, because of the external benefits that containerization gave me as a sysadmin; it freed up my time to start looking into how I could do things like automation. This is the first time I've taken a deep look into the type of work you've been doing here with containers, Brian. I'm very excited to see it specifically with Apptainer and to see how this integrates.

Zane Hamilton:

Dave and John, anything you want to add?

Dave Godlove:

I used to be at the National Institutes of Health, where I used to create containers for applications that scientists would ask for. The way I, and many other staff members, would do that was to create new containers when new versions of the applications came out, do some manual testing if we had some example data and workflows we could use for testing, and then push those containers up. But sometimes it's the case that you don't test everything, or you needed to test more things, or whatever.

A researcher would come back and say, hey, this used to work, and now, in the latest version, it doesn't. This kind of automation would be great for those types of situations. I don't have much experience with CICD within the context of containers, but it's funny, because I do have experience with CICD with Apptainer itself. The workflow you described is exactly the workflow that the developers of Apptainer have set up to test and push out the software itself as new versions come out. This workflow is widely used; it's ubiquitous.

Zane Hamilton:

Brian, you had something you wanted to show us as well. For whatever reason, I hadn't thought of CICD when I started looking at HPC and research; I didn't necessarily put the two together, but I'm glad to see it is something I should have included. Many of the conversations I have now are about people integrating that into the life of their cluster. It's not just about the people writing software but about how they are getting the software onto the cluster and getting the researchers going faster. It's very exciting.

Brian Phan:

But before I show that, we can briefly talk about continuous delivery and define it. In our context, we're working with Apptainer definition files, and these definition files are being built into containers. What continuous delivery means to me is that once these containers are built, they get pushed to a registry where our customers can consume them, or where we can start leveraging them within our Fuzzball product. I wonder if the rest of the team has anything else to add.

Zane Hamilton:

I have a friend who worked at a very large hyperscaler. She was running a large development team and said they were pushing code through their CICD process about every 11 seconds. The speed and velocity you can get out of this process amaze me.

Jonathon Anderson:

A CICD pipeline as a service that an HPC center might provide to their end users is valuable not only for reintegrating and retesting when there are changes to the application, but also when there are changes to the underlying infrastructure. We would hear from our end users all the time that they wished they could hand us a suite of their applications with a test case, so we could ensure that their applications still functioned properly and performed the same or better when we made file system changes or scheduling changes. As we see more people containerizing their applications and making it easy to build and test them in an automated way, that might become more and more feasible at sites that need to be able to do it.

Zane Hamilton:

Thank you, Jonathon. All right, Brian.

Building A Container With GROMACS [17:11]

Brian Phan:

Let's jump into our demo for today. I've set up some Bitbucket runners within Bitbucket. For today's demo, we will be building a container with GROMACS. I have the Pipelines page pulled up within our Bitbucket repo. I'm going to build a definition file that's on our main branch. The pipeline I'm running right now is a build-and-push pipeline. It's going to take the definition file, build it, and then push it with whatever tag I specify here. Let me go and do that. I'm just specifying this GROMACS file.

The definition file is right here; I'll jump into that after we kick this build off. For this demo, I'm just going to tag it as the CI demo. All right, let's kick this off, and we're off. As you can see, this build has kicked off, and what it's doing right now is building my container. Let me jump into what exactly we are building.

As you can see here, this is a definition file for GROMACS. We're building this using a Rocky Linux 8.5 Docker container as our base. At the top, we have LD_LIBRARY_PATH and PATH set in our environment. This ensures we have access to the correct libraries and binaries on our path. And then, at the bottom here, we have the Open MPI variables that allow us to run Open MPI workflows as root. This is because your jobs run as root within Fuzzball, but it is not a privileged user. Let's jump into the post section of this file.
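For readers following along without the screen share, here is a minimal sketch of what the top of a definition file like this might look like. The paths and base image tag are assumptions, not the exact file from the demo:

    Bootstrap: docker
    From: rockylinux:8.5

    %environment
        # Make the GROMACS binaries and libraries visible at runtime
        export PATH=/usr/local/gromacs/bin:$PATH
        export LD_LIBRARY_PATH=/usr/local/gromacs/lib64:$LD_LIBRARY_PATH
        # Allow Open MPI to launch as root (Fuzzball runs jobs as root, but not as a privileged user)
        export OMPI_ALLOW_RUN_AS_ROOT=1
        export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1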

Zane Hamilton:

Can you make that, can you zoom in a little bit? It's a little hard to see.

Brian Phan:

Yes, for sure.

Zane Hamilton:

Maybe it's just me, there we go. Much better. Thank you.

Brian Phan:

As you can see, we are installing some dependencies through DNF right here. In this section, we are installing the NVIDIA drivers and CUDA. In the third section, we are installing the AWS and EFA pieces. This will allow us to leverage the instance types on AWS that have EFA; an example would be the c5n instance type. Then, finally, we are pulling down the GROMACS source and building it against EFA.
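A rough sketch of what a %post section along those lines could contain follows. Package names, the GROMACS version, and the download URL are illustrative, and the NVIDIA and EFA installer steps are only noted as comments, not reproduced from the demo file:

    %post
        # Base build dependencies via DNF
        dnf -y groupinstall "Development Tools"
        dnf -y install cmake wget openmpi openmpi-devel

        # NVIDIA driver / CUDA toolkit and AWS EFA installers would go here (omitted);
        # EFA enables fast MPI traffic on instance types such as c5n.

        # Pull down the GROMACS source and build it with MPI and CUDA support
        wget https://ftp.gromacs.org/gromacs/gromacs-2022.4.tar.gz
        tar xf gromacs-2022.4.tar.gz && cd gromacs-2022.4
        mkdir build && cd build
        cmake .. -DGMX_MPI=ON -DGMX_GPU=CUDA -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs
        make -j"$(nproc)" && make install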

What Is GROMACS [20:53]

Zane Hamilton:

Hey, Brian, I have a quick question for you. What is GROMACS?

Brian Phan:

GROMACS is molecular dynamics software. It runs using MPI and is embarrassingly parallel. That is my understanding of it; I'm definitely not a power user of this software. But Forrest, if you have anything else to add, by all means.

Forrest Burt:

GROMACS is a standard molecular dynamics suite. It's commonly used in the pharmaceutical field for drug discovery and that type of thing. Like, take 500 candidate pharmaceutical chemicals, test them against certain tissues and targets, and do simulations of that; performance comes out as something like nanoseconds' worth of simulation per day. One of the biggest use cases for this is pharmaceutical drug discovery.

How Bitbucket Runners Are Setup [21:56]

Brian Phan:

Thank you, Forrest. We've covered what we are building. Let me jump into how these Bitbucket runners are set up. I'm currently using the Linux shell runner type of Bitbucket runner. These are configured using an Ansible playbook that I wrote. The Bitbucket runner is just a Java program, a JAR file, so I'm installing Java here. On the instance, I'm also installing Apptainer and creating a Bitbucket runner user, and then finally pulling down the Bitbucket runner and installing it. Down here, I've turned the Bitbucket runner into a service. I did that in case our server gets rebooted.

Ideally, I want the service to come back up on its own. I am also setting up a Docker config.json here. The reason is that we will be pushing containers to our Google Artifact Registry. Within this Docker config file, I am configuring Apptainer to use the Google Cloud credential helper. Last but not least, I installed the Google Cloud CLI to authenticate with GCP and push the built container to our Artifact Registry.
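For context, the standard way to wire up that credential helper with the Google Cloud CLI looks roughly like this; the registry hostname depends on your region and is an assumption here:

    # One-time setup on the runner: register the gcloud credential helper
    gcloud auth configure-docker us-docker.pkg.dev

    # This writes an entry like the following into ~/.docker/config.json:
    #   { "credHelpers": { "us-docker.pkg.dev": "gcloud" } }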

Jonathon Anderson:

Brian, I want to make sure I understand here. This configuration, this bit of Ansible, is configuring a conceptually arbitrary Linux machine to be a Bitbucket runner. This is something that only has to be executed once, not something that happens every time an integration runs. Is that right?

Brian Phan:

Yes, that is correct. Typically, you would only need to run this once. The way I've set up the host file is that I give it the host's IP and pass it some variables that Bitbucket gives you. These are things like OAuth client IDs specific to each Bitbucket runner. After you set up your runners within the Bitbucket UI, you can populate the host file with these variables and then run the playbook against the instances. Right now, the OAuth client IDs and so on are in plain text for demo purposes, but in a production environment, I would want to access these with something like Vault, for example.
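As a hedged illustration of that Vault idea, the playbook or a wrapper script could fetch the runner credentials at deploy time instead of keeping them in the inventory. The secret paths and field names here are hypothetical:

    # Hypothetical: pull each runner's Bitbucket OAuth credentials from Vault
    export RUNNER_OAUTH_CLIENT_ID="$(vault kv get -field=oauth_client_id secret/bitbucket/runner1)"
    export RUNNER_OAUTH_CLIENT_SECRET="$(vault kv get -field=oauth_client_secret secret/bitbucket/runner1)"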

Build and Push Script [24:55]

Last, we can jump into the actual build-and-push script that's doing the work. This is a bash script. It takes a definition file and a tag as a string, and it does some light error checking here to ensure there are no spaces in the tag and that the file exists. If it exists, we use the file name to figure out an image name, give it an image path to build to, and then start the build. Once the image is built, if your definition file has any tests specified within its test section, it'll run those. If you don't specify a tag, it tags the image with the Git commit hash. Finally, I authenticate to Google Cloud with our service account, which has the right access to the Artifact Registry, and then the image gets pushed. That is the script that we're running today.
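The exact script isn't reproduced in the transcript; a minimal sketch of the same flow might look like this, with the registry path, project name, and variable names as assumptions:

    #!/bin/bash
    set -euo pipefail

    DEF_FILE="$1"                                   # e.g. gromacs.def
    TAG="${2:-$(git rev-parse --short HEAD)}"       # default to the git commit hash

    # Light error checking: no spaces in the tag, and the definition file must exist
    [[ "$TAG" =~ [[:space:]] ]] && { echo "tag may not contain spaces" >&2; exit 1; }
    [[ -f "$DEF_FILE" ]] || { echo "missing definition file: $DEF_FILE" >&2; exit 1; }

    IMAGE_NAME="$(basename "$DEF_FILE" .def)"
    IMAGE_PATH="${IMAGE_NAME}.sif"

    # Build the SIF; any %test section in the definition file runs at the end of the build
    apptainer build "$IMAGE_PATH" "$DEF_FILE"

    # Authenticate to GCP with a service account and push to Artifact Registry
    gcloud auth activate-service-account --key-file="$GCP_KEY_FILE"
    apptainer push "$IMAGE_PATH" "oras://us-docker.pkg.dev/my-project/containers/${IMAGE_NAME}:${TAG}"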

Jonathon Anderson:

Brian, can you give an example of the tests? Are there any tests on this GROMACS one? What might someone do to test their containers?

Brian Phan:

In terms of HPC software, my ideal test is the smallest possible test case that uses some form of real input for the actual piece of software. But if you're trying to do a quick sanity check to see if your application is installed correctly, I would call the binary and try to get a help message out of it.
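As an illustration, a %test section along those lines might look like the following; the binary name and the idea of a tiny input case are assumptions:

    %test
        # Quick sanity check: the binary is on PATH and reports its version
        gmx_mpi --version
        # Ideally, the smallest possible real input case would also be run here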

Commercial ISV Building Containers [27:07]

Zane Hamilton:

We do have one question, Brian. This is from Sylvia: do you have an opinion on containers and commercial ISVs? Should the ISVs build a container, or does that have too many security consequences?

Brian Phan:

From the ISVs that I've worked with in the past, I'm a firm believer that their software can be containerized and that containers can be a better means of distributing their software to the end user. I don't know if the rest of the team has any strong opinions on that, but I'll pass the mic to you.

Forrest Burt:

There are considerations, especially in 2022. Things like the software bill of materials, signaling the exact provenance of where your containers come from, are important, whether it's something the company builds for you or something you're building and distributing to your users. Maybe I'm pointing out the obvious, but I would say it will be very difficult for some commercial software vendors. I believe the future is that most things can be containerized. One of the biggest problems is getting access to the proprietary software and all the build instructions to do it yourself, versus having it pre-packaged from the ISV. If you're going to get that from a software vendor, they should be looking at their software bill of materials to ensure that, if you are getting containers from them, they're doing the due diligence necessary in the modern world of containers to make sure you're getting a safe product. Ultimately, a lot of that is stuff you should be doing as well if you are containerizing software and serving it out to your users, whether that's open source or proprietary solutions you have the means to build.

Jonathon Anderson:

It would be good if more ISVs upstream were publishing containers for their software. Alongside that, if there are any security concerns, what I would want to see as a user of that container or software is just the container file that was used to build it. Even if your application isn't open source, the container build file, whether it's an Apptainer build file, an OCI build file, or anything else, is a script akin to that software's installation instructions. You get to see exactly how they built the environment in which they run and test their application. That would let you rebuild the container yourself, even if their code is proprietary and not open source.

Dave Godlove:

You also want to make sure, and this might go down a rabbit hole, that the container you get is the one that was built using that definition file. This is a great place to ask the vendor to sign the container cryptographically and to communicate to you the fingerprint of their signature, so that when you have it at your shiny little terminal, you can go ahead and verify it and make sure that same fingerprint comes up. And then, if you like, you can use apptainer inspect --deffile and start going through that definition file and see how they came up with what they came up with. Signing is great, but if they signed it after they built it from an untrusted source, that's not much better. You also want to check the provenance: that you trust the author, and that you're happy with the source they used to start the container.
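The commands Dave is describing look roughly like this on the vendor's and the consumer's side; the image name is assumed:

    # Vendor side: create a signing key once, then sign the released image
    apptainer key newpair
    apptainer sign gromacs.sif

    # Consumer side: recover the definition file that was used to build the image
    apptainer inspect --deffile gromacs.sif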

Improvements Into CICD Pipeline [31:02]

Zane Hamilton:

Thank you, Dave. We have one more question from Sylvia. What improvements would you make to your CICD pipeline?

Brian Phan:

Oh, that's an excellent question. I was pondering this very thing this morning. I was thinking about how to run multiple builds using the Linux shell runner on a single instance. One of the big limitations of the Bitbucket Linux shell runner is that I can only run one build per instance. The way I approached this was to install the Bitbucket runner within Apptainer and start it as an Apptainer instance; that way, in theory, I should be able to run multiple builds on one machine. That is something I'm going to test. If I do get it working, that'll be a future webinar.
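A sketch of that idea, assuming the runner has been packaged into a container image (the image name is hypothetical):

    # Run several Bitbucket runners side by side on one host as Apptainer instances
    apptainer instance start bitbucket-runner.sif runner1
    apptainer instance start bitbucket-runner.sif runner2
    apptainer instance list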

Zane Hamilton:

All right, Brian, were you going to run this and show us something else?

Brian Phan:

Let's go back to our build and let's just see where this is at.

Zane Hamilton:

Could you make that a little bigger too, if you don't mind? There you go.

Brian Phan:

We've run through the GROMACS install here, and down here you can see it's creating the SIF file. The SIF file then gets created from the GROMACS definition file we were looking at. No specific tests were specified, so it didn't run any tests. Finally, we authenticated with Google Cloud and pushed our built image to our Artifact Registry. That concludes my demo. I will stop sharing my screen, and we can start wrapping up.

What is a Bitbucket Runner? [33:36]

Zane Hamilton:

That's great. If you guys have any other questions, start throwing them in there, but I want to go around a little bit and ask. We talk about the Bitbucket runner. What exactly is a Bitbucket runner?

Brian Phan:

A Bitbucket runner is a service that waits for build jobs within Bitbucket. If I kick off a manual build, the Bitbucket runner will listen for one of those tasks, and if there is a task present, it will pick it up and run whatever build you've specified. These builds could be things like building a container in different environments; they could also deploy to your production environment, for example.

Zane Hamilton:

This may be a question for a different topic, but how does that relate to other tools like CloudBees or Jenkins?

Brian Phan:

From what I understand, it is basically Bitbucket's version of GitHub runners, for example.

Jonathon Anderson:

Jenkins would be a more genericized third-party one that isn't tied to any particular source repository, but the source repositories have become a greater and greater source of truth for all of these things. GitHub has one of these, Bitbucket has one of these, and GitLab has one of these, whether you're self-hosting or using their services. That's the one we used in a past life of mine.

Encrypted Containers With Apptainer [35:14]

Zane Hamilton:

James asked: my environment has strict security requirements, and containers need to be encrypted. How do I achieve this with Apptainer? That's an excellent question for Dave. 

Dave Godlove:

There are two kinds of encryption-related things within Apptainer that sometimes get confused, but you're asking about encrypting the containers so that they can't be read. The way you do that is that you generate a key. There's some documentation online on the Apptainer website for this. Actually, I'm wrong: you don't create the key for encrypting the container within Apptainer itself; you create your key outside of that workflow. But in any case, once you've got that key, you can go ahead and encrypt your container using the public key material. Then, whenever you decide to run it, you can decrypt it using your private key material.

This works because a SIF file, the Singularity Image Format, is a SquashFS image, with some other partitions and metadata in the file. Once that's encrypted, those partitions can't be read while sitting on disk. They remain encrypted even when you run the container, so they're only decrypted within the new mount namespace and stay encrypted on disk or in transit. We can post a link to that documentation in the comments after the webinar, but the encryption and decryption workflow is pretty well documented within the Apptainer documentation.
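The documented workflow Dave is referring to looks roughly like this, using PEM key material generated outside Apptainer; file names are assumed:

    # Generate an RSA keypair for encryption
    openssl genrsa -out rsa.pem 2048
    openssl rsa -in rsa.pem -pubout -out rsa_pub.pem

    # Build an encrypted SIF using the public key material
    apptainer build --pem-path=rsa_pub.pem gromacs_encrypted.sif gromacs.def

    # Run it; the partitions are decrypted only inside the container's mount namespace
    apptainer run --pem-path=rsa.pem gromacs_encrypted.sif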

Jonathon Anderson:

As you pointed out, Dave, you encrypt with the public key and then use the private key to decrypt. Handling that encryption key wouldn't be that difficult for a CICD workflow. But if you want to do signing, which you might want to do alongside your encryption or even separately, you would do that with the private key material. Brian, you're already handling secrets as part of pushing up to Google Cloud; I know you touched on this as you went along. How does the storage of secrets work, and how might you store a private key for container signing as part of the CD workflow?

Brian Phan:

Within Bitbucket, for your pipeline runners, you can set what they call secured variables. Once you set these within the UI, you can access them within your build job. And if they're configured as secured variables, even if you try to echo a specific variable in your script, it won't echo the value; it'll just echo the variable name. That's how I've been handling our GCP secrets.
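In practice that often looks something like the following inside the pipeline's build step, where a secured variable (name hypothetical) holds a base64-encoded service account key and is masked in the build logs:

    # Decode the secured variable and activate the GCP service account
    echo "$GCP_SA_KEY_B64" | base64 -d > /tmp/sa-key.json
    gcloud auth activate-service-account --key-file=/tmp/sa-key.json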

Forrest Burt:

You can do that same thing in GitLab CICD, just from one random thing I was tinkering around with. It's common for all those platforms to be able to secure variables like that.

Trusting Containers [39:05]

Zane Hamilton:

Sylvia asked: how can consumers of my container trust the container they are running? This goes back to what you were talking about a little bit before, Dave, with cryptographic signing.

Dave Godlove:

This goes back to signing and verification. I could explain the entire workflow. Sylvia, say you have created a container and are sending it to another user or scientist. They need to run it, and they need to be able to trust it. You go through the workflow of creating a key, signing your container, and then sending your downstream user the fingerprint of that signature out of band, through a different method. Then say I'm the downstream user who gets your container. I pull it down, and it's completely untrusted; I don't know who created it or what's in it.

The first thing I want to do is verify it. I would run an apptainer verify, which will do two things: it will check that the container is actually signed, and it will tell me the fingerprint of the key it was signed with. That tells me it's signed. Once I do that, I can take the fingerprint and compare it to the fingerprint I obtained from you out of band. If they match up, that tells me you signed it. Now I know not only that it's signed but also that you signed it. That lets me know that the SIF file I've got is a bit-for-bit reproduction of the one you created.
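On the consumer's side, that check is a single command (image name assumed); the fingerprint it prints is what you compare against the one received out of band:

    # Confirms the image is signed and prints the signer's key fingerprint
    apptainer verify gromacs.sif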

Now the next steps concern whether or not I trust you. This is why the Apptainer model of letting anybody sign and anybody else verify the container crowdsources the trust. Instead of saying we have a central repository somewhere where we're going to store all official images and sign them as a team, and then you just trust us, in this crowdsourced version you decide whether or not you trust the person who created the container. Then trust comes down to two things. Number one, are you malicious or not? Hopefully, that's a pretty easy one: I know you, and I don't think you're malicious. But the next one is more complicated.

Do I trust that you know what you're doing? At a very high level, it can be very easy, and it is the usual use case, to grab a container when you're building from an existing container somewhere, up on Docker Hub or somewhere else. Did you know what you were doing when you built that container and when you grabbed it from those sources? Maybe instead of building it from another container, it would be nicer if I could look in the definition file and see that you made it from the upstream operating system mirrors, because those implicitly have a higher level of trust than existing containers sitting up on a registry somewhere. They have to, because all of us running Linux anywhere need to trust those mirrors to run Linux. Hopefully, I went through all the different aspects of it; there's a lot to it. It isn't simple, but it's necessarily so.

Zane Hamilton:

Great. Thank you, Dave. Go ahead, Forrest.

Forrest Burt:

I used this term earlier, but I wonder if I defined it well. One thing that has been starting to come out in the world of containers: for a long time, we've all had this good-faith assumption that the containers out on Docker Hub are not malicious, that a container will not have a crypto miner hidden in one of those layers or something like that. But there are starting to be more cyber attacks and things like that, and there's starting to be a more general concern about exactly what Dave was touching on: where the software building these base images out on Docker Hub is actually coming from. One of the things that is becoming very big in this field is an SBOM, or software bill of materials.

Software systems have reached a similar level of complexity, in their own way, to physical systems. If you were building a jet fighter or a tank, you would have a full list of where everything was sourced; you could trace back to every manufacturer, steel plant, and bolt maker exactly where every piece of that machine came from. But we still haven't, especially as related to containers, formalized where this entire container comes from, the one we're using to run our multi-billion-dollar website with a million nodes sitting out there all running it at once. You might have thousands, tens of thousands of machines all running this one container at once, which is an incredible scale to have that much metal working on something.

One big concept is seeing whether that software came from the mirrors and the repositories; that's what the software bill of materials gives you: the ability to trace back the exact provenance of all the software that went into building your container. Ultimately it's like, well, what is that thing running? You want to know it was built by a trusted source and not just coming from somewhere random. That is big. I don't have a specific thought about Apptainer integrating that inherently; the signing and so on that it does is very useful. For your custom applications, you could have someone on your team sign off on them and build them, at a very basic level. But those are things we're very interested in exploring. We are definitely up on the software bill of materials and software provenance wave happening right now within security.

Jonathon Anderson:

It's one of the things that's absolutely bolstered by the process of building your containers as part of a CICD pipeline. If you're building containers manually, someone has to manually construct a bill of materials of what's in that container and sign off on it with a key. That's an error-prone process; it's detail-oriented and finicky, and it's easy to miss something when you're building the same container repeatedly. The point of a CICD pipeline is that it's automated: the same process that builds the container in the first place is the process that then catalogs what went into it and signs off on it, as one holistic process.

Zane Hamilton:

That's great. Thank you, Jonathon. If we don't have any more questions and we're coming up on time, we can go ahead and wrap up.

CICD From An HPC Users Perspective [46:30]

Dave Godlove:

I have one more question, if that's cool, for Brian. What is the look and feel of this for users? This demo was great, and it was focused on how you set up this CICD pipeline from an admin perspective. In my experience with CICD pipelines, what it looks like for me is that I push some change to some repo somewhere, and because of the pipeline that's there, it kicks off a build process, which creates an artifact; in this case, the artifact is the container, and then that lands somewhere. Is that really what this looks like? Are there differences there?

Brian Phan:

For the end user, it depends on what you're doing. If you're creating a scientific application, it will follow that type of process. If the end user is more like a bioinformatics user who runs workflows using multiple containers, and if you were to have, say, a Fuzzball YAML that's pushed to a repo where you're making changes, ideally you would have a CICD pipeline with that too, where you have some input file, you run your workflow on it with all the updates to all these different pieces of software, and you have some expected output that should stay consistent. That's CICD from an HPC user's perspective.

Forrest Burt:

What about parameter suites? Could this be something where, for example, you could kick off your CICD pipeline to take an example use case of yours and try to find, for your specific architecture, the most optimal way to run it? It runs on 2, 4, 8, 16, 32, 64 cores, in different configurations, tries to determine what's most optimal, and then reports that back out to you.

Brian Phan:

I probably wouldn't run that within a CICD pipeline, but I think it would be a good use case for something more like a Fuzzball workflow, where you can have a single job with different resource configurations within that one Fuzzball YAML, and then you could fire off all of those jobs at once.

Forrest Burt:

That's a great idea.

Dave Godlove:

And then one more quick follow-up. This is another potential use case; is it possible with this pipeline? You used the example of GROMACS. Say I'm an administrator responsible for ensuring that the latest and greatest version of GROMACS is always up to date on the cluster. Could I set this up in such a way that it's triggered on a major release of GROMACS from their repo somewhere, so that every time there's a new major release, it automatically builds a new container and pushes it someplace where I can check it out and then pull it down to my cluster?

Brian Phan:

It is possible. The biggest challenge in doing that is monitoring for those actual GROMACS releases. The installation process across different versions is similar, so I thought it would be cool if Apptainer had some templating functionality where you would have the install script, and all you would need to do is template out the version, and then boom, you have the latest version of the container. Those are my thoughts on that.
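Apptainer didn't have built-in definition-file templating at the time, but a hedged sketch of a workaround is to keep a placeholder in a template file and render it per release; the placeholder token and file names here are hypothetical:

    # Render the definition file for a given GROMACS release, then build it
    GROMACS_VERSION=2022.4
    sed "s/@GROMACS_VERSION@/${GROMACS_VERSION}/g" gromacs.def.in > gromacs.def
    apptainer build "gromacs-${GROMACS_VERSION}.sif" gromacs.def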

Zane Hamilton:

Thank you, Brian. We have one more comment; this goes back to when Dave was talking. It points out a third point that could have been thrown out there: can I trust the container to be updated if vulnerabilities are discovered in dependencies later? That's always going to be an interesting one.

Forrest Burt:

This goes back to what we were discussing about ensuring that you're keeping things up to date, you're still doing your due diligence on what's out there, and you are keeping up with the industry. There's quite a matrix to keep track of, with all the component software versus all the CVEs. I don't know what kind of mass CVE-matching schemes are out there to apply all the software in a container against the possible CVEs for it; that starts to feel like a supercomputing task with how big that matrix could get. But I have seen, for example, after Log4Shell came out, many places added verification. Docker Hub, I think, added some verification specifically just for that bug.

Many containers now show something like "no Log4Shell CVE detected," so that can be automated; people have already looked at doing that for some of the major vulnerabilities that have come out. But I can't say exactly what a full matrix between all the CVEs and all the software looks like, because it could be very, very complex to build that entire tree out. I'm sure it can be done, though.

CVE Detection [52:17]

Dave Godlove:

I want to take this opportunity to pull on a thread a little bit and talk about a topic that gets very nasty once you start to get into it. The security community is clued into this, but there isn't really a good solution to the problem, at least not presently. A CVE is not just a CVE across the board. There are vulnerabilities within operating systems that are reported as really high-magnitude vulnerabilities, critical to patch, that don't amount to a hill of beans within a container, depending on the container environment in which you're running them.

And vice versa: some things pop up in containers that are no big deal on your operating system, but because of the containerized nature of where you're running them, they become a real problem. Scanners and such don't differentiate between these. They check databases and say, oh, well, you've got this file or whatever, and so that means you're vulnerable. If you want to do your due diligence and figure out what's going on, you have to use a tool like Syft or whatever to create your SBOM for your container and then use Grype to flag all the CVEs. But then the next hard step, which I don't think a lot of people go through, is that you have to go through those CVEs, do some research, go to MITRE, and look at what they're all about and whether or not they are a problem for you within your container.

You may be able to get rid of things, delete files and such, but at some point you might not be able to do that. At that point, it's good to know whether you even care, whether this CVE is something that will affect you within your container. The flip side of that, which I don't think is being addressed at all, is that there can be CVEs that affect you within containers that are not being reported at the operating system level, because they're not problems at the OS level. Send me nasty mail if there is something like this, but I'm not aware of scanners finding stuff like that right now. It's incredibly difficult, because those would all be specific to the container environment in which you're running them; you'd need a different scanner for every environment. This stuff gets messy and ugly fast. It needs to be addressed, but there's no great way to address it.
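The Syft and Grype workflow Dave mentions is roughly the following; the image name is assumed, and the singularity source scheme depends on your Syft version:

    # Generate an SBOM for the SIF with Syft, then scan that SBOM with Grype
    syft singularity:gromacs.sif -o spdx-json > gromacs-sbom.json
    grype sbom:./gromacs-sbom.json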

Forrest Burt:

To say one more thing about the first side you mentioned, bugs that exist in containers but not on the operating system: within Apptainer itself, a number of different capabilities have been built in over time that neutralize an entire class of vulnerabilities inside a container, just because of how it's built. I'll point out one example because I think it's on our YouTube channel: there was a pkexec vulnerability earlier this year, and we found almost immediately that you couldn't even exploit it in a containerized environment; it was natively blocked and didn't work from inside. That's an interesting side of it to see, everything that's been built over time to ensure security. But I hadn't thought of the other side: what can affect a container that doesn't affect the host? Some of that would end up as bugs on our end, but it would be interesting to explore that concept more, the exact differential between containers and the host.

Jonathon Anderson:

Listening to Dave talk about this whole part of the conversation, CVE detection or vulnerability detection within the container, changed my mind about how I've thought about containers. In the past, I've half-jokingly been cynical about containers as reinventing static linking, just part of the continual back-and-forth between dynamic and static linking of applications; now we've statically linked our entire OS together into one file. Where I was wrong to dismiss the concept that way is that, in the way the container packages it all up together, you still have the full understanding of what parts go into the full stack. You get to treat it like a statically linked thing you can pass around as a single SIF, but when it comes down to scanning it for vulnerabilities, you can see how it was packaged together. You can look at the packages that went into it and the versions of the libraries, and even if it's some crazy non-distribution thing with loose files in there, you can still scan the libraries that are present. While it is work, I am glad to realize that the way it works at least makes it possible.

Zane Hamilton:

Thank you, Jonathon. This topic could go down a rabbit hole for a long time and turn into its own webinar, bringing in container security and scanning experts, but we'll do that at some point. I'm going to wrap up because we are getting close to the top of the hour. Brian, I appreciate you putting this together, showing us, and walking us through it. It's always good to level-set on these things and make sure we're using the same vernacular. I appreciate you all being here today. Please like and subscribe, and we will see you again next week for our roundtable. Thank you.