Container Education Series
We are excited to bring you a new webinar series! Our Container Education Series features live demos and, as always, great discussion!
Webinar Synopsis:
- What Does it Mean to Build a Container?
- Switching Between Formats
- Is This a Good Way to Test Containers?
- Definition Files
- Sha1sum
- Post Section
- What Happens if You Don’t Use Force?
- Is the Post Section Definitively Bash or Shell?
- Environment Section
- Can You Overwrite Environment Variables?
- .singularity.d/
- Env Sub Directory
- 90-environment
- Definition File
- Labels
- The Help Directive
- Test Section
- Is Authorship and Metadata Included in the Payload?
- Are Changes Captured in the Final Container?
- What Labels Look Like
- A Few More Things About Definition Files
- Setup
- Pre
- Troubleshooting
- Different Bootstrap Agents
- Local Image
- Debian Type Container in Ubuntu
- Scratch Build
Speakers:
- Zane Hamilton, Vice President of Sales Engineering, CIQ
- David Godlove, Solutions Architect, CIQ
- Forrest Burt, High Performance Computing Systems Engineer, CIQ
- Jonathon Anderson, HPC System Engineer Sr., CIQ
Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.
Full Webinar Transcript:
Zane Hamilton:
Good morning, good afternoon, and good evening, wherever you are. We appreciate you joining us today. My name is Zane Hamilton. I'm the Vice President of Sales Engineering here at CIQ. For those of you who are unfamiliar with CIQ, we are a company focused on powering the next generation of software infrastructure, leveraging the capabilities of cloud, hyperscale, and HPC. From research to the enterprise, our customers rely on us for the ultimate Rocky Linux, Warewulf, and containers support escalation. We provide deep development capabilities and solutions, all delivered in the collaborative spirit of open source. For today's webinar we are going to talk about containers. Today we are joined by Forrest Burt, Dave Godlove, and Jonathon Anderson. Could I have everyone introduce themselves?
Forrest Burt:
I'm Forrest Burt. I'm an HPC systems engineer at CIQ. I'm excited to be on the webinar today. Thank you for having me.
Zane Hamilton:
Jonathon, will you introduce yourself?
Jonathon Anderson:
My name is Jonathon Anderson. I am an HPC engineer with CIQ. I'm also excited to see what Dave has for us today.
Zane Hamilton:
Dave Godlove, will you introduce yourself?
Dave Godlove:
I am a solutions architect at CIQ. I have been involved in the Apptainer/Singularity community for a long time. I'm excited to talk to you about building containers today.
Zane Hamilton:
Dave, would you start our discussion about building containers?
Dave Godlove:
I am going to share my window and we can get started. A few weeks ago we did an introduction to Apptainer. We discussed the look, feel, and features of Apptainer. Today we are going to go even deeper and discuss how to build containers.
What Does it Mean to Build a Container? [02:10]
I want to begin in a philosophical way. What does it mean to build a container? Building could mean a lot of different things, but for our purposes today, building a container is anything that gives you a SIF (Singularity Image File), the file format we use for containers in Apptainer. By that definition, is this building a container? When I pull a container down from a container registry, such as this library that I have configured, I end up with a SIF file. You could argue that this counts as building a container, but most people would consider it just downloading one.
Let me give an example. I’m using the pull command, but instead of pointing at this library, I am pointing at Docker Hub. When I pull this container from Docker Hub, I download a bunch of layers, and these layers are just tarballs. The tarballs are untarred, then OverlayFS is used to take all of them and combine them into a SquashFS file, and that combination is put into a SIF file. So even though this is a simple pull command, this is building a container.
This is taking a bunch of different pieces and putting them together to make a container. The reason I want to show you this is that many people don't think about doing it this way: you can build your container like this, straight from a URI. I believe this syntax is something that not many people are familiar with, but it works just fine; Apptainer is quite happy to build your container like this. And if you actually looked under the hood to see what was going on, you would find that Apptainer follows a very similar code path to the pull we just did. I did this to discuss the differences between pulling and building, as well as their similarities, and to talk about building from different sources. So far we have pulled, or built, from the library, and we have pulled and built from Docker Hub.
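For reference, the kind of pull and build commands being described might look something like this (a rough sketch; the image names, tags, and library paths are illustrative placeholders, not the exact ones from the demo):

    # Pull from a configured library (downloads a native SIF)
    apptainer pull python.sif library://<user>/<collection>/python:latest

    # Pull from Docker Hub (layers are downloaded, untarred, and assembled into a SIF)
    apptainer pull python.sif docker://python:3.11

    # The build command accepts the same URIs and follows a very similar code path
    apptainer build python.sif docker://python:3.11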
When we start talking about building containers, we have to start talking about bootstrap agents. These are the agents that you use to start building your container. You can also build your container from pre-existing containers that are right there on your file system. Since I just built this container, I now have this Python SIF container. Not only can you build SIF files, you can also build sandboxes: bare directories that contain a file system inside. Since you can do both of those things, you can build from a pre-existing container on your desktop, and you can also build to a sandbox. You can use the build command to switch between the different formats.
Switching Between Formats [06:15]
Let me show you how that works. I'm going to build a directory called Python from this python.sif. Now I've got this directory called Python, and if I look inside of it, it is a root file system. That is a nice trick you can use to debug your containers, because it can be quite difficult to find things if you just shell into a container. I will give you an example. It's not unusual for someone to build a container as root and install things into root's home directory; during the build they have access to anything installed there. But if you're running a container that was built that way, say with Docker, as a regular user, you're not going to have access to the root directory, because you don't have the permissions to do that. And if you do enter the container as root, by default your home directory as root on the host file system will be mounted over /root in the container at runtime. It can be very difficult to figure out where things are installed because of that. If you convert your container to a sandbox, you can just look in here and figure out: is there anything in /root? In this particular circumstance there is nothing in root.
This is also a cool trick that you can use if you're just trying to figure out how to build your container for the first time. Once you have your container as a sandbox, you can shell into it with --writable.
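As a sketch of that sandbox workflow (names here are illustrative):

    # Convert an existing SIF into a sandbox (a bare directory containing a root file system)
    apptainer build --sandbox python/ python.sif

    # Poke around, e.g. check whether anything was installed under /root
    ls python/root/

    # Shell in with --writable to experiment, install packages, and take notes
    apptainer shell --writable python/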
Then I can do things like update the container or install things into it. For instance, if I'm trying to install something for the first time, it may have several dependencies, and I am not always aware of what it needs. I can mess around in the sandbox and record what I do there. Then I can go back and use this as a substrate to figure out how to build my container.
I'm going to install it now, because I told you that you can swap back and forth between different formats on disk by using the build command. You might think that you are ready to go and build a new SIF file from that directory that I just created, and you could do that. But now that I've installed things inside there, the container is a little bit different, and that's not considered best practice, because you're not going to have any record in the container, such as a definition file, that the container was updated or that vim was installed. In general, if you're going to noodle around and use this sandbox as a development substrate to build a new container, you probably want to record all your changes in a new definition file and then use that ultimately to build your new container.
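Converting the sandbox back into a SIF is the same build command in the other direction (again a sketch with illustrative names):

    # Build a new SIF from the modified sandbox directory
    apptainer build --force python-new.sif python/

    # Better practice: record the changes in a definition file and build from that instead
    apptainer build --force python-new.sif python.def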
Is This a Good Way to Test Containers? [10:40]
Zane Hamilton:
Would you say this is a good way to test your containers? Just to repeat what you said: you build your definition file, build from that file system, and test it so you know that it works, without having to rebuild the container or change your definition file every time you iterate.
Dave Godlove:
I like to develop by shelling into a container with two windows open. One window has your definition file, and the other is where you interact with your container. You try many different things in the container and keep recording the things that work in your definition file until you break the container. Then you build a new container from your definition file, shell into it again, and continue to mess around, recording what works as notes in your definition file. You keep doing that until you ultimately have everything that you want installed. Then you've got your definition file, you build it one more time, and there you go.
Definition Files [11:38]
We should talk about definition files. Let me remove all my Python stuff. I have a definition file called example.def. I have been messing with it, so it has a lot of things commented out; I'm going to comment everything out to start.
This is an example definition file. It has all the commonly used sections that you would use to build a container. A definition file begins with a header, and after the header it contains a number of sections. I have all the sections commented out, so in this instance none of them are going to run when we begin. The header is where you tell Apptainer the base container you want to start with. I mentioned earlier the concept of a bootstrap agent; this is where you specify the bootstrap agent that you want to use. Depending on the bootstrap agent, you'll have different keywords that you can add to your header. Here, I'm saying I want to bootstrap a container from Docker Hub (that's what the docker bootstrap agent means), and I'm saying I want to start from Ubuntu.
This syntax might be unusual to some people. Most people might be more familiar with something like this, or with some tag here. I don't know offhand whether this is actually a tag; you would have to look on Docker Hub to see. It might say something like this. The reason I'm highlighting this particular syntax is that if you bootstrap from the latest version of an image, what latest points to might change. If you run the same definition file to build a new container tomorrow, it might not be the same as the one that you built today. This is true for all tags: while it's not best practice, developers can repoint tags to new containers anytime they want.
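A minimal header of the kind being described might look like this (the image and tag are illustrative):

    Bootstrap: docker
    From: ubuntu:22.04

    # "From: ubuntu" or "From: ubuntu:latest" also works, but what a tag
    # points to can change between builds.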
Sha1sum [14:00]
You can use a sha1sum to specify the exact image that you want: this is the only one I want. If you do that, and then a developer removes this image from Docker Hub, you'll get an error that says, “I can't find this image anymore.” This alerts you that things have changed. There are no more silent errors; you receive a specific error telling you this is not going to work anymore and you need a plan to move forward. That is the header. We're going to talk more about different bootstrap agents later.
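A digest-pinned From line of the kind just described looks roughly like this (note that current Docker-style registries generally report sha256 digests; the digest shown is a made-up placeholder):

    Bootstrap: docker
    From: ubuntu@sha256:<digest-of-the-exact-image-you-want>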
Jonathon Anderson:
I just want to mention another benefit of using a SHA1. You're pulling these down from the normal docker.io registry, but with the hash you could point at a different registry with the knowledge that you are receiving the same container, no matter where it comes from.
Post Section [14:46]
Dave Godlove:
That is exactly right. I want to start going through the different sections; we'll talk about the different bootstrap agents and different headers later on in my presentation. The most common section, the one you'll encounter in almost every definition file, is the post section. The post section is named that way because it runs after something: after your base container has been downloaded and created, Apptainer jumps into the container and runs this. It's a little scriptlet; anything you can write in Bash, you can write in post. Here I'm setting the environment variable DEBIAN_FRONTEND=noninteractive, which ensures that apt doesn't prompt me for any input when I use it. I'm setting this PKG variable to wget, and then I'm using apt to install the package.
When I build it, I'm going to use this fakeroot flag, and I'm going to use force, which means: if this container already exists, overwrite it. It doesn't exist right now, but I'm going to build the same container over and over again, so we might as well just start using force.
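The post section and build command being described look something like this (a sketch; file names are illustrative):

    %post
        export DEBIAN_FRONTEND=noninteractive   # available at build time only
        PKG=wget
        apt-get update -y && apt-get install -y $PKG

And the build itself, overwriting any existing SIF:

    apptainer build --fakeroot --force example.sif example.def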
What Happens if You Don’t Use Force? [16:39]
Zane Hamilton:
If you don't use force, does it just overwrite it, or does it error out?
Dave Godlove:
It does neither. It prompts you and gives you a yes/no prompt and says, “are you sure you want to do this?” I can demonstrate this later if you want.
Is the Post Section Definitively Bash or Shell? [16:56]
Jonathon Anderson:
Is the post section always explicitly and definitively Bash or Shell? Can you tell it to use different shells?
Dave Godlove:
I am not sure whether the current version of Apptainer works this way. There have been past versions of Singularity/Apptainer that allowed you to specify the interpreter for the run script; for instance, you could specify a different shebang and use that to create a run script written in Python. I don't know anything specific about that relating to post, and I don't know if the current version of Apptainer does it. I would need to do some tests to figure that out, so I'm not prepared to say definitively right now.
It went through, and everywhere you see a little plus sign, that's showing you what ran during your post section. I set the environment variable, I installed wget, and then it created the SIF file. Now I can do something like exec to make sure wget is there. That is pretty straightforward. The post section is analogous to anything that's prefaced with the RUN directive in a Dockerfile.
Environment Section [18:32]
I'm going to go through the environment section, which is another very common section. I've explicitly created the same environment variable in the environment section as I created in post, to highlight that these two are fundamentally different. If you create an environment variable in post, that environment variable will be available to you at build time but not at runtime. If you create this FOO=bar in the environment section, it's not going to be available during build time. I just wanted to highlight that. You will also see LC_ALL=C in the environment. This sets your locale; there are things like Perl and some Python packages that care about how the locale is set, and this is a simple way to set it generically and make sure that you don't get warnings and errors.
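The distinction being made looks roughly like this in a definition file (variable names from the demo):

    %post
        export DEBIAN_FRONTEND=noninteractive   # visible during the build, gone at runtime

    %environment
        export FOO=bar        # visible at runtime, not during the build
        export LC_ALL=C       # set the locale generically to avoid Perl/Python warnings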
In the interest of time, let me uncomment a few of these before I go ahead and build this. I'm going to uncomment the next two and explain what they're doing. Here we have a runscript section, and I should also highlight that the environment section actually gets run; it is a scriptlet too. Typically you would just set environment variables there, but you can do other things. It's usually not considered best practice to do that in the environment section; if you want your container to do something dynamically, you should place that in the runscript. Now we have these environment variables.
Can You Overwrite Environment Variables? [20:24]
Zane Hamilton:
I have a question on the environment variables. Can you overwrite those on the command line? Can you overwrite those variables as you execute a container?
Dave Godlove:
Yes, you can do that. There are special environment variables within Apptainer that you use to inject environment variables into the container at runtime, and there are special environment variables that you can use to prepend and append to the path at runtime. There are ways to change these at runtime if that's what you're looking to do.
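As a sketch of the runtime-override mechanisms being referred to (flag and variable names as documented for current Apptainer; check your version, and note the container and variable names here are illustrative):

    # Inject or override a variable at runtime with the APPTAINERENV_ prefix
    APPTAINERENV_FOO=baz apptainer exec example.sif env | grep FOO

    # Or with the --env flag
    apptainer exec --env FOO=baz example.sif env | grep FOO

    # Prepend to PATH inside the container
    APPTAINERENV_PREPEND_PATH=/opt/bin apptainer exec example.sif env | grep PATH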
Zane Hamilton:
That is great. I've run across containers where that was nice to be able to do so.
Dave Godlove:
The runscript is just a little scriptlet. You're going to write Bash here, unless you change the shebang to something else. Changing the shebang in the runscript is not as easy as just writing a new shebang here; I actually forget off the top of my head how to do it. I should have brushed up on that before doing the demo. I'm pretty sure the current version of Apptainer can change that.
Jonathon Anderson:
I thought we had it confirmed that it is possible.
Dave Godlove:
I haven't messed around with that since I was in the 2.x series. The files section is something that I fought really hard to have included. You put a path to a file on your host system and then a path to your file within your container, and it copies files from your host system into your container at build time. So that is the syntax there. Let me go ahead and run this again and we can see what's going on inside the container.
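The files and runscript sections being discussed might look like this (the poem's file name and destination path are illustrative):

    %files
        too_many_daves.txt /opt/too_many_daves.txt   # host path, then container path

    %runscript
        # Runs when you do "apptainer run example.sif"; a shebang is added for you
        echo "Here is a poem about Daves:"
        cat /opt/too_many_daves.txt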
Jonathon Anderson:
While that's building: you mentioned that the environment section is a scriptlet similar to post. The one place where I've used that to good effect is when I install HPC applications inside a container and they come with Lmod module definitions inside. I used the environment section to set up environment modules and then load those modules into the environment so they would be ready for the runscript to run. To do that, you source an lmod.sh or something like that. I was pleased to discover that it worked exactly as I wanted it to.
Dave Godlove:
That is a very cool and creative use of the environment section. That's one of the things I like about Apptainer: there are best practices for how you can use it, but it's flexible, and there are other things that you can do with it. I used the files directive to copy in a file called Too Many Daves. That is a little Dr. Seuss poem. I decided to include it because there are lots and lots of Daves in the Apptainer/Singularity community, and I thought that would be funny.
I’m going to open up this SIF file. I copied the Daves poem in with the files section.
.singularity.d/ [24:04]
I set some environment variables; let's have a look at those. FOO got set appropriately, and I trust that the others did as well. I also created a runscript, and I want to show you where that and the environment variables end up. If I cd to the root of the file system and list the directory, there's actually a hidden directory here called .singularity.d/. We're going to start getting into more advanced content. This .singularity.d/ directory is where configuration for the container itself lives within the container. If I look at it, I can see this thing called Singularity, I can see a directory called env, and I can see something called runscript. Let's look at that really quick. That's the runscript that I put into the definition file. Notice it's got the shebang here; by default, a shebang is added for you. If you add your own shebang at the top of the runscript, it won't do anything; it'll just add the default shebang underneath. To change that, there's a directive that you add to the end of your runscript section.
Env Sub Directory [25:55]
Let's look at the env subdirectory here. This is just a whole lot of little scripts. These little scripts initially set up the base environment: this is what the path ought to look like, and if you've imported this from Docker, it takes the environment variables from Docker and actually puts those in.
90-environment [26:31]
90-environment is where the environment variables that you've put in yourself end up. When you run the container, all of these scripts get run in order, and it layers up the environment variables. The ones you were talking about, Zane, where you might inject other environment variables at runtime: those get written into a script and sourced last, so they overwrite anything that was sourced earlier in the environment. There's some other stuff here too that has to do with apps. You can install multiple different apps within a container; there is a way to do that. I'm not going to get into it, but it's a pretty cool feature that we've had in Apptainer for quite some time. That's how the environment works.
Definition File [27:32]
Then there was this mystery file called Singularity here. This is from back when Apptainer was still called Singularity, and everything and its brother was called Singularity.
This happens to be the definition file that was used to create this container. This is how you always have a record of what was used to create the container when you built it. It's worth noting, too, that even when you do one of those pull or build commands with a URI, Apptainer translates that command into a short little definition file, uses it to create your container, and jams that definition file into this Singularity file. For instance, you can still see that you used the docker bootstrap agent and that you got it from the following container.
Labels [28:31]
Labels is a section that you can use to put arbitrary metadata into the container. This stuff doesn't actually end up in the container's file system, but it ends up in the SIF file. You can use this to do things such as declare your authorship or what version of the container it is.
The Help Directive [28:57]
The help directive is not actually a scriptlet; it is only text. If you use the run-help command with the container afterwards, it will display this text to you. You can use this to give your container some help for users down the road.
Test Section [29:21]
Test is an interesting section. At the conclusion of the build, the test section will run, unless you specifically tell it not to. After the container is built, you can also run the test section again. Depending on what the test does, running it again later may or may not make sense.
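Rough sketches of the labels, help, and test sections just described (values are illustrative):

    %labels
        Author dave@example.com
        Version v0.0.1

    %help
        This container provides wget on an Ubuntu base.
        Run it with: apptainer run example.sif

    %test
        # Runs at the end of the build (skip with: apptainer build --notest ...)
        # and can be re-run later with: apptainer test example.sif
        which wget || exit 1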
Is Authorship and Metadata Included in the Payload? [29:47]
Jonathon Anderson:
Do you know if the authorship and other metadata are included in the signed payload if you sign a container?
Dave Godlove:
Yes, it is.
Are Changes Captured in the Final Container? [30:02]
Jonathon Anderson:
That makes sense. My second question is: if the test scriptlet changes something, such as a change to the file system, if it writes something out, will that change get captured in the final container, or is it discarded?
Dave Godlove:
That's a good question. I don't think it can change the file system.
Jonathon Anderson:
Do you think it's read only by that point?
Dave Godlove:
I think that at that point the container has already been created, and I think the test gets run then. It can't change anything that's root owned because it runs with normal permissions. That'd be an interesting thing to play with. I don't know.
At the end here, we have extra info telling us that we've added labels and a test script, and that we are running the test script. It tells us that wget has been installed. If I want to run that again, I can use the test command to check for wget. The test script is added and run before the SIF file is created; at that point, the container is not yet immutable. That's why, thinking about your question, I'm not totally sure whether you could alter the file system with it or not. That's a good question. It's something I’ll have to mess around with later.
What Labels Look Like [31:44]
I wanted to show you what the labels look like really quick, if you wanted to inspect the container for labels. Your container ends up with a lot of metadata whether you put it there or not. It's all called org.label-schema; it looks like this because of the OCI formatting. You can see that I added some arbitrary metadata as well: the version of the container and a username/email address. Then finally you can run help on the container, and all the text that you put in the help section will be output. Down the road, you or someone else can look at all of that, so the container is well documented.
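The inspection commands mentioned here look something like this (container name is illustrative):

    apptainer inspect example.sif             # show labels (org.label-schema.* plus your own)
    apptainer inspect --deffile example.sif   # show the definition file stored in the SIF
    apptainer run-help example.sif            # print the %help text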
A Few More Things About Definition Files [32:58]
At this point I'm going to get into some dicey stuff that you probably don't want to mess around with. It occurred to me as I was putting this together that there are a few other things you can do with definition files that I'm not really covering today. One of those is that there are keywords you can use to set up different apps. I don't know how widely this is used, but it's a pretty cool feature. You can install multiple different apps in your container, and you can do so in such a way that the container itself is made aware of the different applications installed in it, and you can call the different apps through Apptainer itself. That's a whole other topic that would take a long time to go through.
I encourage you to look at the Apptainer documentation; it's pretty well documented and it's a nifty little feature. Another thing that I'm not going to cover today is multi-stage builds. For example, say you have some complicated build toolchain that you have to install to compile some code in the container. You don't want that in the final container, because it's just going to be bloated. You can create one container that has all the build tools installed in it, compile your code, and then, in the same definition file, create another container that has just the libraries you compiled against and whatever else you need, and dump the compiled executable into the new container.
The first container is a throwaway container. This is a common and straightforward thing that most people who use Docker are already familiar with. Go creates statically linked executables by default, so a lot of times you'll compile some Go code and then just take the resulting executable and dump it into a bare or very small container. Then you have a container that runs the Go executable. Check out the Apptainer documentation; it's well documented how to do those things.
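A minimal multi-stage definition file along the lines just described might look like this (stage names, image tags, and paths are illustrative; the Go source is assumed to be supplied by you):

    Bootstrap: docker
    From: golang:1.21
    Stage: build

    %post
        mkdir -p /opt/src && cd /opt/src
        # (copy or fetch your Go source here, then compile it)
        go build -o /opt/myapp .

    Bootstrap: docker
    From: alpine:3.19
    Stage: final

    %files from build
        /opt/myapp /usr/local/bin/myapp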
Setup [35:11]
Now I want to talk about a couple of sections which are not very widely used, and when you would and wouldn't use them. Setup runs on the host system, not inside the container, and it runs after the container has been downloaded. It used to be that if you wanted files to go from your host system into the container, you would use setup. Since then, the files section has been added to definition files, so setup is not really used anymore. It's dangerous, and I'm going to show you why it's dangerous later. You probably shouldn't use it unless you've got some weird corner case where you absolutely have to.
Pre [35:59]
Pre can also be a little bit dangerous, because it also runs on the host, not in the container. It runs before the container gets downloaded: like post means after, pre means before, and it runs before the container is actually downloaded. Since it runs on your host system, it's a little dangerous to use, but there may be some places where you really do want to use it. I'll show you why later on.
Troubleshooting [36:38]
I’m going to build this again.
Now it runs that pre scriptlet, and it echoes "Don't use this unless you know what you're doing," because that's what I put in pre, and then it sleeps for three seconds so we've got time to read that. Then it tried to do the setup section and failed with permission denied. Why do I have permission denied? I ought to be able to create a file called made-by-setup at the root, right? Well, I ran this as fakeroot. Maybe I should just run it as real root.
I was hoping it would let me create a setup file in my container. Wouldn't that be wonderful? It's going to go through and do all this stuff. I think I'm getting messages to the effect that I've been fired.
Hopefully my container worked this time; my internet is being slow now. Now I should see inside my container the file that setup was supposed to make, made-by-setup or whatever I called it. Unfortunately, I don't see it here.
It’s irritating, but look at the root file system of my host: I've now got this file, made by setup, sitting at the root. It's owned by root, and I made it with a definition file. That's why setup and pre are a little bit dangerous: they run on your host system, and sometimes you build containers as root. If the scriptlet decided to delete a bunch of stuff, for instance, I'd be in big trouble, especially if this was a host system that I really used a lot and cared about. You probably shouldn't use those. Let me go ahead and remove this. Are there any questions before I talk about different bootstrap agents?
Jonathon Anderson:
I think you mentioned pre more than setup. Are there instances where you might want to use pre? Did I miss that, or did you talk about some reasons to use it?
Different Bootstrap Agents [39:53]
Dave Godlove:
As I start talking about different bootstrap agents, pay attention because there is one in which pre becomes important. It's probably one that people don't know about very much.
Okay, I have here a directory called apptainer. This is the Apptainer source code that I just grabbed directly from GitHub; you can grab it too. If you go into it, you'll see that there is a subdirectory called examples. Examples has all these short little definition files, and mostly what they illustrate is different bootstrap agents. So rather than write a whole bunch of these myself, I decided to just use this directory. I've written a couple that aren't highlighted here very well; I guess I should make some pull requests and try to highlight those a little better. For the most part this is a pretty nice and complete library of the different bootstrap agents that you might use. The docker bootstrap agent grabs things from Docker Hub; that makes sense. In this case we are grabbing the official BusyBox container.
There's also a library agent, so you can configure Apptainer to point at a library. A library is a container registry that contains native SIF files. Instead of grabbing something that's a bunch of tarballs, like an OCI format, you can grab things that are native SIF files. That's what this looks like; this particular one points to the Sylabs library and the official Alpine 3.11.5 container. There's another cool one that allows you to interact specifically with SIF, and that bootstrap agent is called oras. ORAS is an OCI method for storing arbitrary data in an OCI or Docker style registry, which is cool because it means that you can store SIF files in an OCI registry. In this case, we're using the oras bootstrap agent to download a container from the GitHub Container Registry. You can use that yourself to store your containers and then grab them and use them as base containers to build your containers.
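Header sketches for the library and oras bootstrap agents (the registry path for oras is a placeholder, not the exact one from the examples directory):

    # library agent: pull a native SIF from a configured library
    Bootstrap: library
    From: alpine:3.11.5

    # oras agent: pull a SIF stored in an OCI/Docker-style registry
    Bootstrap: oras
    From: ghcr.io/<org>/<image>:<tag>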
Local Image [42:58]
There's another cool one that I want to talk about that is less known. This is one of the ones that I wrote specifically for this demo. There's a bootstrap agent called localimage. Let me cd into that local image directory. I have a SIF file that I've created, and if I look at the local image definition file, the bootstrap agent is localimage. You give it either a relative or a full path to a SIF file on your local host.
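A localimage header of the kind described (the path is illustrative):

    Bootstrap: localimage
    From: /home/dave/containers/python.sif   # a relative path works too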
Then you can use that as the base in your definition file. That's pretty cool. It's a little less reproducible than some of the other methods that we're showing here, but it could be handy if you're doing development. Okay. Now, with all the different bootstrap agents that I've talked about so far, you start from an existing container, and there are some drawbacks to doing that. What if you want to create a container and you don't want to start from an existing container; you want to start from something else? Maybe you want to start from a mirror URL from which you would get the components for a file system, like Rocky Linux for instance, or maybe Ubuntu or Debian. How would you go about doing that? Let's take the example of Rocky. Let me cd up one level.
The way that you would bootstrap new Rocky Linux images is to use yum to create your file system. This is the way you do it if it's not based on a container. This is going to run outside of your container, because it has to put the file system together and make a container for it to jump into. For this to work, you have to have yum installed on your host. Once you've got the yum bootstrap agent, you get some different keywords that you can add to your header. For instance, we have the MirrorURL keyword. The Include keyword takes the configuration of yum on your host system and puts that into the container, so you'll have a functioning, usable yum in your container. This is something specific to RPM based images that start from a mirror.
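A yum bootstrap header along these lines might look roughly like this (the mirror URL follows the general Rocky Linux layout and the included package is illustrative; check current mirrors, and note yum/dnf must be installed on the host):

    Bootstrap: yum
    OSVersion: 9
    MirrorURL: https://dl.rockylinux.org/pub/rocky/%{OSVERSION}/BaseOS/x86_64/os/
    Include: dnf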
This is cool, because almost all of the container ecosystem depends on existing containers and starting from existing containers. When you start from an existing container, you have to trust a bunch of people. You have to trust the entity or the organization or whoever built the container; you trust that they didn't put anything malicious in it. You also have to trust whoever is administering wherever the container is stored. When I download a container from Docker Hub, I am implicitly trusting the developers of Docker Hub itself. We are trusting that they have security best practices in place and are themselves not malicious. I'm sure they're not.
I've grabbed this container from a third party registry, and I trust that everybody protected it and did what they were supposed to do to make sure it doesn't get messed with, and that whoever originally created this container didn't mess with it. If you do this instead, really the only entity that you have to trust is whoever is maintaining these mirrors, and you have to trust someone one way or another. Anyway, this is if you're paranoid, and everybody ought to be, right? This is a much cleaner and better way to start off building your containers. You do lose some key features by doing this: it becomes clunkier and harder to write your definition files, and you can't use things like signatures or have your containers encrypted. I'm going to show you a compromise, where you build yourself your own library of containers to start off with from mirrors. You can sign them all to make sure that they don't get tampered with, and then you can use those as the base containers that you start off with. That way you can always be pretty well assured that you're starting off with a good, nice, clean container at the beginning of your build.
Debian Type Container in Ubuntu [47:49]
If you want to build a Debian type of container, Ubuntu for instance, that looks a little bit different. In that case, instead of using yum, or instead of using apt, there's a tool called debootstrap that you would use to create your file system. This is a tool specifically designed to create Debian style file systems, and it has to be installed on your system. This is a little weird: I'm running a Rocky host system here, but I actually have debootstrap installed on it, which is strange, and this works fine.
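A debootstrap header might look roughly like this (the release name and mirror are illustrative; debootstrap must be installed on the host):

    Bootstrap: debootstrap
    OSVersion: jammy
    MirrorURL: http://archive.ubuntu.com/ubuntu/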
Scratch Build [49:02]
This is the final thing I want to show. I see very few people using this, but it's pretty cool. One of the community members contributed this code a long time ago and had this great idea. I want to show you a scratch build. This is hardcore. You gotta really know what you're doing to do this.
There are two here: Alpine and BusyBox. I'm just going to look at Alpine really quickly. There's this bootstrap agent called scratch, and what it means is: give me an empty container that doesn't have anything in it, give me nothing, and then I will do everything. Going back to Jonathon's question, here's where the foreshadowing pays off: this is where you would actually use something like setup. Remember that setup runs before your container gets downloaded, and it runs on the host system. In this case, we're defining an environment variable which is a URL, then we're curling that URL to grab the Alpine file system. It's a tarball; we're untarring it and extracting it into the location given by a special environment variable, the container rootfs, which is only available at build time during the setup section.
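A sketch of a scratch build along the lines of that Alpine example (the minirootfs URL is a placeholder; current Apptainer exposes the rootfs path as APPTAINER_ROOTFS, while older Singularity used SINGULARITY_ROOTFS):

    Bootstrap: scratch

    %setup
        # Runs on the host, outside the container, before anything is downloaded
        curl -L https://dl-cdn.alpinelinux.org/alpine/<version>/releases/x86_64/alpine-minirootfs-<version>-x86_64.tar.gz \
            | tar -xz -C "${APPTAINER_ROOTFS}" --exclude=dev --exclude=etc/hosts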
It basically extracts the root file system into our container, and then we're just excluding the /dev and /etc/hosts files and directories because we don't need those for our container to work properly. With this scratch bootstrap agent, you can install anything you want inside a container. You could go through and use debootstrap, yum, or any of the others; you could use pacman; you could use basically anything with which you can create a file system. It doesn't even really have to be a root file system, though your container is not going to work very well under Apptainer if it doesn't have a root file system in it. Probably not, but it is possible. In any case, this gives you ultimate flexibility: however you want to build your container, go ahead and do it. I will take any questions about bootstrap agents.
Jonathon Anderson:
You mentioned that you lose out on some functionality like signatures when you use a Yum bootstrap agent. Why is that? Why wouldn't you be able to sign a container like that?
Dave Godlove:
That is a great segue into our next topic. It's not that you're unable to sign a container like that. You can easily sign a container like that; in fact, that's a great thing to do. But you can't start like that.
Jonathon Anderson:
The point of origin doesn't have a signature. I see.
Dave Godlove:
I hesitate to say that because it does
Jonathon Anderson:
The repository will have signatures and things like that.
Dave Godlove:
If you could download a SIF file that already has the root file system in it, instead of building that root file system from scratch every time you want to build a container, that would be a lot better, especially if you have a slow internet connection. It's a lot easier to start with an existing root file system. To do that, you can create something like this; this is practical, real world kind of stuff. When I was at the NIH, one of the things I was tasked with was to come up with a library of base containers that we could vouch for, that we knew right off the get go didn't have malicious code in them. We would use those as the base containers that we build our containers off of. Those still exist; they're up in the library. If I do something like an apptainer pull from the library,
I've got it in a collection that I call secure in my library.
What's it named? I did this in the repo; I didn't mean to do this in the repo, but in any case, here's what it's called: CentOS 8 latest. So I can inspect the container and look at the definition file. Okay, this is the definition file that I used like two years ago to create this container. I built this right from the mirrors, so I didn't use any intermediate container that somebody else had built, and I didn't even use a container registry; I just used the mirrors. This thing has been sitting on a third party registry for the past two years, though. How do I know that this container hasn't been tampered with? Well, that's where the verify command comes in.
I can run verify, go look up my key, and confirm that's the signature that I signed this thing with two years ago. So I know that this container is a bit for bit reproduction of the one that I made a few years ago and that it was made from these mirrors; I don't really have to trust anybody. When I use this as a base container, I can be pretty confident that it's clean and has in it what I know is supposed to be in it. And I've got a couple of CentOS containers of a couple different versions, a few Ubuntu containers, and a few Alpine containers. All of those can be used as nice clean base images to start off with. Any questions about that workflow? Let me close the loop, though: how do you check the signature when you do a build? Do you download the container, then do a verify, and then build from the local container? No, you don't have to do that.
Right there in the definition file, you can put the fingerprint keyword. And this is really nice, because the build is going to bomb out immediately; it's not going to try to run anything if it sees that this container is not what it's supposed to be. It just runs the verify command on your behalf at build time, and as long as the container checks out and it's good, it'll go ahead and build your container. Then you'll have a record, right in the container's definition file, of what you built, where you built it from, what the fingerprint was, and everything you need. Hopefully that closes the loop.
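A definition file using the fingerprint check just described might look roughly like this (the library path and fingerprint are placeholders):

    Bootstrap: library
    From: <user>/secure/centos-8:latest
    Fingerprints: <40-character-key-fingerprint>

    # You can also verify a signed SIF manually at any time:
    #   apptainer verify centos-8.sif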
Zane Hamilton:
I think so. No, it's fantastic. Thank you. If we don't have any questions, and I haven't seen any come in, I'll ask if anybody has anything they want to say before we leave; we're right up on time. Closing thoughts, Jonathon?
Jonathon Anderson:
No, this is great. Like you said, Zane, a lot of the specifics of some of those less common bootstrap agents were something that I hadn't seen. Thanks for going over it with us, Dave.
Forrest Burt:
That is absolutely true. There's a lot of info there that I didn't know, such as those bootstrap agents. There were a lot of great tidbits.
Zane Hamilton:
Thank you for the comments. On that note, we will wrap up. Dave, thank you very much. Join us in two weeks for the next webinar in this Apptainer education series. Like and subscribe, and we will see you soon. Thanks, guys.