Splitting Warewulf Images Between PXE and NFS
Please note that this procedure works on Warewulf 4.4, which is currently in release candidate status.
Warewulf 4 introduced compatibility with the OCI container ecosystem, which greatly streamlines the process of defining, importing, and maintaining compute node images compared to other systems--even compared to Warewulf 3! But one aspect of compute node images remains unchanged: they can quickly grow in size.
Warewulf (and the technique of PXE-booting a node image more broadly) expects that a compute node image will remain relatively small. Larger sets of software, like you might provide via an Environment Modules stack or, perhaps, via Spack, are typically deployed via a central NFS share, which is then mounted at runtime by the booted compute node. Even OpenHPC, with software packaged as operating system packages, supports this paradigm: packages are installed on the head node, land in /opt, and are then shared from the head node to compute nodes.
There are still benefits to maintaining this software as part of a compute node image, but an image that carries a full software stack can quickly grow to tens of gigabytes, making network booting difficult.
In this article I'll demonstrate how a full software stack can be managed together with a given compute node image while the resultant payload is split in place between PXE-served netbooting and an NFS-mounted file system.
NOTE
This procedure depends on support for /etc/warewulf/excludes, which was broken in Warewulf v4.3.0.
The root image
First, I start with the standard Rocky Linux 8 image as published by HPCng.
[root@wwctl1 ~]# wwctl container import docker://docker.io/warewulf/rocky:8 rocky-8-split
Installing some software
Using the OpenHPC project as a source, I install a set of typical scientific software. Most OpenHPC packages install software in /opt for distribution via NFS, which is what we're going to do, just a little bit differently than usual.
[root@wwctl1 ~]# wwctl container shell rocky-8-split
[rocky-8-split] Warewulf> dnf -y install 'dnf-command(config-manager)'
[rocky-8-split] Warewulf> dnf config-manager --set-enabled powertools
[rocky-8-split] Warewulf> dnf -y install epel-release http://repos.openhpc.community/OpenHPC/2/CentOS_8/x86_64/ohpc-release-2-1.el8.x86_64.rpm
[rocky-8-split] Warewulf> dnf -y install valgrind-ohpc {netcdf,pnetcdf,hypre,boost}-gnu9-mpich-ohpc
After installing the software, our image is approaching 2GB. This isn't egregious (and the compressed image as sent over the network is even smaller), but it gives us a point of comparison for what comes next.
[root@wwctl1 ~]# du -h /var/lib/warewulf/container/rocky-8-split.img{,.gz}
1.8G /var/lib/warewulf/container/rocky-8-split.img
651M /var/lib/warewulf/container/rocky-8-split.img.gz
Excluding the software from the final image
Warewulf consults /etc/warewulf/excludes within the image itself to define files that should not be included in the built image. For our example here, I exclude the full contents of /opt/, in anticipation that we'll be mounting it via NFS instead.
[rocky-8-split] Warewulf> cat /etc/warewulf/excludes
/boot
/usr/share/GeoIP
/opt/*
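If the excludes file is maintained by a script rather than by hand, it helps to add the pattern only once. The following is a minimal sketch, not part of Warewulf: add_exclude is a hypothetical helper name, and it assumes the excludes file is a plain list with one pattern per line, as shown above.

```shell
#!/bin/sh
# add_exclude FILE PATTERN (hypothetical helper)
# Append PATTERN to FILE unless an identical line is already present.
add_exclude() {
    file=$1
    pattern=$2
    # -x: match the whole line, -F: treat the pattern literally (so "*" is
    # not a regular expression), -q: no output, only an exit status.
    if ! grep -qxF -- "$pattern" "$file" 2>/dev/null; then
        printf '%s\n' "$pattern" >> "$file"
    fi
}

# Example, run inside `wwctl container shell rocky-8-split`:
#   add_exclude /etc/warewulf/excludes '/opt/*'
```

Running it twice leaves a single /opt/* line in the file, so it is safe to call from a provisioning script.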
After rebuilding the image with /opt/* excluded, the image is smaller, and further software installed under /opt no longer increases the size of the image delivered over PXE.
[root@wwctl1 ~]# du -h /var/lib/warewulf/container/rocky-8-split.img{,.gz}
1.1G /var/lib/warewulf/container/rocky-8-split.img
483M /var/lib/warewulf/container/rocky-8-split.img.gz
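To put a number on the reduction, the two du listings can be compared directly. This is only a small arithmetic sketch; percent_saved is a hypothetical helper name, and the inputs are the rounded sizes that du -h reported, so the results are approximate.

```shell
#!/bin/sh
# percent_saved BEFORE AFTER (hypothetical helper)
# Print the size reduction from BEFORE to AFTER as a percentage, one decimal.
percent_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) * 100 / before }'
}

percent_saved 651 483   # compressed image, in MB: prints 25.8
percent_saved 1.8 1.1   # uncompressed image, in GB: prints 38.9
```

So the compressed payload sent over the network shrinks by roughly a quarter in this example, and the saving grows with every additional package installed under /opt.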
Exporting the software via NFS
With the software in /opt excluded from the image, we need to export it via NFS instead. This is relatively easily done, though we must discover and hard-code paths to the container directory.
[root@wwctl1 ~]# readlink -f $(wwctl container show rocky-8-split)/opt
/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt
Add an NFS export to /etc/warewulf/warewulf.conf, restart the Warewulf server, and configure NFS with wwctl. Note that I've specified mount: false for this export, as I want to control which nodes will mount it: presumably nodes that aren't using this image should not mount this image's software.
nfs:
  export paths:
  - path: /var/lib/warewulf/chroots/rocky-8-split/rootfs/opt
    export options: rw,sync,no_root_squash
    mount: false
[root@wwctl1 ~]# systemctl restart warewulfd
[root@wwctl1 ~]# wwctl configure nfs
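Before moving on to the compute node, it can be worth confirming that the export actually exists on the server. The sketch below assumes that wwctl configure nfs has written the export to /etc/exports, the conventional location; adjust the file argument if your setup differs. is_exported is a hypothetical helper name.

```shell
#!/bin/sh
# is_exported PATH [EXPORTS_FILE] (hypothetical helper)
# Succeed if PATH is the first field of any line in the exports file.
# EXPORTS_FILE defaults to /etc/exports, which is an assumption about where
# the export ends up on this server.
is_exported() {
    awk -v p="$1" '$1 == p { found = 1 } END { exit !found }' "${2:-/etc/exports}"
}

# Example, on the Warewulf server:
#   is_exported /var/lib/warewulf/chroots/rocky-8-split/rootfs/opt && echo exported
#   exportfs -v    # lists what the NFS server is exporting right now
```

The file check only shows what was configured; exportfs -v shows what the running NFS server is actually exporting.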
Mounting the software on the compute node
We can mount this new NFS share just like any other, by listing it in fstab.
Warewulf typically configures fstab as part of the wwinit overlay. In order to mount this NFS share without setting mount: true for all nodes, I copy fstab.ww to a new overlay and add an additional entry.
[root@wwctl1 ~]# wwctl overlay list -a rocky-8-split
OVERLAY NAME   FILES/DIRS
rocky-8-split  /etc/
rocky-8-split  /etc/fstab.ww
[root@wwctl1 ~]# wwctl overlay show rocky-8-split /etc/fstab.ww | tail -n1
{{ .Ipaddr }}:/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt /opt nfs defaults 0 0
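The {{ .Ipaddr }} placeholder is filled in when the overlay is rendered. To illustrate what the resulting fstab line looks like, here is a small sketch that builds the same six-field entry by hand. fstab_nfs_entry is a hypothetical helper, and 10.0.0.3 is the server address that appears as the mount source in the findmnt output later in this article.

```shell
#!/bin/sh
# fstab_nfs_entry SERVER EXPORT MOUNTPOINT (hypothetical helper)
# Print an fstab line in the same form as the template entry:
#   <server>:<export> <mountpoint> nfs defaults 0 0
fstab_nfs_entry() {
    printf '%s:%s %s nfs defaults 0 0\n' "$1" "$2" "$3"
}

fstab_nfs_entry 10.0.0.3 /var/lib/warewulf/chroots/rocky-8-split/rootfs/opt /opt
```

This prints 10.0.0.3:/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt /opt nfs defaults 0 0, which is the line a node should end up with in its fstab.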
I can add the new overlay to our wwinit list, and the fstab in rocky-8-split will override the one in wwinit. (Note: --wwinit was specified as --system in Warewulf 4.3.0.)
[root@wwctl1 ~]# wwctl profile set --wwinit wwinit,rocky-8-split default
[root@wwctl1 ~]# wwctl profile set --container rocky-8-split default
From a compute node, we can see that /opt is mounted via NFS as expected.
[root@compute1 ~]# findmnt /opt
TARGET SOURCE                                                      FSTYPE OPTIONS
/opt   10.0.0.3:/var/lib/warewulf/chroots/rocky-8-split/rootfs/opt nfs4   rw,relatime,vers=4.2,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.4,local_lock=none,addr=10.0.0.3
We can further confirm that /opt is empty on the local, PXE-deployed file system.
[root@compute1 ~]# mount -o bind / /mnt
[root@compute1 ~]# du -s /mnt/opt
0 /mnt/opt
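The same check can be scripted, for example in a node health check. This is a minimal sketch: is_nfs_fstype is a hypothetical helper, and it only recognizes the nfs and nfs4 file system types.

```shell
#!/bin/sh
# is_nfs_fstype FSTYPE (hypothetical helper)
# Succeed if FSTYPE names an NFS file system type.
is_nfs_fstype() {
    case "$1" in
        nfs|nfs4) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example, on a compute node:
#   findmnt -n -o FSTYPE prints only the file system type, without a header.
#   is_nfs_fstype "$(findmnt -n -o FSTYPE /opt)" && echo "/opt is on NFS"
```

If /opt is not mounted at all, findmnt prints nothing and the helper fails, which is the outcome a health check wants.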
Future work
As demonstrated here, we can implement split PXE/NFS images using functionality already in Warewulf, but future Warewulf development may simplify this process further:
Container path variables in warewulf.conf
We could support referring to compute node images in warewulf.conf. For example, it would be nice to be able to replace
nfs:
  export paths:
  - path: /var/lib/warewulf/chroots/rocky-8-split/rootfs/opt
    export options: rw,sync,no_root_squash
    mount: false
with something like
nfs:
  export paths:
  - path: {{ containers['rocky-8-split'] }}/opt
    export options: rw,sync,no_root_squash
    mount: false
This way, our configuration would not have to hard-code the path to the container chroot.
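Until something like this exists, the hard-coded path can at least be generated rather than typed. The following is a minimal sketch, not a Warewulf feature: nfs_export_stanza is a hypothetical helper, and it assumes that wwctl container show prints the image's rootfs directory, as it did in the readlink example earlier. The output still has to be pasted into warewulf.conf by hand.

```shell
#!/bin/sh
# nfs_export_stanza ROOTFS SUBDIR (hypothetical helper)
# Print a warewulf.conf export entry for SUBDIR inside an image rootfs,
# resolving symlinks first. Fails if the result is not a directory.
nfs_export_stanza() {
    path=$(readlink -f "$1/$2") || return 1
    if [ ! -d "$path" ]; then
        echo "not a directory: $path" >&2
        return 1
    fi
    printf '  - path: %s\n' "$path"
    printf '    export options: rw,sync,no_root_squash\n'
    printf '    mount: false\n'
}

# Example, on the Warewulf server:
#   nfs_export_stanza "$(wwctl container show rocky-8-split)" opt
```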
Move NFS mount settings to nodes and profiles
Right now, NFS client settings are stored in warewulf.conf as mount options, mount, and implicitly via path; but if these settings were moved to nodes and profiles, we could configure per-profile and per-node NFS client behavior without having to manually edit or override fstab.