CIQ

Educating the Next Generation of HPC Engineers with Open Source Tools

Matthew Fricke, University of New Mexico, Department of Computer Science
February 19, 2025

Preamble

One of the highlights of the annual SC conference is the Student Cluster Competition, where teams of students from around the world compete to build, administer, and benchmark a high-performance cluster under time, budget, and power constraints. This past year was no different, and we were happy to meet multiple teams running Rocky Linux on their clusters. We were especially excited to meet the team from the University of New Mexico, including Professor Matthew Fricke and his TA, Ryan Scherbarth, who were also using Warewulf to manage their cluster!

In conversation after the conference, we learned that Professor Fricke has designed an entire HPC course using Rocky Linux, Warewulf, and Apptainer!

We’re beyond pleased for our work to be a part of educating the next generation of HPC (High Performance Computing) professionals. We hope you’ll enjoy his guest post here, highlighting the motivation and process behind his work.

Educating the Next Generation of HPC Engineers with Open Source Tools

High-Performance Computing (HPC) isn’t just the domain of national labs or massive enterprises anymore—it’s becoming an essential skillset for tackling modern technological challenges. At the University of New Mexico, an HPC course is equipping students to dive into this space, leveraging cutting-edge open-source tools like Rocky Linux, Warewulf, Slurm, and Singularity (now called Apptainer). The goal? To teach students how to build and manage their own HPC clusters and prepare them for real-world careers in computational science and artificial intelligence (AI).

HPC Skills Matter More Than Ever

HPC powers breakthroughs in nearly every field, from AI and data science to climate modeling and national security. As AI systems grow in complexity, they depend on the scalability and raw computational power of HPC clusters to process massive datasets and train advanced models. This symbiotic relationship between AI and HPC means that having a foundational understanding of these systems isn’t optional anymore—it’s essential.

But there’s a problem. In states like New Mexico, home to world-class institutions built on supercomputing such as Sandia National Laboratories, Los Alamos National Laboratory, the National Radio Astronomy Observatory, and the Air Force Research Laboratory, demand for HPC engineers far outstrips the supply of talent available to fill these high-tech roles. This course tackles that gap head-on, giving students the practical skills they need to step into these critical positions while creating new economic opportunities in the region.

That gap is also an opportunity for New Mexico students. The state faces unique economic challenges, including persistent poverty and a lack of access to high-paying jobs in many communities, and HPC offers an on-ramp to good careers in government and industry. Students in the class have already gone on to HPC careers at the National Labs and at private companies such as Tesla.

The Open Source Stack: Building Clusters from the Ground Up

This course is all about hands-on learning. Students aren’t just studying HPC; they’re building it. Here’s how they do it:

  • Rocky Linux provides the foundation. This community-driven enterprise Linux distribution is perfect for HPC, offering stability and compatibility at no cost.
  • Warewulf automates the heavy lifting. Students learn how to use this provisioning tool to deploy and manage scalable HPC clusters quickly and efficiently.
  • Slurm brings the power of orchestration. Students dive deep into the workload manager that drives some of the world’s largest supercomputers.
  • Apptainer unlocks portability. Students containerize applications, ensuring reproducibility and easy deployment across environments—an essential skill for modern scientific workflows.

This open-source stack doesn’t just lower barriers to entry—it gives students the same tools used by professionals in top research labs and enterprises.
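As a rough illustration of how two pieces of this stack fit together, the sketch below uses Python to render a Slurm batch script that launches a command inside an Apptainer image. It is not taken from the course materials: the function name render_batch_script, the image name, and the resource values are hypothetical. The #SBATCH options shown (--job-name, --nodes, --ntasks-per-node, --time) and the srun and apptainer exec commands are standard, though a real site would likely need more, such as a partition or account.

```python
def render_batch_script(job_name, nodes, tasks_per_node, time_limit, image, command):
    """Return the text of a Slurm batch script that runs `command`
    inside the Apptainer image `image` (all argument values are illustrative)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",
        "",
        "# srun launches the tasks across the allocation;",
        "# apptainer exec runs the command inside the container image.",
        f"srun apptainer exec {image} {command}",
    ]
    return "\n".join(lines) + "\n"
```

Calling render_batch_script("hpl-test", 2, 4, "00:30:00", "hpl.sif", "./xhpl") returns text that could be saved to a file and submitted with sbatch. Generating the script from a function keeps the resource request and the container invocation in one place, which makes it easier to rerun the same job at different scales.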

From Benchmarks to Real-World Impact

Every cluster needs to prove its worth, and students learn to measure performance using industry-standard benchmarks. Tools like HPL (High-Performance Linpack) and HPCG (High-Performance Conjugate Gradient) provide critical insights into system performance. HPL measures the floating-point rate a cluster sustains while solving a large dense linear system, a figure usually reported as a fraction of the machine’s theoretical peak, while HPCG exercises the sparse, memory-bound access patterns common in scientific computing. Together, these benchmarks give students a practical understanding of how to evaluate and optimize their systems.
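One common way to read an HPL result is to compare the measured rate (often called Rmax) against the cluster's theoretical peak (Rpeak), which is the product of node count, cores per node, clock rate, and floating-point operations per cycle per core. The Python sketch below works through that arithmetic. The function names and the hardware figures in the usage note are hypothetical, chosen only to make the numbers easy to follow, and are not results from the course or the competition.

```python
def theoretical_peak_gflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak (Rpeak) in GFLOPS.

    clock_ghz is in GHz, so cores * GHz * FLOPs-per-cycle gives GFLOPS.
    flops_per_cycle depends on the CPU's vector width and FMA support.
    """
    return nodes * cores_per_node * clock_ghz * flops_per_cycle


def hpl_efficiency(rmax_gflops, rpeak_gflops):
    """Fraction of theoretical peak achieved by a measured HPL run."""
    if rpeak_gflops <= 0:
        raise ValueError("rpeak_gflops must be positive")
    return rmax_gflops / rpeak_gflops
```

For a hypothetical 4-node cluster with 32 cores per node at 2.0 GHz and 16 FLOPs per cycle, theoretical_peak_gflops(4, 32, 2.0, 16) gives 4096.0 GFLOPS. If an HPL run on that machine reported 3072 GFLOPS, hpl_efficiency(3072.0, 4096.0) would be 0.75, meaning the run reached 75% of peak.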

But it’s not all about performance metrics. Students also deploy containerized AI workflows, gaining experience in how HPC clusters enable complex machine learning pipelines. Whether it’s training large language models or simulating physical systems, students see firsthand how HPC and AI are shaping the future of technology.

HPC in Action: Competing at SC24

The ultimate test of these skills came at SC24, the world’s largest supercomputing conference, where students competed in the Student Cluster Competition (SCC). This event wasn’t just a chance to showcase their technical expertise—it was an opportunity to work under real-world pressure, solving complex computational challenges alongside peers from around the globe.

One of the highlights of the event was meeting Greg Kurtzer, the founder of Rocky Linux, Apptainer (and Singularity before it), and Warewulf. For students who had spent months working with his tools, getting direct guidance and mentorship from one of the leading figures in the HPC community was a transformative experience. This kind of interaction underscored the power of open-source technologies to connect people, ideas, and innovation.

Closing the Gap Between Education and Industry

By the end of the course, students aren’t just familiar with HPC concepts—they’ve built functioning clusters, optimized them with real benchmarks, and applied them to modern AI workflows. This practical approach prepares them for careers as HPC engineers, AI developers, and open-source contributors. At the same time, it helps New Mexico meet its growing demand for local talent in high-performance computing.

With its focus on open-source tools, the course doesn’t just prepare students for the workforce—it aligns them with the global movement towards accessible, community-driven innovation. As AI and HPC continue to evolve, these students will be ready to shape the future of technology.

This kind of hands-on, open-source-focused training is a blueprint for how to equip the next generation of innovators. HPC isn’t just the backbone of AI—it’s the engine driving tomorrow’s breakthroughs. And thanks to tools like Rocky Linux, Warewulf, Slurm, and Apptainer, the future has never been more open.
