Run Fuzzball workflows on your existing Lustre, GPFS, and BeeGFS storage

Contributors

Wolfgang Resch, Research Computing Engineer

Your researchers keep their data on the parallel file system you already run. With Fuzzball volume provisioners, your containerized workflows read and write that data in place. Lustre, GPFS, BeeGFS, home directories, and project storage attach to Fuzzball jobs without copying data into a separate system and without rearchitecting the storage you depend on.

This is the change HPC sites consistently asked for. Earlier versions of Fuzzball handled storage through a storage class abstraction that worked but made it difficult for Fuzzball workflows to process the large amounts of data on existing storage systems. Volume provisioners solve this problem.

Fuzzball volume provisioners let any host-mounted file system serve data directly to your workflows, so your existing Lustre, GPFS, and BeeGFS deployments become first-class storage for containerized HPC.

Your file systems attach as they are

A volume provisioner exposes host-mounted storage directly to Fuzzball jobs without a translation layer or re-ingestion. Whatever your nodes already mount works: Lustre, GPFS, BeeGFS, home directories, project shares. If it's on the host, Fuzzball can use it.

Volumes are either persistent (data survives the workflow) or ephemeral (scoped to a single job, optionally on node-local storage to keep scratch I/O close to compute). You declare which each step needs and let Fuzzball handle the rest.

You can also attach storage that is managed entirely outside Fuzzball. Your existing tooling stays in control of that storage, and Fuzzball consumes it only for the duration of a workflow run.

Request storage by name or properties

Volume provisioners carry annotations describing the properties of their underlying storage, and workflows request storage by specifying the properties they require. Ask for what you need, and Fuzzball finds a matching provisioner. For example, a job might need a node-local, high-performance scratch space plus a large shared scratch space that is not performance sensitive.
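To make the idea of property-based matching concrete, here is a minimal Python sketch. It is not Fuzzball's API or configuration format: the `Provisioner` class, the `matching_provisioners` function, and the annotation keys (`locality`, `tier`) are all hypothetical names chosen for illustration. It assumes a provisioner matches when every requested property equals the corresponding annotation.

```python
from dataclasses import dataclass, field


@dataclass
class Provisioner:
    """Hypothetical stand-in for a volume provisioner and its annotations."""
    name: str
    annotations: dict = field(default_factory=dict)


def matching_provisioners(provisioners, required):
    """Return provisioners whose annotations satisfy every required property."""
    return [
        p for p in provisioners
        if all(p.annotations.get(key) == value for key, value in required.items())
    ]


# Illustrative site setup: one node-local fast tier, one shared capacity tier.
provisioners = [
    Provisioner("node-nvme", {"locality": "node-local", "tier": "fast"}),
    Provisioner("lustre-scratch", {"locality": "shared", "tier": "capacity"}),
]

# A job asking for fast node-local scratch plus large shared scratch.
fast_local = matching_provisioners(
    provisioners, {"locality": "node-local", "tier": "fast"}
)
big_shared = matching_provisioners(provisioners, {"locality": "shared"})
```

The workflow author states intent (fast and local, or shared and large) and never names a path, which is why the storage behind a request can change without editing the workflow.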

Persistent volumes can also be referenced explicitly by name alone, as long as the name is unique across the provisioners queried. In other words, the backing storage can live anywhere, or move, without touching the workflows that depend on it.
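The uniqueness requirement for name-based references can be pictured with a short Python sketch. This is an assumption-laden illustration, not how Fuzzball implements lookup: `resolve_volume`, the catalog layout, and the provisioner and volume names are hypothetical.

```python
def resolve_volume(name, catalog):
    """Find the single provisioner that owns a volume with this name.

    `catalog` maps a (hypothetical) provisioner name to the set of
    persistent volume names it serves.
    """
    owners = [prov for prov, volumes in catalog.items() if name in volumes]
    if not owners:
        raise LookupError(f"no volume named {name!r}")
    if len(owners) > 1:
        # A bare name only works when exactly one provisioner serves it.
        raise LookupError(f"volume name {name!r} is ambiguous: {sorted(owners)}")
    return owners[0]


catalog = {
    "lustre-projects": {"genomes-ref", "climate-2023"},
    "gpfs-home": {"home"},
}
owner = resolve_volume("genomes-ref", catalog)
```

If `genomes-ref` later moves to a different provisioner, the workflow's reference stays the same; only the catalog entry changes.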

Mapping your storage to Fuzzball? See how the platform fits performance-intensive workloads in the Fuzzball solution brief.

Sharing and permissions match how your site already works

A single persistent volume can be shared across your whole organization or scoped to a group. Shared reference data, common datasets, and team project space each get the visibility you intend.

Fuzzball enforces POSIX permissions inside jobs. The user and group identity that governs access on your file system governs access inside the workflow. This carries through to LDAP and Active Directory environments, so the multi-user and multi-group authorization your site already maintains continues to apply when jobs run under Fuzzball. Researchers access the data their user and group identity permits, the same as on the underlying file system.
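For readers less familiar with the POSIX model referred to here, the following Python sketch shows the basic owner, group, and other read check that a file system applies. It is a simplified illustration of standard POSIX mode bits, not Fuzzball code; the function name `may_read` and the example IDs are made up, and the sketch deliberately ignores ACLs and superuser privileges.

```python
import stat


def may_read(mode, owner_uid, owner_gid, uid, gids):
    """Simplified POSIX read check using only the mode bits.

    Exactly one permission class applies: owner if the UID matches,
    otherwise group if any of the caller's GIDs matches, otherwise other.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# A project file with mode 0640, owned by UID 1000 and group 2000.
mode = 0o640
owner_can_read = may_read(mode, 1000, 2000, uid=1000, gids={1000})
teammate_can_read = may_read(mode, 1000, 2000, uid=1001, gids={2000})
outsider_can_read = may_read(mode, 1000, 2000, uid=1002, gids={3000})
```

Because the same numeric user and group identity is evaluated inside the job as on the host, the answer for a given researcher is the same in both places.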

What this looks like for your site

Here is the practical sequence for a team adopting Fuzzball against existing storage.

You point a volume provisioner at the file system you already mount, whether that is a Lustre scratch space or a GPFS project tree. You annotate provisioners so workflow authors can request storage by intent rather than by path. Your users then write workflows that reference persistent volumes by name and request ephemeral, node-local scratch where their steps need fast temporary I/O. Throughout, POSIX permissions and your directory service keep access correct.

The data never leaves the file system your site administers. The workflows stay portable across the storage behind them.

Move your HPC workflows onto the storage you already trust

Volume provisioners let you adopt containerized, orchestrated HPC with direct access to your existing data under the same permissions. They also open up a new way to request storage for workflows: by storage properties instead of by simple names or paths.

See what Fuzzball can do for your cluster. Visit the Fuzzball product page.
