Running Nextflow Pipelines on Fuzzball: First Release of the nf-fuzzball Plugin

September 18, 2025
Table of contents

  • A Brief Tutorial: From Setup to Execution
  • Prerequisites
  • Installation and Setup
  • Submitting a Nextflow Pipeline
  • Nextflow pipeline execution
  • What's Next: Roadmap and Improvements
  • Conclusion
  • Get Involved

Contributors

Wolfgang Resch

Nextflow is a widely used workflow orchestration tool in bioinformatics. It lets users build portable, reproducible workflows based on containers, and its plugin interface allows alternate execution backends to be added.

The nf-core community has grown around Nextflow to provide a set of curated, well-maintained best-practice workflows for common data types and research scenarios such as RNA-Seq for gene expression analysis, variant calling, genome assembly, and epigenomics. These workflows are data- and performance-intensive and require significant storage and compute resources.

Fuzzball is CIQ's container-first, performance-intensive computing platform that combines workflow execution with infrastructure provisioning across local, cloud, and hybrid environments. We are announcing the initial release of the nf-fuzzball Nextflow plugin, which allows Fuzzball users to execute the best-practice workflows maintained by the nf-core project on Fuzzball-orchestrated resources.

Here we show how to get started with Nextflow on your existing Fuzzball deployment. This approach allows users to run existing Nextflow workflows while leveraging Fuzzball's infrastructure management capabilities.

A Brief Tutorial: From Setup to Execution

Prerequisites

  • A Fuzzball deployment and a local install of the Fuzzball command line interface (CLI)
  • Python >= 3.12

Installation and Setup

The current implementation uses a submission script approach that requires minimal setup. The script can be obtained from the nf-fuzzball repository like so:

# Download the submission script from the nf-fuzzball plugin repository
wget https://raw.githubusercontent.com/ctrliq/nf-fuzzball/refs/heads/main/submit/submit_nextflow.py

# Create a Python virtual environment and install dependencies
python3 -m venv nf-fuzzball-env
source nf-fuzzball-env/bin/activate
pip install pyyaml requests
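
Before running the script, it can help to confirm that the environment matches the prerequisites. The following is a minimal sketch and not part of nf-fuzzball: the function name `check_prerequisites` is hypothetical, and it assumes the CLI binary on your PATH is named `fuzzball` (as in the `fuzzball context login` command) and that the `pyyaml` package is imported as `yaml`.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(version_info=sys.version_info,
                        which=shutil.which,
                        find_spec=importlib.util.find_spec):
    """Return a list of human-readable problems; an empty list means all checks passed.

    The three parameters are injectable only so the checks can be exercised
    without depending on the machine the code runs on.
    """
    problems = []
    # The post lists Python >= 3.12 as a prerequisite.
    if tuple(version_info[:2]) < (3, 12):
        problems.append("Python 3.12 or newer is required")
    # Assumption: the Fuzzball CLI executable is called "fuzzball".
    if which("fuzzball") is None:
        problems.append("fuzzball CLI not found on PATH")
    # pip package name -> importable module name
    for package, module in (("pyyaml", "yaml"), ("requests", "requests")):
        if find_spec(module) is None:
            problems.append(f"missing Python package: {package}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Run it inside the activated virtual environment; no output means the checks passed.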

Check the script's help output:

python3 submit_nextflow.py --help
usage: submit_nextflow.py [options] -- <nextflow_cmd>

Submit a nextflow pipeline to Fuzzball.

Notes:
  - Paths for input, workdir, and output in your nextflow command are relative to the
    data volume mounted at /data.
  - Any explicitly specified config and/or parameter files will be included in the
    fuzzball job but implicit files (i.e. $HOME/.nextflow/config and ./nextflow.config)
    will not.
  - config and parameter files should be specified on the commandline directly rather than
    indirectly in a config file.
  - The nextflow command should be specified after the -- separator.
  - Include the fuzzball profile as one of your nextflow profiles

positional arguments:
  nextflow_cmd          Nextflow command

options:
  -h, --help            show this help message and exit
  -c, --context CONTEXT
                        Name of the secret context to use from config.yaml. Defaults to the active context in the config file.
  -v, --verbose         Dump the workflow before submitting and add debug logging
  --fuzzball-config FUZZBALL_CONFIG
                        Path to the fuzzball configuration file. [$HOME/.config/fuzzball/config.yaml]
  -n, --dry-run         Don't submit the workflow, just print it
  --job-name JOB_NAME   Name of the Fuzzball workflow running the nextflow controller job. Defaults to a UUID seeded by the full commandline of the nextflow command.
  --nextflow-work-base NEXTFLOW_WORK_BASE
                        Name of basedirectory for nextflow execution paths. The nextflow execution path will be /data/<nextflow-work-base>/<job-name> which would include logs and the default workdir. [nextflow/executions]
  --nf-fuzzball-version NF_FUZZBALL_VERSION
                        nf-fuzzball plugin version [0.0.1]
  --s3-secret S3_SECRET
                        Reference for fuzzball S3 secret used to pull the nf-fuzzball plugin if the base URI for the plugin download is a S3 URI Defaults to []
  --plugin-base-uri PLUGIN_BASE_URI
                        Base URI for the nf-fuzzball plugin. The submission script expects to find a zip file at <plugin-base-uri>/v<version>/nf-fuzzball-<version>-stable-v<fuzzball-version>.zip. Defaults to [https://github.com/ctrliq/nf-fuzzball/releases/download]
  --nextflow-version NEXTFLOW_VERSION
                        Nextflow version [25.05.0-edge]
  --timelimit TIMELIMIT
                        Timelimit for pipeline job [8h]
  --scratch-volume SCRATCH_VOLUME
                        Ephemeral scratch volume [volume://user/ephemeral]
  --data-volume DATA_VOLUME
                        Persistent data volume [volume://user/persistent]
  --nf-core             Use nf-core conventions
  --queue-size QUEUE_SIZE
                        Queue size for the Fuzzball executor. This is the number of jobs that can be queued at once. [20]

Example:
  submit_nextflow.py -- nextflow run \
      -with-report report.html \
      -with-trace \
      -with-timeline timeline.html \
      hello
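
Two of the options in the help text describe path templates: the plugin download location (`<plugin-base-uri>/v<version>/nf-fuzzball-<version>-stable-v<fuzzball-version>.zip`) and the execution path (`/data/<nextflow-work-base>/<job-name>`). The sketch below only expands those templates with the defaults from the help output, to show what the resulting strings look like. The function names are hypothetical, this is not the submission script's actual code, and it assumes that `<fuzzball-version>` is the server version without its leading `v`. Whether a release asset exists for a given version combination is not checked.

```python
from pathlib import PurePosixPath

# Default taken from the --plugin-base-uri help text.
DEFAULT_PLUGIN_BASE_URI = "https://github.com/ctrliq/nf-fuzzball/releases/download"


def plugin_zip_url(plugin_version, fuzzball_version,
                   base_uri=DEFAULT_PLUGIN_BASE_URI):
    """Expand the plugin URL template from the --plugin-base-uri help text."""
    # Assumption: a server version reported as "v2.2.1" is used as "2.2.1".
    fuzzball_version = fuzzball_version.removeprefix("v")
    base = base_uri.rstrip("/")
    return (f"{base}/v{plugin_version}/"
            f"nf-fuzzball-{plugin_version}-stable-v{fuzzball_version}.zip")


def execution_path(job_name, work_base="nextflow/executions", mount="/data"):
    """Expand /data/<nextflow-work-base>/<job-name> from the help text."""
    return str(PurePosixPath(mount) / work_base / job_name)


if __name__ == "__main__":
    print(plugin_zip_url("0.0.1", "v2.2.1"))
    print(execution_path("rnaseq-analysis"))
```

With the default plugin version 0.0.1 and a job named `rnaseq-analysis`, the logs and default work directory would land under `/data/nextflow/executions/rnaseq-analysis`.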

Submitting a Nextflow Pipeline

The submission process uses the familiar Nextflow command structure with additional Fuzzball-specific options.

First ensure that you have an active Fuzzball context:

fuzzball context login

Now you can submit a pipeline. In our case, we will use the nf-core/rnaseq workflow with the test data set:

python submit_nextflow.py \
  --job-name "rnaseq-analysis" \
  --data-volume "volume://user/persistent" \
  --scratch-volume "volume://user/ephemeral" \
  --nf-core \
  -- \
  nextflow run nf-core/rnaseq \
  -profile fuzzball,test \
  -resume \
  --outdir /data/nextflow/out/rnaseq-test
2025-08-28 10:08:48 - INFO - Connected to Fuzzball version v2.2.1 API server
2025-08-28 10:08:50 - INFO - Submitted nextflow workflow de9552db-9491-45d5-9a8f-d5ad851ea31e

The initial part of the command up to -- includes options for configuring the Fuzzball execution. The part of the command after -- is a standard Nextflow command. Note that the persistent volume is mounted at /data, so paths in the Nextflow command should be absolute and include the /data mount point.

Use --dry-run to print the workflow that would start the Nextflow controller process without actually submitting it.
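
If you submit pipelines from your own tooling, the split between script options and the Nextflow command can be made explicit by assembling the argument list in Python. This is a hedged sketch: `build_submit_command` and `has_fuzzball_profile` are hypothetical helpers, not part of nf-fuzzball. They only use flags that appear in the help output (`--job-name`, `--nf-core`, `--dry-run`) and enforce the note that the fuzzball profile must be among the Nextflow profiles.

```python
def has_fuzzball_profile(nextflow_cmd):
    """True if a '-profile a,b,c' argument includes the 'fuzzball' profile."""
    for flag, value in zip(nextflow_cmd, nextflow_cmd[1:]):
        if flag == "-profile" and "fuzzball" in value.split(","):
            return True
    return False


def build_submit_command(nextflow_cmd, job_name=None, nf_core=False,
                         dry_run=False, script="submit_nextflow.py"):
    """Return the argv list: script options, then '--', then the Nextflow command."""
    if not has_fuzzball_profile(nextflow_cmd):
        raise ValueError("include 'fuzzball' in the -profile list")
    argv = ["python3", script]
    if job_name:
        argv += ["--job-name", job_name]
    if nf_core:
        argv.append("--nf-core")
    if dry_run:
        argv.append("--dry-run")
    # Everything after the separator is passed through as the Nextflow command.
    argv.append("--")
    argv += list(nextflow_cmd)
    return argv


if __name__ == "__main__":
    cmd = build_submit_command(
        ["nextflow", "run", "nf-core/rnaseq", "-profile", "fuzzball,test",
         "-resume", "--outdir", "/data/nextflow/out/rnaseq-test"],
        job_name="rnaseq-analysis", nf_core=True, dry_run=True)
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping `dry_run=True` until the printed workflow looks right avoids submitting a misconfigured run.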

Nextflow pipeline execution

The script above starts a Nextflow controller job running as a Fuzzball workflow and handles all the necessary configuration:

[Screenshot: Nextflow controller job]

In the current implementation of the Fuzzball executor, Nextflow submits each task (209 in this example) as a single-job Fuzzball workflow with a configurable level of concurrency (20 by default):

[Screenshot: Running Nextflow tasks]
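
To get a feel for what the queue size means for a run of this size, the arithmetic can be sketched as follows. This is an idealized lower bound, not a model of the executor: it assumes all tasks are independent and are submitted in full waves. A real run refills the queue as individual tasks finish and must respect task dependencies. The function name is hypothetical.

```python
import math


def min_submission_waves(n_tasks, queue_size=20):
    """Fewest full 'waves' needed if at most queue_size tasks are queued at once."""
    if n_tasks < 0 or queue_size < 1:
        raise ValueError("n_tasks must be >= 0 and queue_size >= 1")
    return math.ceil(n_tasks / queue_size)


if __name__ == "__main__":
    # 209 tasks with the default queue size of 20: ceil(209 / 20) = 11
    print(min_submission_waves(209))
    # Doubling the queue size via --queue-size 40: ceil(209 / 40) = 6
    print(min_submission_waves(209, queue_size=40))
```

Under these assumptions the example pipeline needs at least 11 waves with the default setting. Raising `--queue-size` lowers that bound at the cost of more Fuzzball workflows in flight at once.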

Fuzzball's web interface provides monitoring of your Nextflow executions and allows access to all task logs for troubleshooting.

When the pipeline run completes, the Nextflow controller workflow exits.

[Screenshot: Nextflow pipeline complete]

In our example, the usual output files (e.g. execution report and the MultiQC report shown in the screenshots below) are saved to the output directory in the persistent storage volume volume://user/persistent and can be downloaded or viewed with one of the Fuzzball applications for file management.

[Screenshot: Nextflow pipeline outputs]
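
Because the persistent volume is mounted at /data inside the job, the location of the outputs within the volume is the `--outdir` path with the mount point removed. A minimal sketch of that mapping, with a hypothetical helper name, could look like this. It also rejects paths that would not end up on the data volume.

```python
from pathlib import PurePosixPath


def path_within_volume(container_path, mount="/data"):
    """Map an absolute path under the mount point to a path relative to the volume root."""
    path = PurePosixPath(container_path)
    if not path.is_absolute():
        raise ValueError(f"{container_path!r} is not an absolute path")
    try:
        return str(path.relative_to(mount))
    except ValueError:
        raise ValueError(f"{container_path!r} is not under {mount}") from None


if __name__ == "__main__":
    # The --outdir used in the example submission above.
    print(path_within_volume("/data/nextflow/out/rnaseq-test"))
```

For the example run, this gives `nextflow/out/rnaseq-test`, which is where to look when browsing volume://user/persistent.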

What's Next: Roadmap and Improvements

While this initial release successfully runs complete nf-core pipelines, we hope to continue improving the executor in the following areas:

  • Efficiency: We would like to reduce the number of separate Fuzzball workflows submitted by the nf-fuzzball executor plugin by batching tasks and/or using task arrays. This should reduce the scheduling overhead for Fuzzball.
  • Storage handling: Better support for remote storage scenarios
  • Fuzzball Workflow Catalog integration: Submit Nextflow pipelines directly from the Fuzzball GUI

Conclusion

By combining Nextflow's workflow orchestration with Fuzzball's resource provisioning, researchers can focus on their science rather than infrastructure management while using established community workflows.

The nf-fuzzball plugin is available now on GitHub with documentation and examples. Though in early development, it can handle production workloads and complex analyses.

Get Involved

Ready to try running your Nextflow pipelines on Fuzzball? Visit the nf-fuzzball GitHub repository for installation instructions, documentation, and examples. We welcome feedback, bug reports, and contributions from the community.
