
Run Fuzzball HPC workflows natively on Oracle Cloud
Contributors
Chris Wolford, Director of Engineering
Fuzzball now deploys natively on Oracle Cloud Infrastructure (OCI), joining AWS, GCP, and on-prem bare metal as a first-class deployment target. A computational chemist can move a GROMACS simulation from GCP to OCI without changing anything in the workflow. This post shows what that looks like in practice, what OCI brings to high-performance computing (HPC) workloads, and what happens under the hood when a single fuzzball cluster oci deploy command stands up an entire production-grade cluster.
Write once, run on any cloud
Multi-cloud is a goal most engineering teams share, and one that's genuinely hard to execute. Teams maintain parallel toolchains: Terraform modules for AWS, separate deployment scripts for GCP, OCI Resource Manager templates, different monitoring stacks, and different Identity and Access Management (IAM) models. The "portable workflow" rarely makes it past the planning stage.
Fuzzball takes a different approach. The workflow definition (the file that describes your compute jobs, data movement, container images, and resource requirements) is provider-agnostic by design. Fuzzball's orchestration layer translates that abstract definition into concrete infrastructure on whatever cloud or bare-metal cluster sits underneath.
A computational chemist running GROMACS molecular dynamics simulations can target GCP today and OCI tomorrow. The container images, data ingress/egress paths, job sequencing, and resource requests stay identical across both environments.
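To make the idea concrete, here is a minimal Python sketch of a provider-agnostic definition. It is not Fuzzball's actual workflow format: the Job and Workflow classes, the plan helper, and the example.org image reference are all hypothetical names chosen for illustration. The point is only that the target provider lives outside the workflow, so switching clouds leaves the definition untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job description: no region, shape, or storage backend."""
    name: str
    image: str              # container image reference
    cpus: int
    memory_gb: int
    gpus: int = 0
    depends_on: tuple = ()  # names of jobs that must finish first


@dataclass(frozen=True)
class Workflow:
    name: str
    jobs: tuple


def plan(workflow, provider):
    """Pair an unchanged workflow with a deployment target."""
    return {"provider": provider, "workflow": workflow}


# Illustrative GROMACS-style workflow; the image reference is a placeholder.
md_run = Workflow(
    name="gromacs-md",
    jobs=(
        Job("prepare", "docker://example.org/gromacs:latest", cpus=2, memory_gb=8),
        Job("simulate", "docker://example.org/gromacs:latest",
            cpus=4, memory_gb=16, gpus=1, depends_on=("prepare",)),
    ),
)

gcp_plan = plan(md_run, "gcp")
oci_plan = plan(md_run, "oci")
```

Only the provider field differs between gcp_plan and oci_plan; the workflow object is the same in both.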
The same architecture already supports on-prem deployments on clusters provisioned with Warewulf or VMware, or built by hand, as well as cloud deployments on AWS and GCP. OCI is the newest target in a provisioning model built to accommodate many.
Deploy a production OKE cluster in minutes
Deploying Fuzzball on OCI starts with fuzzball cluster oci deploy. The command runs a two-phase provisioning process that stands up a complete, production-ready cluster without requiring the user to touch the OCI console.
Phase 1: Bootstrap. The CLI orchestrates the bootstrap. OCI Resource Manager (ORM) provisions an Object Storage bucket for Pulumi state and a transient runner network. The CLI creates Dynamic Groups and IAM policies via the Identity API and pushes the Pulumi runner image to Oracle Cloud Infrastructure Registry (OCIR).
Phase 2: Application infrastructure. Once bootstrap completes, the CLI executes a Pulumi program provisioning everything Fuzzball needs to operate: a private, regional Oracle Container Engine for Kubernetes (OKE) cluster; a managed PostgreSQL database with high availability; OCI File Storage Service (FSS) for workflow I/O and container image caching; a Virtual Cloud Network (VCN) with private subnets, NAT Gateway, and network security groups; and the Fuzzball operator itself, deployed via Helm.
Both phases complete without manual intervention. Fuzzball uses the same approach on OCI as it does on AWS and GCP, sharing core operator and observability components while using cloud-native resources appropriate to each platform.
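The two-phase sequence can be sketched as a small Python driver. This is an illustration under stated assumptions, not the CLI's implementation: run_phase, deploy, and the apply_step callback are hypothetical, the step labels are paraphrased from the text above, and the order of steps within each phase is illustrative rather than documented.

```python
# Step labels paraphrased from the post; ordering within a phase is illustrative.
BOOTSTRAP_STEPS = [
    "Object Storage bucket for Pulumi state",
    "transient runner network",
    "Dynamic Groups and IAM policies",
    "Pulumi runner image pushed to OCIR",
]
APPLICATION_STEPS = [
    "private, regional OKE cluster",
    "managed PostgreSQL with high availability",
    "File Storage Service volumes",
    "VCN with private subnets, NAT Gateway, and network security groups",
    "Fuzzball operator via Helm",
]


def run_phase(name, steps, apply_step):
    """Apply each step in order; stop at the first one that fails."""
    completed = []
    for step in steps:
        if not apply_step(step):
            raise RuntimeError(f"{name} phase failed at: {step}")
        completed.append(step)
    return completed


def deploy(apply_step):
    """Run bootstrap first; start the application phase only if it succeeds."""
    log = [("bootstrap", s)
           for s in run_phase("bootstrap", BOOTSTRAP_STEPS, apply_step)]
    log += [("application", s)
            for s in run_phase("application", APPLICATION_STEPS, apply_step)]
    return log
```

Because run_phase raises on the first failed step, a bootstrap failure prevents any application-phase step from being attempted.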
If your team is already evaluating multi-cloud HPC infrastructure, request a demo to walk through a deployment in your own OCI tenancy.
Production-grade security you configure once
For practitioners who want to understand what's actually running in their OCI tenancy, here are the key architectural decisions.
Compute. The OKE cluster runs as a private, regional deployment with per-node-pool autoscaling and configurable minimum and maximum node counts. Standard workloads land on VM.Standard.E4.Flex shapes; GPU-accelerated jobs run on VM.GPU.A10 or VM.GPU.A100 shapes with NVIDIA GPUs. Substrate VMs, the actual compute nodes where your containers execute, are provisioned dynamically based on workflow demand, with configurable boot volume sizes, memory minimums, and GPU image support.
Storage. Two tiers. OCI Object Storage handles log storage with versioning enabled. OCI File Storage Service provides configurable NFS volumes for workflow I/O data and container image caching, mounted into the cluster through the FSS CSI driver.
Database. Managed PostgreSQL with regional high availability, private network access only, mandatory Transport Layer Security (TLS), automated backup retention, and point-in-time recovery.
Security. OCI Identity Dynamic Groups and IAM policies map workload identities directly to OCI principals. No static key files to rotate or leak. Secrets are managed by Kubernetes. The OKE cluster uses private worker nodes with NAT Gateway for outbound access. Every layer follows least-privilege IAM bindings.
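The per-node-pool autoscaling described under Compute comes down to clamping demand between the configured minimum and maximum. The Python sketch below shows that arithmetic only; target_node_count and its parameters are hypothetical, and the "jobs per node" sizing rule is an assumption made for illustration, not Fuzzball's actual scaling policy.

```python
import math


def target_node_count(pending_jobs, jobs_per_node, min_nodes, max_nodes):
    """Return a node count for one pool, bounded by its configured limits.

    Assumes (for illustration) that every node can host `jobs_per_node` jobs.
    """
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    wanted = math.ceil(pending_jobs / jobs_per_node)
    # Clamp the demand-driven count into [min_nodes, max_nodes].
    return max(min_nodes, min(max_nodes, wanted))
```

With a minimum of 1 and a maximum of 10 nodes, an idle pool stays at 1 node, 9 pending jobs at 4 jobs per node ask for 3 nodes, and 100 pending jobs are capped at 10.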
Portability without the penalty
Practitioners should not need to think about any of that infrastructure detail during daily work.
When a Fuzzball user writes a workflow, they define compute requirements in abstract terms: "I need 4 CPUs, 16 GB of memory, and one NVIDIA GPU." They specify container images, data sources, and job dependencies. They don't specify cloud regions, instance shapes, or storage backends. Fuzzball's scheduler and provisioner translate those abstract requirements into concrete cloud resources at runtime.
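As a rough illustration of that translation step, the Python sketch below maps an abstract request to one of the shape families named earlier in this post. The ResourceRequest class, the choose_oci_shape function, and the selection rule itself are assumptions for illustration; Fuzzball's real provisioner logic is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRequest:
    """Abstract requirements, as a workflow author would state them."""
    cpus: int
    memory_gb: int
    gpus: int = 0
    gpu_model: Optional[str] = None  # e.g. "A100"; None means any NVIDIA GPU


def choose_oci_shape(req):
    """Pick a shape family using a deliberately simple, hypothetical rule."""
    if req.gpus == 0:
        return "VM.Standard.E4.Flex"   # standard workloads
    if req.gpu_model == "A100":
        return "VM.GPU.A100"           # shape name as written in this post
    return "VM.GPU.A10"                # default GPU choice in this sketch


# "I need 4 CPUs, 16 GB of memory, and one NVIDIA GPU."
request = ResourceRequest(cpus=4, memory_gb=16, gpus=1)
shape = choose_oci_shape(request)
```

The request never mentions a region or shape; only the provider-specific function knows the names.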
The hard part of multi-cloud is the gap between "we support it" and "I moved my pipeline in five minutes." Fuzzball closes that gap. A genomics researcher who developed and validated a sequencing pipeline on a GCP-hosted Fuzzball cluster can run it on OCI without modifying the workflow file. Fuzzball is container-first and supports both Docker and Apptainer images, so the containers, data orchestration, and job sequencing all carry over.
For organizations running Fuzzball Federate (the layer that brokers workloads across multiple Orchestrate clusters), OCI support adds another placement target. A federated deployment can now span an on-prem cluster, an AWS deployment, a GCP deployment, and an OCI deployment simultaneously. Federate's scheduling routes jobs to whichever environment offers the best combination of cost, performance, and data locality: a training job that needs A100s lands on OCI's GPU shapes, and a data-sensitive simulation stays on-prem.
The workflow author hits run. Infrastructure policies handle routing.
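Policy-driven routing of this kind can be sketched in a few lines of Python. Everything here is hypothetical: the Cluster and JobRequest classes, the eligible and route functions, and the ranking rule (data locality first, then cost) are illustrative assumptions, and the cluster values are invented rather than real pricing or GPU availability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    on_prem: bool
    gpu_models: frozenset
    cost_per_hour: float  # invented number, for ranking only


@dataclass(frozen=True)
class JobRequest:
    name: str
    gpu_model: str = ""          # empty string means no GPU requirement
    data_sensitive: bool = False
    data_location: str = ""      # name of the cluster that already holds the data


def eligible(job, cluster):
    """Hard constraints: sensitive data stays on-prem; GPU model must exist."""
    if job.data_sensitive and not cluster.on_prem:
        return False
    if job.gpu_model and job.gpu_model not in cluster.gpu_models:
        return False
    return True


def route(job, clusters):
    """Among eligible clusters, prefer data locality, then lower cost."""
    candidates = [c for c in clusters if eligible(job, c)]
    if not candidates:
        raise LookupError(f"no eligible cluster for job {job.name!r}")
    return min(candidates,
               key=lambda c: (c.name != job.data_location, c.cost_per_hour))


# Illustrative values only.
CLUSTERS = [
    Cluster("on-prem", True, frozenset(), 1.0),
    Cluster("aws", False, frozenset({"A10"}), 3.0),
    Cluster("gcp", False, frozenset({"A10"}), 2.5),
    Cluster("oci", False, frozenset({"A10", "A100"}), 2.8),
]

training = JobRequest("train", gpu_model="A100")
simulation = JobRequest("sim", data_sensitive=True, data_location="on-prem")
```

With these invented values, route(training, CLUSTERS) selects the OCI cluster and route(simulation, CLUSTERS) selects the on-prem cluster, matching the two cases in the paragraph above.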
One identity model across AWS, GCP, OCI, and bare metal
Maintaining consistent security across clouds is a real challenge. AWS IAM, Google Cloud IAM, and OCI Identity are different systems with different permission models, which often means organizations build separate access models and compliance workflows for each provider.
Fuzzball addresses this by implementing its own identity and access management layer on top of cloud-native primitives. Role-based access control governs who can submit workflows, access data, and manage clusters, regardless of which cloud hosts the cluster. On OCI specifically, the deployment uses Dynamic Groups and IAM policies to eliminate static credentials, Kubernetes secrets for encryption key management, and TLS to protect sensitive configuration in transit. The Dynamic Groups and IAM policies are OCI-native, and all of these controls are wired into Fuzzball's unified security model automatically during deployment.
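The property that matters in a unified model is that an access decision depends on the role and the action, not on the hosting cloud. The Python sketch below illustrates that property only; the role names, permission strings, and authorize function are hypothetical and do not reflect Fuzzball's actual role definitions or API.

```python
# Hypothetical roles and permissions, chosen for illustration.
ROLE_PERMISSIONS = {
    "researcher": {"workflow:submit", "data:read"},
    "admin": {"workflow:submit", "data:read", "cluster:manage"},
}

PROVIDERS = ("aws", "gcp", "oci", "on-prem")


def authorize(role, action, provider):
    """Return True if the role may perform the action.

    `provider` is accepted only to show that it does not influence the result.
    """
    return action in ROLE_PERMISSIONS.get(role, set())


# The same question gets the same answer on every provider.
researcher_can_manage = {p: authorize("researcher", "cluster:manage", p)
                         for p in PROVIDERS}
admin_can_manage = {p: authorize("admin", "cluster:manage", p)
                    for p in PROVIDERS}
```

In this sketch a researcher is denied cluster management everywhere and an admin is allowed everywhere, with no per-cloud policy to keep in sync.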
Deploy your first Fuzzball workflow on OCI
Fuzzball's OCI deployment is available now. If your team is evaluating cloud HPC infrastructure, or if you're already running Fuzzball on AWS or GCP and want to extend to Oracle Cloud, talk to the CIQ team to see it in action or request a demo to walk through a deployment in your own OCI tenancy.