Fuzzball HPC workflows now run natively on Google Cloud

Fuzzball now deploys natively on Google Cloud Platform (GCP), joining AWS and on-prem bare metal as a first-class deployment target. A computational chemist can move a GROMACS simulation from AWS to GCP without changing anything in the workflow. This post shows what that looks like in practice, what GCP brings to high-performance computing (HPC) workloads, and what happens under the hood when a single fuzzball cluster gcp deploy command stands up an entire production-grade cluster.
Write once, run on any cloud
Multi-cloud is a goal most engineering teams share, and one that's genuinely hard to execute. Teams maintain parallel toolchains: Terraform modules for AWS, separate deployment scripts for GCP, different monitoring stacks, and different Identity and Access Management (IAM) models. The "portable workflow" rarely makes it past the planning stage.
Fuzzball takes a different approach. The workflow definition — the file that describes your compute jobs, data movement, container images, and resource requirements — is provider-agnostic by design. Fuzzball's orchestration layer translates that abstract definition into concrete infrastructure on whatever cloud or bare-metal cluster sits underneath.
A computational chemist running GROMACS molecular dynamics simulations can target AWS today and GCP tomorrow. The container images, data ingress/egress paths, job sequencing, and resource requests stay identical across both environments.
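To make that concrete, here is an illustrative workflow definition. The shape loosely follows Fuzzball's YAML workflow format, but the specific field names, image URI, and values below are assumptions for the sketch, not verbatim Fuzzball syntax:

```yaml
# Illustrative sketch only -- field names approximate, not Fuzzball's exact schema.
version: v1
jobs:
  gromacs-md:
    image:
      uri: docker://gromacs/gromacs:2024.1   # hypothetical container reference
    command:
      - gmx
      - mdrun
      - -deffnm
      - npt
    resource:
      cpu:
        cores: 4
      memory:
        size: 16GB
# Note what is absent: no region, no instance type, no storage backend.
# The orchestration layer supplies those per provider at runtime.
```

The point of the sketch is the omission: because nothing in the file names a cloud, the same definition can be submitted to an AWS-hosted or GCP-hosted cluster unchanged.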
The same architecture already supports on-prem deployments on clusters built with Warewulf, VMware, or manually, and cloud deployments on AWS. GCP is the newest target in a provisioning model built to accommodate many.
Deploy a production GKE cluster in minutes
Deploying Fuzzball on GCP starts with fuzzball cluster gcp deploy. What happens next is a two-phase provisioning process that stands up a complete, production-ready cluster without requiring the user to touch the GCP console.
Phase 1 — Bootstrap. The CLI triggers Google Cloud's Infrastructure Manager to create the foundational resources: a service account for the Pulumi runner, a Google Cloud Storage (GCS) staging bucket for state management, and Cloud Key Management Service (KMS) for secret encryption.
Phase 2 — Application infrastructure. Once bootstrap completes, the CLI creates and triggers a Cloud Run Job that executes a Pulumi program provisioning everything Fuzzball needs to operate: a private, regional GKE cluster with Workload Identity enabled; Cloud SQL with multi-zone high availability; Cloud Filestore for workflow I/O and container image caching; networking with private subnets, Cloud NAT, and firewall rules; and the Fuzzball operator itself, deployed via Helm.
Both phases complete without manual intervention. Fuzzball uses the same approach on GCP as it does on AWS, sharing core operator and observability components while using cloud-native resources appropriate to each platform.
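In practice, the whole flow reduces to a single invocation. The deploy command is the one named in this post; the verification commands afterward are standard gcloud tooling, shown as a plausible follow-up rather than Fuzzball-documented steps:

```shell
# Phases 1 and 2 run end to end from one CLI command.
fuzzball cluster gcp deploy

# Afterward, the provisioned infrastructure is visible with standard tooling:
gcloud container clusters list     # the private, regional GKE cluster
gcloud sql instances list          # the HA Cloud SQL instance
gcloud filestore instances list    # the Filestore volume for workflow I/O
```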
Production-grade security you configure once
For practitioners who want to understand what's actually running in their GCP project, here are the key architectural decisions:
Compute. The GKE cluster runs as a private, regional deployment with per-node-pool autoscaling and configurable minimum and maximum node counts. Standard workloads land on n2-standard-8 instances; GPU-accelerated jobs use g2-standard-8 nodes with L4 accelerators. Shielded instances with secure boot are enabled by default. Substrate VMs, the actual compute nodes where your containers execute, are provisioned dynamically based on workflow demand, with configurable boot disk sizes, memory minimums, and GPU image support.
Storage. Storage splits into two tiers. GCS buckets handle log storage with versioning enabled, and Cloud Filestore Enterprise provides configurable NFS volumes for workflow I/O and container image caching, exported READ_WRITE with NO_ROOT_SQUASH through the Filestore CSI driver.
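For readers curious how that second tier is typically exposed to pods, a StorageClass for the GKE Filestore CSI driver looks roughly like this. This is a generic sketch built from the driver's documented parameters, not Fuzzball's actual manifest, and the name and network values are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-enterprise          # hypothetical name
provisioner: filestore.csi.storage.gke.io
parameters:
  tier: enterprise                    # matches the Filestore Enterprise tier above
  network: default                    # would point at the cluster's private VPC
allowVolumeExpansion: true
```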
Database. Cloud SQL PostgreSQL 16 with regional (multi-zone) high availability, private IP only, mandatory Secure Sockets Layer (SSL), 30-day backup retention, and point-in-time recovery with 7-day transaction log retention.
Security. Workload Identity maps Kubernetes service accounts directly to GCP service accounts. No static key files to rotate or leak. Secrets live in Cloud KMS and Secret Manager. The GKE cluster uses private nodes with Cloud NAT for outbound access. Every layer follows least-privilege IAM bindings.
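The Workload Identity mapping itself is a two-step binding. These are the standard gcloud and kubectl commands for it; the project, service account, and namespace names are placeholders, not values from the Fuzzball deployment:

```shell
# 1. Allow the Kubernetes service account to impersonate the GCP service account.
gcloud iam service-accounts add-iam-policy-binding \
  fuzzball-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[fuzzball/fuzzball-ksa]"

# 2. Annotate the Kubernetes service account with the GCP identity.
kubectl annotate serviceaccount fuzzball-ksa \
  --namespace fuzzball \
  iam.gke.io/gcp-service-account=fuzzball-sa@PROJECT_ID.iam.gserviceaccount.com
```

Once both directions of the mapping exist, pods running under the Kubernetes service account obtain short-lived GCP credentials automatically, which is what removes static key files from the picture.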
Monitoring. Cloud Monitoring uptime checks on port 443, configurable alert policies, and integration with notification channels for email and webhook alerts.
Portability without the penalty
Practitioners should not need to think about any of that during daily work.
When a Fuzzball user writes a workflow, they define compute requirements in abstract terms: "I need 4 CPUs, 16 GB of memory, and one NVIDIA GPU." They specify container images, data sources, and job dependencies. They don't specify cloud regions, instance types, or storage backends. Fuzzball's scheduler and provisioner translate those abstract requirements into concrete cloud resources at runtime.
The hard part of multi-cloud is the gap between "we support it" and "I moved my pipeline in five minutes." Fuzzball closes that gap. A genomics researcher who developed and validated a sequencing pipeline on an AWS-hosted Fuzzball cluster can run it on GCP without modifying the workflow file. Fuzzball is container-first and supports both Docker and Apptainer images, so the containers, data orchestration, and job sequencing all carry over.
For organizations running Fuzzball Federate (the layer that brokers workloads across multiple Orchestrate clusters), GCP support opens a new dimension. A federated deployment can now span an on-prem cluster, an AWS deployment, and a GCP deployment simultaneously. Federate’s scheduling routes jobs to whichever environment offers the best combination of cost, performance, and data locality: a training job that needs H100s lands on GCP's A3 instances, and a data-sensitive simulation stays on-prem.
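The routing decision Federate makes can be pictured as a small policy function. The following is a toy sketch of the idea, not Federate's actual scheduler or API; the cluster names and job fields are invented for illustration:

```python
# Toy model of federated routing: pick a cluster by job constraints.
# Invented for illustration -- not Federate's real API or policy engine.

def route(job: dict, clusters: list[dict]) -> str:
    """Return the name of the first cluster satisfying the job's constraints."""
    for cluster in clusters:
        if job.get("data_sensitive") and cluster["location"] != "on-prem":
            continue  # data-sensitive work must stay on-prem
        if job.get("gpu") and job["gpu"] not in cluster["gpus"]:
            continue  # cluster must offer the requested accelerator
        return cluster["name"]
    raise RuntimeError("no cluster satisfies the job's constraints")

clusters = [
    {"name": "onprem-warewulf", "location": "on-prem", "gpus": []},
    {"name": "aws-east", "location": "aws", "gpus": ["L4"]},
    {"name": "gcp-a3", "location": "gcp", "gpus": ["L4", "H100"]},
]

# An H100 training job lands on the GCP cluster backed by A3 instances...
print(route({"gpu": "H100"}, clusters))           # gcp-a3
# ...while a data-sensitive simulation stays on-prem.
print(route({"data_sensitive": True}, clusters))  # onprem-warewulf
```

A real policy would also weigh cost, queue depth, and data locality, but the shape is the same: constraints from the workflow, capabilities from each cluster, and a deterministic match between them.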
The workflow author hits run. Infrastructure policies handle routing.
One identity model across AWS, GCP, and bare metal
Maintaining consistent security across clouds is a real challenge. AWS IAM and GCP IAM are different systems with different permission models, which often means organizations build separate access models and compliance workflows for each provider.
Fuzzball addresses this by implementing its own identity and access management layer on top of cloud-native primitives. Role-based access control governs who can submit workflows, access data, and manage clusters, regardless of which cloud hosts the cluster.
On GCP specifically, the deployment uses Workload Identity to eliminate static credentials, Cloud KMS for encryption key management, and Secret Manager for storing sensitive configuration. These are GCP-native security services, wired into Fuzzball's unified security model automatically during deployment.
Deploy your first Fuzzball workflow on GCP
Fuzzball's GCP deployment is available now. If your team is evaluating cloud HPC infrastructure — or if you're already running Fuzzball on AWS and want to extend to Google Cloud — talk to the CIQ team to see it in action, or request a demo to walk through a deployment in your own GCP project.