Move data in and out of Fuzzball workflows without an external object store

Contributors

Wolfgang Resch, Research Computing Engineer

Fuzzball 4 includes a built-in object store. It automatically caches data moving in and out of your workflows, cutting redundant transfers to external sources. You can also upload inputs and download results directly, all within the same platform.

Repeat runs start from cached data

Iterative HPC and AI work often means running the same pipeline against the same reference data again and again. The first run pulls that data into the volume, and Fuzzball caches it automatically. Every subsequent run that needs the same input reads the cached copy instead of fetching it again, so repeat runs start faster without any extra configuration.

The cache manages itself. Each cached object has a time-to-live (TTL) property, and Fuzzball purges any objects that go unused for longer than their TTL. The cache stays lean and reflects actual usage by recent workloads.
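The behavior described in this section (fetch once, reuse on later runs, purge what goes unused) can be sketched in a few lines of Python. This is a toy illustration of the general pattern, not Fuzzball's implementation: the `ObjectCache` class, its `get` and `purge` methods, the injectable clock, and the `ref/genome.fa` key are all hypothetical names chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CachedObject:
    data: bytes
    ttl_seconds: float   # how long the object may sit unused
    last_used: float     # timestamp of the most recent read


class ObjectCache:
    """Toy read-through cache with idle-time (TTL) purging. Hypothetical."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._objects: Dict[str, CachedObject] = {}
        self.fetch_count = 0  # number of trips to the external source

    def get(self, key: str, fetch: Callable[[], bytes],
            ttl_seconds: float = 3600.0) -> bytes:
        """Return the cached copy, calling fetch() only on a miss."""
        now = self._clock()
        entry = self._objects.get(key)
        if entry is None:
            self.fetch_count += 1
            entry = CachedObject(fetch(), ttl_seconds, now)
            self._objects[key] = entry
        entry.last_used = now  # every read counts as a use
        return entry.data

    def purge(self) -> List[str]:
        """Drop objects that have gone unused for longer than their TTL."""
        now = self._clock()
        stale = [key for key, obj in self._objects.items()
                 if now - obj.last_used > obj.ttl_seconds]
        for key in stale:
            del self._objects[key]
        return stale


if __name__ == "__main__":
    fake_now = [0.0]  # controllable clock so the demo is deterministic
    cache = ObjectCache(clock=lambda: fake_now[0])

    def fetch_reference() -> bytes:
        return b"reference data"  # stands in for a remote download

    cache.get("ref/genome.fa", fetch_reference, ttl_seconds=60)  # first run: fetch
    cache.get("ref/genome.fa", fetch_reference, ttl_seconds=60)  # repeat run: cache hit
    print(cache.fetch_count)  # 1

    fake_now[0] = 120.0       # two minutes pass with no reads
    print(cache.purge())      # ['ref/genome.fa']
```

The sketch tracks the time of last use rather than the time of insertion, which matches the description of objects being purged when they go unused for longer than their TTL. Data that recent runs keep reading stays cached.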

Bring data in without an external source

You can upload inputs into the cache explicitly and reference them from your workflows. No external object store is required. Your workflows can get everything they need from within Fuzzball.

Uploads work from the CLI or the web interface, whichever fits your needs.

Save results back to the platform

Workflows can write output directly to the cache. Results are available for download, again from the CLI or the web interface, after the workflow completes. No external storage or persistent volumes are needed.

To take it further, a workflow can read its inputs from the cache and write its results back to it, completing full runs without any external storage dependency.

What this removes from your setup

Without the built-in object cache, every workflow run depends on external storage: either an external object store or persistent volumes served by a shared file system. That storage has to be provisioned, secured, and maintained so it can supply data to your workflows over the network.

With the object cache, the first run or an explicit upload brings the data in and Fuzzball holds it. Subsequent runs read it locally. Results come back to the cache for download on demand. The external object store leaves the critical path, and the data movement Fuzzball performs stays visible inside the platform.

Run complete workflows inside one platform boundary

The internal object cache lets you stage inputs, accelerate repeat runs, and retrieve results without an external object store in the loop. Fewer moving parts, faster iteration, and data movement you can track in one place.

See what Fuzzball can do for your workloads. Explore the Fuzzball product page.
