Fuzzball (HPC 2.0): Workflows & Solutions for HPC Use-Cases

April 14, 2022

About Fuzzball: HPC-2.0
“HPC 2.0 - The Next Generation of High Performance Computing”

Imagine a computing environment so powerful that it can orchestrate workflows, services, and data while maintaining supply chain integrity from on-premises, to cloud, and to the edge. It includes the ability to support multiple systems and multiple clouds, federated into a virtual cloud, where every workload lands based on architecture availability, cost, and data management policies.

This is Fuzzball

Integrate multiple HPC resources into one:
Fuzzball is designed as a cloud native, hybrid, federated computing platform that unifies geographically distributed HPC instances, whether on-premises or in the cloud, including geographically dispersed on-premises supercomputers.

Meta-scheduling & orchestration across hybrid resources:
Orchestration across architectures and resources.
Scheduling based on cost of compute, availability of compute, and data.
Data considerations, including locality, mobility, gravity, and security.

Cloud-based HPC clusters and nodes:
HPC resources can be on-premises, multi-premises, cloud, multi-cloud, and federated.
Cloud-based resources are elastic, scaling based on jobs, resource policies, and data.
Our platform natively supports all major clouds, Kubernetes, and custom cloud resources.

A unified workload and resource management platform:
Unifying the end-user and administrator experience, no matter where you use the platform, is a design principle. Fuzzball provides a user interface (UI) application for the submission, tracking, and management of HPC jobs and workflows that is completely API driven, with both a command line interface and a GUI. It also provides monitoring and management of all HPC resources in the platform, leveraging standard enterprise monitoring and management tooling.

“Fuzzball is readily capable of expanding to integrate new physical or cloud-based HPC assets down to the node level, even the component level. Fuzzball can be easily enhanced with FPGAs and GPUs or the newest network protocols to take advantage of enhancements as they come to market. No more being locked in by anyone or anything.”

Webinar Synopsis:


  • Zane Hamilton, Vice President of Sales Engineering, CIQ

  • Gregory Kurtzer, Founder of Rocky Linux, Singularity/Apptainer, Warewulf, CentOS, and CEO of CIQ

  • Forrest Burt, High Performance Computing Systems Engineer, CIQ

  • Robert Adolph, Chief Marketing Officer, CIQ

Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.

Full Webinar Transcript:

Zane Hamilton:

Good morning, good afternoon, and good evening. Welcome to another webcast with CIQ. For those of you who are new to us, we welcome you and ask that you go ahead and like and subscribe. For those who've been with us before, thank you very much for joining again. Today we have an interesting topic; we're going to dive deeper into Fuzzball. I know we've talked about it a little bit in the past, but I'd like to get a little bit deeper into it today. So I have Greg, Robert, and Forrest is back with us again. Welcome.

Gregory Kurtzer:


Forrest Burt:

Hello everyone.

Zane Hamilton:

If you don't mind–I think everybody knows the three of you guys, so I don't know if we need to go through and do formal introductions again–but quickly, if we would like to, we'll start with Greg.

Gregory Kurtzer:

Hi, I am Greg. I've been doing high performance computing for a while, and I have been lucky to be part of a bunch of open source projects over the course of my HPC career.

Zane Hamilton:

That's awesome. Robert.

Robert Adolph:

I'm Robert. And like Greg, I am happy to be here.

Zane Hamilton:

Last but not least, Forrest.

Forrest Burt:

Hey everyone, I'm Forrest. I'm an HPC systems engineer here at CIQ. I do a lot with the user side of Fuzzball and represent different high performance computing use cases with it. And I'm very excited to be here as well. So thank you for having me on.

Zane Hamilton:

Thank you for coming again. So one of the things I wanted to dive into real quick, Greg–I know we talked about this a little bit last time–is what Fuzzball is. If we could talk at a high level about what Fuzzball is, then I want to get into the architecture of it a little bit before Forrest jumps into his use case examples, if you don't mind.

Recap on Fuzzball [01:36]

Gregory Kurtzer:

Oh, I'd be happy to. I really love talking about Fuzzball because we've been working on it for the last couple of years. What it represents is a massive amount of innovation that we've been driving for high performance computing. What really started this, in my mind, was the advent of container computing in HPC. About five, six years ago, containers were all the rage, and we saw this moving from the enterprise and cloud world into the high performance computing realm, driven by scientists, researchers, and use cases that can really benefit from leveraging containers. And that's what started all of this. What we found was that the container ecosystem, as it existed in the cloud and hyperscale world, just didn't fit in HPC. How we do things, how we build our systems, our architectures are just fundamentally different from these cloud native architectures.

And before that, we knew we were different. We had a different architecture. This architecture is a 28 year old architecture originally coined as the Beowulf, and pretty much every system as of today still leverages this same fundamental base architecture. And that architecture has been tremendous in terms of scaling and supporting the scientific needs and use cases that we needed. But containers were really the first major leap where all of a sudden there was a blocker, there was an incompatibility. We wanted what enterprise and cloud was doing. We wanted to go in that direction, but we didn't really have a good way of getting there. I created a project called Singularity. We've recently moved Singularity into the Linux Foundation and renamed the project to Apptainer, and it is a thriving open source community.

There are Singularity, and now Apptainer, systems and clusters all over the world, and it's just been a tremendous resource for HPC to be able to leverage these technologies. But I referenced containers as a Pandora's box when we think of high performance computing, because in HPC we knew we were different in terms of the rest of the ecosystem and how they were solving problems, but it never really mattered to us because we knew how to solve our problems. We knew what we needed to do, and we were very good at it. But now, all of a sudden, we're seeing this drive to modernize, to start to shift, and to start leveraging more of the innovations coming out of cloud, enterprise, and hyperscalers. And how do we do that? How do we start thinking about HPC as an API driven, consumable, federated, hybrid architecture? How do we start building that? How do we start putting that all together while still doing what we do best with regards to tightly coupled application performance compute? How do we make decent use of cloud resources as well as on-prem resources? How do we make good use of different on-prem resources? And so I have some slides, because that's what I've learned CEOs do.

Zane Hamilton:

So why don't you pull those up, Greg. On the difference between the two types of workloads: when you talk about enterprise and you talk about HPC, they're different because of the types of work they do, right? So you're looking at the enterprise moving towards microservices-type technologies, while HPC still has very large, more batch-type work. Is that the reason there was a split there?

Gregory Kurtzer:

It's really interesting because, from most perspectives, an application is an application. There's nothing fundamentally different about a web service versus a high performance computing application, but a web service or a database service comes from a more enterprise-focused background in terms of what it's been doing. I mean, mostly they're running services which are sitting in a select loop, and basically their whole lifetime is mostly idle, just waiting for something to happen. Even a very busy web service is mostly in an idle state, waiting for some sort of interrupt to happen on a select loop, right? So it's waiting. As a result of that, it's not consuming a lot of resources, so you can oversubscribe these systems considerably. The current container and microservice architecture has been designed around this notion of oversubscription of resources, because most of the time, stuff is idle. In high performance computing, it's a different story.

If we have a portion of an algorithm that's sitting idle, you probably want to get a different programmer. You don't want any idling happening. You want everything spinning at a hundred percent and balanced across all of the different resources, all the different system buses: network, storage, IO, memory bandwidth. You want to optimize your applications and your algorithms such that they are consuming the system in a very aggressive and balanced way. So you're going to end up bottlenecking on something. Usually it's CPU, floating point operations, that sort of thing. Sometimes it's going to be memory bandwidth, and other times it's going to be IO. But you want to be balancing that and driving it at a hundred percent. Now, orchestration systems and scheduling systems have to think about this in a completely different way than a service that's sitting mostly idle.

So from a high-performance computing perspective, we've been building these systems that are very focused on these tightly coupled, highly parallel applications, optimizing for as big of a system as we can possibly get and trying to scale as big as we possibly can. In some ways it's been great, because we've been so successful that we've got giant systems; the biggest system–the Japanese Fugaku system–I think last I heard is about 150,000 systems. Now these are custom systems, so they're not full x86 pizza boxes or blades, right? These are custom systems, but that's a lot of systems to be thinking about. And you imagine how many cores are in each one of those; you have a giant resource. For building applications and being able to scale, that traditional Beowulf-style architecture has been very advantageous.

And I'm going to go back to one of the points that you made: enterprise computing workloads are all of a sudden starting to be more like HPC workflows. We're starting to see a crossover, and this crossover has been really accelerated by AI/ML, compute- and data-driven analytics, and other enterprise-focused, compute-driven workflows and applications. We've just been seeing more and more of those, and all of a sudden, enterprises that have never considered HPC before are starting to look at HPC like, "Hey, that's pretty cool. We're needing some of that stuff right about now." The problem is that the architecture we've been using for the last almost 30 years is completely incompatible with how enterprises typically work and how clouds typically work.

So there are a couple of different modes of operation happening here. One is that some people are taking traditional HPC and trying to shoehorn it and force it into an enterprise or a cloud environment. On the other side of the equation, you're seeing people using the tools they're familiar with from enterprise–virtualization, Kubernetes, and other facets–and trying to do these sorts of workloads with them. On both sides, we're getting mixed results. The proper answer here is really, "let's not try to shoehorn the not-quite-right solution. Let's reevaluate." We've been doing HPC this way for, as I said, 28 years. In computing terms, that's practically prehistoric. We need to figure out what the next innovation is going to look like. What is HPC 2.0, and how is it going to operate? How are we going to further the types of jobs, the diversity of jobs, the throughput of jobs, the velocity? How are we going to do smarter things? And how are we going to further enable scientists, researchers, and enterprises to be even more capable than what we've been able to do? How do we lower that barrier of entry? How do we support larger infrastructures? How do we bring these infrastructures together? And thus, now I'm going to jump into my slide, because I think I've yammered on long enough.

Zane Hamilton:

That's perfect. Thank you.

Gregory Kurtzer:

What I'm showing here is the highest level view of Fuzzball. Fuzzball–which is our HPC 2.0 stack–is built out of three major components. This is the top level component, which is Fuzzball Federate. A user's workflow will come in–as you can see on the top left–via APIs. The whole system is API based and IAM governed. So as that workflow comes into Fuzzball Federate, Fuzzball Federate is going to be federating between different Fuzzball clusters. And I'll talk about the architecture of a Fuzzball cluster in the next slide. So basically, it will figure out where workflows and jobs need to run based on policies that the organization would set, for example: architecture and resource availability, cost, and data. And when we think of data, we're thinking of things like data locality: where does the data exist? Data mobility: how do we move that data? Data gravity: is the data getting so big that the jobs come to the data versus the data going to the jobs? And data security models.

And so we're basically thinking about this from all these different perspectives, and those policies drive Federate and allow us to land that workflow. Maybe up in the cloud, maybe in different availability regions in the cloud, maybe in a different cloud. We can go not only cloud, but multi-cloud. Maybe it's going to land on-prem, maybe on one of many prems. It gives us a massive amount of flexibility. And because the system is completely API based, it gives us the ability to have different interfaces to it, different capabilities wrapped around it, to integrate with CI/CD directly, and other things. It also gives us the ability–and I alluded to this when I talked about the data side of this–to orchestrate not only the workflow and the jobs; we can orchestrate the data and make sure that the data's going to the right place and adhering to various data security policies and data management policies.

We bring all of this together. That's what Fuzzball Federate does. Now, if we dive down a little bit and look at an individual cluster, this is what that looks like on the left. You're going to see we have a management cluster, and this is running Kubernetes, because Fuzzball Orchestrate–the second piece of Fuzzball–is a microservice architecture. So we're going to want to run it on a microservice control plane like Kubernetes. We've got different services that run in Fuzzball Orchestrate. So when a user's workflow comes in–and you'll notice it's the same workflow image that I used on the previous slide, because it's not only the same workflow, it's the same API–a user can submit to a federated cluster and go multi-cloud, go anywhere, or they can go to a single cluster. So it makes it very easy for a user, using that same API, to change where we're running jobs.

Where is the Application being run? [14:23]

Where are we running these applications? So as that workflow comes in, whether it's coming in through Federate or through a user connecting directly to this cluster, we're basically going to do the things you can see on the right side of the page, right? We're going to do ingress from a data lake, some sort of object storage that exists. We're going to pull data in. We're going to create volumes on top of that. And then on top of the volumes, we're going to run jobs and run our job pipeline. At the very end, we're going to egress data back out of that volume. And depending on the type of volume it is, we may blow away that volume and cache all of the artifacts. We can cache things like the ingress pieces that we leverage.

So if another workflow requires those same bits of data coming in for ingress, we have that already available. When you think about this from a Federate perspective, it means that we can make very intelligent and logical decisions as to where a subsequent job requiring similar data lands. It's almost like we have a breadcrumb trail on all the different pathways following the data, and we can follow that for subsequent workloads that, again, require that same sort of data. Now, there's one last slide I'm going to show, and then I'm going to stop talking and hand this over to Forrest so we can see this actually working in operation with several different types of workflows. But the last thing I'm going to talk about is: what does that workflow look like?

And to do all of what we just described and talked about, there are three major pieces of the puzzle that we have to build. And we put these together into a composable configuration, and I say composable for two reasons. One, we're actually composing the HPC environment that we're running this job in. Two, we can also compose the underlying resources necessary to satisfy the requirements of this workflow. We use this to compose everything. So the first thing we need is to know the data and how to manage that data. We have data and we have volumes, and you can see on the right side–it's super small, but if you make your screen really big you might be able to see it–we have a volumes section. And in that volumes section, we're defining a V1, and it's an ephemeral volume, which Forrest will talk a little bit more about in just a moment.

Ingress Data and Egress Data [16:58]

I have ingress data that we're going to bring in before the job starts, and we have egress data that we're going to ship out once the job pipeline is done. Then we've got our job pipeline; it defines what the compute looks like as an acyclic graph for the entire workflow. Every piece of that pipeline runs inside of a container, and we can leverage several different container formats in order to do this. So there's Docker, there's OCI, there's Singularity and Apptainer. We can leverage all of these container formats natively. Basically, we have this job pipeline, and that's what defines our compute. Every bit of that job pipeline exists inside of a container. Even our MPI: we actually have our own wire-up that we've built out specifically for different implementations of MPI, so Open MPI and MPICH, for example. And lastly, we have to know what resources we need.
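Putting the three pieces together (data and volumes, the job pipeline, and resources), a workflow definition along the lines of what Greg describes might look something like the sketch below. This is illustrative only: the field names follow the transcript's description and the later demo narration, but the exact Fuzzball DSL syntax, URIs, and image tag here are assumptions, not confirmed product syntax.

```yaml
# Hypothetical sketch of a Fuzzball workflow definition.
# Field names follow the webinar's description; exact syntax may differ.
version: v1
volumes:
  v1:                                   # named data volume, ephemeral lifetime
    ephemeral: {}
    ingress:                            # data pulled in before the pipeline starts
      - source:
          uri: https://example.com/inputs/in.lj.txt   # hypothetical input file
        destination:
          uri: file://in.lj.txt
    egress:                             # data shipped out after the pipeline finishes
      - source:
          uri: file://results.out
        destination:
          uri: s3://example-bucket/results.out        # hypothetical object store
jobs:
  run-lammps:                           # one job in the pipeline (an acyclic graph)
    image:
      uri: docker://nvcr.io/hpc/lammps:latest         # hypothetical NGC tag
    command: ["lmp", "-in", "/data/in.lj.txt"]
    cwd: /data
    multinode:
      nodes: 4                          # run across four nodes
      implementation: openmpi           # the MPI wire-up implementation
    resource:                           # requested per node
      cpu:
        cores: 2
        affinity: NUMA
      memory:
        size: 14GB
      devices:
        nvidia.com/gpu: 1
    mounts:
      v1:
        location: /data                 # volume mounted inside the container
```

Note how the volume name (v1) ties the ingress and egress definitions to the mount point the job sees at /data, which is the wiring Forrest walks through in the demo below.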

And when I talk about resources, I'm not just talking about basic architecture, but also: do we need GPUs? Do we need FPGAs? What kind of interconnect do we need? Do we have microarchitecture optimizations that we need to be considering? Was the container built for a certain processor generation that we want to make sure we've optimized for and want to run it on? We can start doing that matchmaking between the jobs, the workflow, and the right types of hardware. And the last point I'm going to mention is, we can do this–and you saw this previously on the slide–both on-prem and in the cloud. And when we do it in the cloud, we're actually composing that cluster based on a particular job or workflow that comes in.

So as soon as this workflow comes in, we figure out what resources have been configured for this application or workflow to run. And we go out and provision that in the cloud. So we grow that cloud cluster on demand, then we shrink it when we're all done, to the point where the only thing left running in the cloud is a tiny little Fuzzball Orchestrate instance that runs in a cloud-based Kubernetes. And it's super simple, super lightweight, extraordinarily easy to use, and configurable. So on that point, I'm going to stop sharing, mute my mic, and hand it over to Forrest.

Fuzzball Workflow Demo: LAMMPS and GROMACS [19:19]

Forrest Burt:

Sounds good. So we've got a couple of Fuzzball workflow demos that illustrate some of this to show off here today. We're going to be looking at LAMMPS and GROMACS, which are two molecular dynamics suites that are commonly used in HPC. We're going to be looking at both of those running over some combination of multiple nodes and multiple GPUs. With LAMMPS, we're going to be taking a look at something that's run over four nodes. Then with GROMACS, we're going to be taking a look not at something running directly, but at some other aspects of our stack that allow you to do management of workflows and things like that, though it'll also be illustrative of a GROMACS run essentially happening within Fuzzball. Then we're going to take a look at TensorFlow and a little sneak peek into what the workflow life cycle actually looks like. What does it take to actually bring a workflow from concept to something that we can use to do meaningful work for us in Fuzzball? So as I mentioned, the first thing I have is LAMMPS. This is an exclusive view of something that'll be coming out on our YouTube. So I have this as a video file here that I will go ahead and select.

Then we'll wait for that to come up and be shown. There we go. Okay, cool. Let me make sure this doesn't blow everyone's eardrums out. So this will be a quick video. This is going to show off LAMMPS being run, as I described, over four nodes at once through our MPI wrapper, where each one of those nodes is going to have one GPU on it. So essentially this will be four GPUs at once with MPI, as I will reiterate here in this video in about two seconds. This video is about eight minutes or so. Like I said, we'll then hop over to looking at some other aspects of the system and how GROMACS works, and, like I said, an example with TensorFlow. So without further ado, we'll go ahead and kick this off.

This is Forrest Burt from CIQ, and thank you for joining this Fuzzball workflow demo. In this demo, we're going to be looking at the LAMMPS molecular dynamics suite. We'll be running LAMMPS over four discrete compute nodes at once, where each one of those compute nodes will have a single GPU attached to it. So essentially we'll be running over four GPUs at once with MPI. Really quickly, I'll go over this workflow definition that we have here so we can know what it's doing. Up here at the top, we have the version field; this describes the version of our DSL that this workflow is written in. We then have the volumes field, and this section delineates the data volumes that should be set up for this workflow.

We're only going to set up one volume, and it's of type ephemeral. An ephemeral volume is–oh, sorry, everyone. I paused it for a moment to explain something, but forgot to unmute myself. I just want to point out really quickly, because I don't say this, just to make things clear: V1 here is the name of this workflow volume. So here in this volumes field, you'll see that we describe this one data volume. It says V1, right there under that volumes line. That's the name of the volume. You'll also see down at the bottom, by where my command line cursor actually is, there's a /data location, and right above that is V1. Again, that refers to the name of the volume that I set up here in this volumes field at the top–one that's created when the workflow starts. The workflow interacts with it by ingressing data into it.

Jobs can persist data to other jobs with it, and then at the end of the workflow, any egresses that are supposed to move data out of it will be triggered. Once the workflow is finished, the ephemeral volume is essentially destroyed. So in this case, with our ephemeral volume, we're going to be ingressing one file from the internet and placing it at the top level of the data volume with this name. After the volumes section, we have the workflow jobs section. In the case of this workflow, we only have one job, and that's run-lammps. Run-lammps is an MPI job that we're going to be running over four nodes with an Open MPI implementation of our MPI wrapper. The image that it's going to use is being pulled from the NGC, and you can see it's basically just an NVIDIA GPU-optimized version of LAMMPS from there.

The command is basically just running LAMMPS on one GPU with some parameters and some other information set up there. This is just setting the current working directory that this command is going to be run in. So this will be run in /data, which I'll explain a little bit more about in a second. We have a couple of environment variables that we're setting. These are just changing how Open MPI is going to look at a couple of things; it's not too terribly important to explain what they're doing. They're essentially just making sure that this works right and uses the correct communication planes and the correct process management type programs.

And that's more so set up on the Open MPI side. These are MCA parameters, so you can see how you can provide, even in a workflow, customizations for the MPI and stuff that's going to be running there–so those are basically some Open MPI options that we're setting. Then we have the resources section, which specifies the number of CPUs, the amount of memory, and the devices that we want for this workflow. In this case, we're going to be looking for two cores with a NUMA affinity. With two cores there's not too much that can be done to take advantage of NUMA nodes on a server, but NUMA is essentially a memory architecture where some cores on the CPU can access some portions of memory faster than others.

So it becomes efficient, for example, if you're only using part of that CPU, to land all of those cores on the same NUMA node. Like I said, in this case with two, it's debatable whether it helps a ton, but you can see how, if you were using more CPU cores, that would be an easy way to instantly add some efficiency, if the compute nodes that you have can take it or have an underlying non-uniform memory architecture. We're then going to specify that we want 14 GB of RAM for this workflow to run with. And then we're going to specify that we want one NVIDIA GPU. Then down here at the bottom, we're going to attach the data volume that we created up here to this workflow job; so mounts, then the name of the data volume, and the location inside of the container that we want this data volume to be mounted to.

So in this case, we will put this at /data in the container. So when this file is downloaded, you'll see that it'll end up at /data/ and then this file, which is why, when we specify the input file for LAMMPS, we do /data/ and then that. So we'll go ahead and run this–and really quickly, just to make it clear, these resource requirements are going to be searched for per node. So each one of these nodes will have at least two cores, at least 14 GB of memory, and a GPU available on it. So in total, we'll end up with eight cores, 56 GB of RAM, and four GPUs across those four nodes, and then we'll watch it spin up some nodes, which will just look like some of the statuses changing for the different components of the workflow.

And then once that's run and the workflow's finished, we'll take a look at the results and see how it ran on four GPUs. So we'll just do fuzzball workflow start users/–I'm essentially just specifying that I want to start a workflow and then I'm just giving it a name–so we'll do lamps-gpu-ngc.yaml. So essentially we just have: fuzzball workflow start, a name for this workflow–you can see this is basically just users, slash, my account, slash, a custom name–and then I'm just specifying the path to the workflow YAML that this is going to pull from. And since we're in the same directory, we basically don't have to specify a full path. You can see, once we start that, we get this message saying that the workflow started. We'll go ahead and do: fuzzball workflow status --watch.
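The submission and monitoring steps narrated above come down to two CLI invocations. The command names and the --watch flag are taken from the transcript itself; the account name, workflow name, and file name below are illustrative placeholders, and this sketch is not runnable without a Fuzzball deployment:

```
# Submit the workflow: a user-scoped name plus the path to the workflow YAML
# (account and workflow names here are illustrative).
fuzzball workflow start users/myaccount/lammps-4gpu lamps-gpu-ngc.yaml

# Watch the workflow, its data volume, ingress, image pull, and job
# change state as resources are provisioned.
fuzzball workflow status --watch users/myaccount/lammps-4gpu
```

The status view then shows the five components Forrest describes next: the workflow as a whole, the volume, the file ingress, the image, and the job itself.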

And then we will go ahead and take a look at what we see here. You can see we have five different components over here. Workflow here at the top essentially just represents the state of the workflow as a whole. So it started, and this is the time it started at. The data volume for the workflow is, as we mentioned, the data volume that's created up here; we can see that it's already finished creating. The file here is the ingress that we're doing right here. So you can see that that's started, and that file is currently being downloaded into the data volume. The image would be being downloaded, but it's probably already cached, which is why we see it basically finishing instantly. Or, oh, maybe we don't. Oh, no, yes. Yeah, that is instant. Sorry, read that wrong.

So we see that image basically downloads instantly because it's already cached. Otherwise, it might take a few minutes to download. And now we can see that all the parts of the workflow have finished except for the job itself that this is going to run. All the pre-setup for this job has been done, and now we're just waiting for the resources to be spun up and allocated to it. So this will remain pending until that happens, and this video will probably jump cut to the point at which this starts and finishes. So I'll be back then to look over and analyze the results and explain how our workflow ended up. I'll see you then.

All right. Hello again, everyone. This workflow's completed running, so we'll go ahead and take a look at the results of it. We'll do a quick fuzzball workflow log. We don't really need a -f, I suppose; do fuzzball workflow log and then provide the name of the workflow, and we'll get our logs out here. We can see that we have right here about 8,400 tau/day. When you run this particular workflow on just one of these particular compute nodes, with the particular type of GPU that they have, we get about 2,200 tau/day. So considering that we've probably lost a little bit of time in inter-node communication–MPI communication, basically–this is probably about what we expect to get, about 8,400 tau/day. So that looks good. We can see we have 8 MPI tasks, which, requesting two cores across four nodes, gives us eight. So that makes sense.

And then up here, we have a couple of warning messages, because I don't believe I specified the exact version of LAMMPS inside of this container for it to use. So there are conflicting compute capabilities, but it's not something we're particularly concerned about at the moment. Essentially, we can see a few of these messages: we have a two by one by four MPI processor grid, which is, once again, 8 MPI tasks. And then we have a few of these messages here that are being printed as it's detecting GPUs and setting things up. If I'm correct, we should have 1, 2, 3, 4, 5, 6, 7, 8. Yep, 8 of those warnings, which basically just means that each one of the 8 MPI tasks produced that. But like I said, that one doesn't matter as much because we got the performance we expected, which is about 8,400 tau/day, so we're good to go. It looks like this workflow ran successfully, and we have seen LAMMPS run over four GPUs at once using MPI. So once again, thank you all for tuning in. I hope you enjoyed this demo, and we'll see you in the next one.
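As a sanity check on the throughput figures quoted in the demo (roughly 2,200 tau/day on one GPU node versus roughly 8,400 tau/day on four), the scaling efficiency is simple arithmetic. The numbers come from the transcript; the script below is just a quick way to compute it:

```shell
# Parallel-scaling sanity check using the throughput figures from the demo.
single=2200        # approx. tau/day on one GPU node
four_nodes=8400    # approx. tau/day on four GPU nodes
efficiency=$(awk -v s="$single" -v f="$four_nodes" \
  'BEGIN { printf "%.1f", 100 * f / (4 * s) }')
echo "Scaling efficiency across 4 nodes: ${efficiency}%"
```

Roughly 95% efficiency: the small shortfall from perfect 4x scaling is the inter-node MPI communication cost Forrest mentions.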

Okay. So that's LAMMPS. Would we mind bringing that up really quick–that video–one more time? I just want to pause something here at the end and point something out.

You'll notice that we also have up here at the top, just to further point out that it's indicating usage of GPUs, you can see that we have this Kokkos mode enabled, which is a C++ performance portability library that, among other things, lets code like this run on GPUs. And you can see it saying that that mode is enabled, and it'll use up to one GPU per node. So you can see that we then get those messages from each one of the MPI processes mentioning that the GPU we're running on is compute capability 7.5–and technically the executable was for 7.0–but once again, just indications that we're using the GPU there. So that is LAMMPS.

Gregory Kurtzer:

That's really awesome Forrest. Tell us a little bit about where this is actually running?

Where is Fuzzball Running? [32:25]

Forrest Burt:

This is running on a public cloud provider. I'm sitting here interacting in that video on the command line, basically on my laptop. That laptop isn't connected to the public cloud provider in any way; it's literally just behind my own home modem and router. So I'm literally just working from my office, my home, or wherever you want to imagine. Through how fuzzball works, I'm able to reach out directly to that public cloud provider where this is running, and so that GPU node or those GPU nodes that you're seeing there are all orchestrated at an entirely different company's data center, entirely through fuzzball. It is pretty cool. I will show you, when we get to the TensorFlow one, actually being able to get a direct interface into that node. Are there any more questions that we have before we jump into GROMACS really quickly?

Zane Hamilton:

No Forrest. Appreciate it.

GROMACS Demo [33:18]

Forrest Burt:

As I mentioned, I won't be running GROMACS live, but I'm going to use this as an example of some of the other capabilities of fuzzball, to show a little bit of housekeeping stuff and how it makes management of results and of things you've run in the past easy. I've just sat down here. I've been working on a CIQ-optimized version of GROMACS and I'd like to get back to that work and continue testing it against the fuzzball cluster that I'm working on. I'm engineering a workflow, I'm engineering a container, and I want to be able to see the performance of it. I want to be able to see it run live. Like I said, I'm sitting back down to get back to that work.

How do I Retrieve Past Results? [34:12]

How do I retrieve the past results that I was working on, if I just want to get them directly from the command line, or maybe even if something has happened and I've lost them somehow? I'll show you here really quickly how you can not only get your results–so you can see exactly where you were and exactly what happened in past workflows, especially without having to manage, say, output files like something Slurm would produce–but also how to, for example, retrieve the workflow that those results were derived from, all from within the fuzzball interface. So this'll be pretty quick. It's just a couple of commands and then we'll roll on over to the TensorFlow demo. If I'm here on the command line with my fuzzball CLI installed and I want to see some of the workflows that I've run in the past, I might run fuzzball workflow list.

I won't do that directly right here, because there's a lot of stuff that will come out of that. Rather, I'm going to just specify a wildcard–a way to pull out workflows matching a certain pattern. So say I was calling the last work I was doing with GROMACS something like gromacs-gpu under my account. If I want to just see those workflows that I was working on–the names and basic information about them–I'd just do a fuzzball workflow list, and then I specify my account–as I mentioned, user/–and then I just specify the name of the workflow that I was using before, and then a wildcard, basically. And you'll see when I do this, we'll get a few workflows that'll print out. So you can see we have a few there that I was working on in the past.

We see that a couple of them errored and one of them finished. Let's say we want to pull out the results from the one that finished successfully. I'll go ahead and copy this. I can do a fuzzball workflow log, paste this in, and you'll see we instantly get the results from this workflow. Really quick, before we get too much farther into this, I do want to make sure that I show something here, so I'm going to change my screen sharing really quickly. I'll tie into more of the workflow life cycle process here in a second, so we have some level of awareness of where this is coming from. This is VS Code; a really common pattern for developing workflows thus far has been to just directly write them in the YAML in some type of text editor, whether that be literally something like Notepad, something like Vim, or in this case VS Code.

So the results that we just generated on the command line from this GROMACS workflow came from this workflow right here. I'll go over it really quickly so you can see what GROMACS looks like as a fuzzball workflow. First off, up here at the top, you'll see that we have this volume section. This is basically the same thing that we saw within that LAMMPS workflow. This is a volume called v1, and it's of type ephemeral, so it'll only exist for the lifetime of the workflow. The URI here is basically just pulling down some GROMACS benchmarks off the open internet, and then once again, we're placing that at the top level of the volume with this name right here. So that's the ephemeral volume: once again, that'll be created, ingress will be done, jobs can interact with it, egresses happen at the end of the workflow, and then the volume is destroyed.
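A volumes section along these lines might look like the following sketch. The URI, volume reference, and field names are placeholders inferred from the description, not the demo's actual values:

```yaml
# Hypothetical ephemeral volume with an ingress, as described above.
volumes:
  v1:
    reference: volume://user/ephemeral   # exists only for the workflow's lifetime
    ingress:
      - source:
          uri: https://example.com/gromacs-benchmarks.tar.gz  # placeholder URL
        destination:
          uri: file://benchmarks.tar.gz  # lands at the top level of the volume
```

The lifecycle follows the narration: the volume is created, the ingress pulls the tarball in, jobs mount and use it, any egresses run at the end, and then the volume is destroyed.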

So after that section, we have jobs. In the case of this workflow, we have four different jobs: untar, prepare benchmark data, run benchmark, and cat. The first one up here is just a simple untar. Like with all of these, this looks like a fairly long workflow, but we'll see in a moment that a lot of these are fairly boilerplate sections repeated between the different jobs. First off, you can see here that the image we're specifying in this case is something custom being pulled out of a GitLab registry–a CIQ cloud GROMACS image. As I mentioned, we're maybe taking a look at making a CIQ-optimized version of that. You can see that I'm specifying some credentials here because this is a private GitLab registry.

I'm specifying a username and an access token–basically as a password–that'll be sent to this URI once it goes to pull down that image, so we don't get permission errors. You can see I pulled that image down there and I've specified credentials for it. The command here is just basically a simple tar command–tar --no-same-owner -xf–just unpacking that data there into this workflow volume so that the rest of the workflow can access it. You'll see that we're also setting a current working directory of the /data directory, which is where we've, via this mounts field, mounted this data volume to this workflow job. You can see that we're specifying /data here so this tar command can access the benchmark tarball that we've downloaded.

We do that and that all works correctly. We have a really basic resource request–just one core and one GB of memory–because this isn't a terribly big file, so it'll be pretty simple to unpack. Once we're done unpacking that, we'll use the data in there to prepare the benchmark that we want to run. Once again, we're using the same image and the same configuration as in the last job. Our command is different: in this case, we're basically just running a command that's going to do some pre-processing for this workflow. We're doing it in a slightly different directory–this is the directory that we unpacked from the tar.gz in the previous step–and then we're inside of one of the directories of benchmark data that's inside of that.
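Putting the pieces just described together, the untar job might be sketched roughly as below. The registry URI, secret names, and field names are hypothetical, inferred from the narration rather than taken from the actual file:

```yaml
# Hypothetical untar job: private-registry image, credentials, tar command,
# working directory, volume mount, and a small resource request.
jobs:
  untar:
    image:
      uri: docker://registry.gitlab.com/ciq/gromacs:latest  # placeholder path
      secrets:
        username: GITLAB_USER          # credentials for the private registry
        password: GITLAB_ACCESS_TOKEN  # access token used as a password
    command: ["tar", "--no-same-owner", "-xf", "benchmarks.tar.gz"]
    cwd: /data                         # run inside the mounted volume
    mounts:
      v1:
        location: /data                # the ephemeral data volume
    resource:
      cpu:
        cores: 1
      memory:
        size: 1GB
```

The credentials keep the image pull from hitting permission errors, and mounting the volume at /data lets the tar command see the ingressed tarball.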

There are a few different directories' worth of it; we're using a commonly used one. You can see that we've also altered our resource requirements for this job. We're now asking for two cores, 14 GB of memory, and a GPU. We're mounting the same data volume at the same place, but we also have this requires field that creates the directed acyclic graph of execution that this workflow is going to follow. In this case, you'll see that we have a pretty basic graph going on here, but this will not run until untar is done, because obviously we need that data present in order for this to be able to run and actually work on the benchmark data that it needs to pre-process. Once that's done, we go to the run benchmark job, and you'll see that we're running this with MPI. I'll also point out that the results I'm showing you are also from an MPI run over two nodes with a GPU on each–I'll point out how you can tell that.

We're looking for two nodes with the Open MPI implementation as our MPI wrapper. We're pulling down the same image as before. The command that we're running is the step that actually does the computations for this benchmark, so this is what'll print out all of our performance numbers and stuff like that. We are going to have two cores, 14 GB of memory, and one GPU again, but once again, those resources are going to be found per node. So in total, we'll get four cores, 28 GB of memory, and two GPUs available for this step to use, with that Open MPI wrapper. And you can see we're mounting the same data volume, and we just require that the benchmark data be prepared and that job be done.
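The multi-node benchmark job, with its per-node resources and the requires edge that forms the DAG, might look roughly like this. The command and field names are placeholders, not the workflow's actual contents:

```yaml
# Hypothetical run-benchmark job: two-node MPI run, per-node resources,
# and a requires edge forming the execution DAG.
jobs:
  run-benchmark:
    multinode:
      nodes: 2
      implementation: openmpi
    command: ["gmx_mpi", "mdrun", "-s", "benchmark.tpr"]  # placeholder command
    mounts:
      v1:
        location: /data
    resource:
      cpu:
        cores: 2              # per node -> 4 cores total
      memory:
        size: 14GB            # per node -> 28 GB total
      devices:
        nvidia.com/gpu: 1     # per node -> 2 GPUs total
    requires: ["prepare-benchmark-data"]  # won't start until preparation is done
```

Because the resources are requested per node, the two-node run ends up with double everything, matching the totals described in the demo.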

The last one of these is pretty simple. It's basically just catting the results file that comes out of this, so we don't have to retrieve it from somewhere after having it egress from this workflow–just for simplicity's sake. You can see that, like with some of the past jobs, we're in this data directory again so that we can actually find this. We've got a small core and memory request, and this will only run once the rest of the workflow is done. So you can see it looks like there's a lot here, but a lot of it is pretty similar between the workflow jobs. We basically have just a straight path of execution through this: this one runs, this one runs, this one runs, this one runs. So I'll move back over to the results, and we'll take a look at what those actually indicate.

So let's go ahead and go back over. As we just were here on the command line, I'll just scroll here to the top so we can see everything that was printed. Yes, that looks good. Cool. You can see once again, this is that workflow I ran a couple of days ago that I just got the log from–the one I'm looking to retrieve the workflow and the information about. We have the prepare benchmark data logs here being printed out from that workflow job, so you can see that we're basically just printing out a bunch of information about what the simulation is going to look like and a lot of technical information about that. Once we actually start running the benchmark, you'll see that we have the command line, we have some information about GPUs being selected, some information about what component is going to do what type of computations, and you can see we're using four MPI processes and 16 OpenMP threads per process–which I believe doesn't matter much; that's just an option that's sitting there. So four MPI processes–two cores across two nodes–ends up as four. You will see that once we get that data out, it goes and prints out all of the run benchmark data.
We get down here to the bottom, at the end of the benchmark, and we get a little bit of performance info. And then with the cat command–or that cat job–like I said, we basically just cat the log file that this produced with all the results.

So as we look through that, we can see the instructions that we're using–this is built with AVX-512, which is something I'm working on–we have GPU support through CUDA, and we're using cuFFT, which is the CUDA FFT library. You can see we have here: running on two nodes with a total of four cores, four processing units, two compatible GPUs–one compatible GPU per node–all nodes have identical types of GPUs, and then we have a description of what type of GPU is actually present on each node. It goes into more information about the performance of each component, and then–like I mentioned, here at the end–we get 1.608 nanoseconds per day. This is a fairly small compute node, so that's not terribly fast as far as GROMACS goes, but like I said, we don't expect to get crazy performance out of it. So that is GROMACS: two nodes at once with one GPU on each node, through fuzzball. Are there any questions about this one? Otherwise I will hop over into TensorFlow and we'll take a look at the workflow life cycle there.

Zane Hamilton:

Yeah, we actually do have a couple of questions. So one of them is: if you daisy-chain workflows together, can you query the full data provenance of the final output? That's a good question.

Daisy-Chaining Workflows Together [44:45]

Forrest Burt:

I'm not sure I fully know the meaning of the term data provenance in this context.

Zane Hamilton:

So, being able to validate who did what at what time. Being able to say at each step the data was touched by this piece of an application, and make sure that you can say what did what to the data as it flows through multiple workflows.

Forrest Burt:

Daisy-chaining workflows from workflow to workflow, that would be a little bit difficult in a cloud-type environment, just because of how our security model works now and some other things work. But like I said, I don't really have a particularly great answer about what our capabilities look like there. I haven't done a ton of experimentation there, but I'm sure that with a lot of the capabilities that we're looking to expand into–as Greg alluded to–the data being such an important part of HPC, that's something that we're looking into. We have really robust plans for data management–like Greg alluded to–data gravity is a big concept in HPC, and so we definitely want to make sure that we're not making the data component of these increasingly complex HPC workflows the factor that makes them difficult to run.

Zane Hamilton:

So Greg, I guess it kind of goes back to being able to have that breadcrumb trail back through how a job ran, and seeing at what point in time it touched what. You could do it that way, but I don't know about a full daisy chain. You could go back through each workflow and say what happened at each point in time, but it's not going to be actually stamped into the data output.

Gregory Kurtzer:

Yeah. You could use persistent volumes as well, as a way of caching a storage location or a storage layer that you can come back to with another workflow. But aside from that, the goal of this is for each workflow to be a defined, composable configuration for a particular outcome. So think of each workflow as that standalone solution that you can take anywhere, and it should be able to run anywhere, free of dependencies, as a single unit. Now, we are developing additional capabilities around event-based workflows as well as behavioral-based workflows. A behavioral-based workflow would basically be something like–I wouldn't call it stringing workflows together–but being able to conditionally trigger other workflows to execute.
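As a rough illustration of Greg's point, a persistent volume that outlives one workflow–so a later workflow can pick up the same data–might be sketched like this. The names and field layout are assumptions, not verified Fuzzball syntax:

```yaml
# Hypothetical persistent volume, in contrast to the ephemeral volumes in
# the demos: it survives workflow teardown, so a follow-on workflow can
# mount the same data as a caching/hand-off layer.
volumes:
  results:
    reference: volume://user/persistent-results  # outlives this workflow
jobs:
  postprocess:
    mounts:
      results:
        location: /results  # a later workflow can mount this same volume
```

Each workflow stays a standalone, composable unit; the persistent volume is just the shared storage layer that lets the next workflow come back to the data.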

Zane Hamilton:

The next question–I think we're going down the same vein here, and Robert, I know that this is something we talked about–with Apptainer being able to have a cryptographically signed application, so you know what ran when. You can go back and say, "this is the version it ran." But it's the same problem as with the data: it's not necessarily going to apply to the data. You won't be able to pull the data and say this is what happened at that point in time, but you can go back and recreate it, right, and say this is what happened.

Multiple Cryptographic Signatures [48:03]

Robert Adolph:

Yeah. And that's the point of a lot of the logs that we're going to produce: to give you the ability to do multiple different workloads–think of them as workflows–versus potentially different data. And yes, multiple cryptographic signatures can be applied and enforced across the entire platform. So nowhere on that platform will anything run that doesn't have the cryptographic signatures needed for approvals, for example. So versioning, approvals from cryptographic signatures, the ability to share in an immutable way–that's extremely important to what we're thinking about.

Gregory Kurtzer:

And you can apply that cryptographic signing and validation to basically managing the provenance of all of your compute artifacts. So there's a lot you can do with that.

Zane Hamilton:

I hope that answers your question. There's more to come around that topic though for sure. We appreciate the input. Cool. Back to you Forrest.

TensorFlow Demo [49:12]

Forrest Burt:

Alrighty. We'll go ahead and hop into the TensorFlow demo. We're going to look at a couple of things here. First off, we're going to look at what this workflow actually looks like, and what it looks like in our graphical environment for editing workflows. I've shown you the text-based version with YAML; I'll show that to you again here, but then I'll also kind of give you a sneak peek, for our dedicated viewers, of what's going on with the graphical side of fuzzball. And then I will hop onto a compute node that I have running at the moment with this TensorFlow and Jupyter notebook example, and I'll show you TensorFlow running on a GPU for some fun. So we're here to talk about the workflow life cycle process and what it actually takes to create a workflow from start to finish that does something meaningful for us using our high performance computing resources.

There are a few steps to defining a fuzzball workflow and, like I said, bringing one to completion. The first one is defining what we actually want the workflow to do: what programs we want, what data is going to be involved, that type of thing. What are the requirements for our software to run? What do we actually want our software to do? Defining our use case, essentially. After that we get into building a container–in fuzzball, everything is built out of containers.

We want to create a meaningful version of this container that allows us to take the application that we want to use with this workflow and actually run it through fuzzball, because, as I mentioned, everything in fuzzball is containers. After that is actually creating the definition of the workflow itself on a syntactic level–figuring out what each job needs to do, what each job needs, that type of thing. Then after that is actually testing and using the workflow in production. So let's go ahead and walk through, start to finish, what it looks like to bring this TensorFlow example into reality. The first thing that we might want to start with is–like so many things in technology–Google. If I'm wanting to figure out this TensorFlow workflow, first off, as I mentioned, we'll define what we want the workflow to do and what data it's going to need.

We want to be able to run TensorFlow, we want to be able to run TensorFlow on a GPU, and we want it to provide a Jupyter notebook interface into this workflow so that we can develop in TensorFlow with all the convenience of a Jupyter notebook. We'll also need a way for the models that we're messing around with in TensorFlow to bring their data into this workflow so it can be accessed there. Now that we have the workflow defined, we're going to be looking at building a container for it. We're either going to develop one ourselves, or we're going to discover one out there that's pre-made and fits our needs. So we'll just start off with something simple–TensorFlow GPU–just to get an idea of what exactly it looks like to run TensorFlow on a GPU.

We could look at Jupyter in a moment, but we'd want to start out here just with the basic application that we're even looking to run. We might scroll through the page talking about GPU support, looking at what it takes to install TensorFlow, what their requirements are to use it on a GPU, and what needs to be installed in order to enable usage of that GPU. We're getting a feel, as I mentioned, for what it's going to take to develop a container that will be able to run TensorFlow, run TensorFlow on a GPU, and then provide us a Jupyter notebook interface into that. It'd be great if we could find something pre-made, though. Oh, and what's this? We even find here on this GPU page that they mention there are pre-made TensorFlow Docker images out there.

Well, to save ourselves the development time, we may as well investigate those to see if one of them meets our needs. We scroll through this page on TensorFlow Docker images, reading and taking a look at what's going on, and–oh, what's this? There just so happens to be a pre-made TensorFlow container out there that includes both GPU support and a Jupyter notebook inside of it, so that's very useful. We'll probably give that a try first to figure out how we're going to build up this workflow and, as I mentioned, we'll see if this pre-made one fits our needs before we spend our time trying to build a container on our own. So, to switch back over into VS Code really quick: maybe we've spent some time defining what our workflow is going to look like and now we're giving it a look over here to get a feeling for what we need to do–starting out with debugging, what still needs to be done. Essentially, you can just imagine this as being the end of the step of actually writing the workflow definition.

As mentioned, a lot of this thus far has been done within the context of a text editor like Visual Studio Code, so in this case I'm going to show you how that works here. You can see we have this TensorFlow workflow; we have the DSL version up here at the top, and we have the data volume that we're going to be creating down here. We're not doing an initial ingress, although it's perfectly possible–we could definitely set up a data ingress through this volume section for the data that some of the examples we're going to be running with TensorFlow need. But because of how fuzzball allows workflow jobs to interact with this data volume that we're going to mount to this job, we can instead use TensorFlow–or, for example, PyTorch, or whatever ML framework we're using–to bring the data in.

If they've already got tooling that allows you to automatically download and bring those data sets in–like a lot of them do–we can definitely take advantage of that, as we'll see here in a few moments. We've got this data volume; we're not doing an ingress, but we are mounting it so that the model can have a place to store its data. We only have one job, which is to run TensorFlow. We're basically asking for similar resources to what we've asked for before: two cores, 14 GB of memory, and a single NVIDIA GPU. You'll see we're specifying that we don't want threads and we don't want the memory to be by core, so in this case we're going to look for 14 GB on the whole node and not 14 GB per core. We're also saying this will be non-exclusive.

So this could hypothetically–I'm not sure exactly how this works at the moment–but exclusive is essentially a way to tell whether or not you want this compute node to be used just for this job. We have a policy here that sets this to time out after 24 hours. As you'll see, I started an instance of this workflow quite some time ago and I'm still able to just open it up and access it. So you can imagine starting one at the start of the day and then being able to utilize that same node, that same workflow, and the same Jupyter notebook interface you're developing in throughout the day, instead of having to constantly get a new development session or something like that.

The command we're running–as I mentioned, the data volume here is mounted at /data–is basically just starting up the Jupyter notebook interface on port 8888. And then right down here, as I alluded to, we have the TensorFlow latest-GPU-Jupyter image, which is the container that we just found out there online. We also mention that we want an isolated network, which basically just sets this up so we can port forward to it–which we'll do in a moment to actually get access to the Jupyter notebook interface. I could develop this and work with this just fine from, say, the terminal in VS Code. I can do fuzzball workflow start, fuzzball workflow status, I can retrieve logs, I can do my container builds as I'm testing them–all that stuff, all through this.
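Pulling the pieces just described together, the whole TensorFlow workflow might be sketched roughly as follows. Field names and values are approximations from the narration, not the actual file:

```yaml
# Hypothetical TensorFlow workflow: one job, GPU resources, a 24-hour
# timeout, and an isolated network so the notebook can be port forwarded.
volumes:
  data:
    reference: volume://user/ephemeral  # no ingress; the model downloads its own data
jobs:
  tensorflow:
    image:
      uri: docker://tensorflow/tensorflow:latest-gpu-jupyter
    command: ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888"]
    mounts:
      data:
        location: /data          # scratch space for downloaded datasets
    resource:
      cpu:
        cores: 2
      memory:
        size: 14GB               # whole-node memory, not per core
      devices:
        nvidia.com/gpu: 1        # a single NVIDIA GPU
    policy:
      timeout:
        execute: 24h             # the 24-hour timeout mentioned above
    network:
      isolated: true             # enables port forwarding to the notebook
```

The long timeout is what makes the start-of-day, develop-all-day pattern described above possible: one workflow instance stays up and the notebook is reattached via port forwarding as needed.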

But just to show you a little sample of where things are going as far as that goes, I am going to make sure this is ready and then go ahead and share this screen right here. This is our graphical interface. This is very much a beta work in progress, so it will probably change quite drastically over time, but this is just an example of where we're going with providing a non-CLI interface for using fuzzball.

So you can see we're here in the workflow editor. If I want to edit that workflow that I was just showing you there in VS Code, all I have to do is open the file, tensorflow.yaml. You can see that I'm brought in here and I can click on the one job that we have, and over here set all those parameters that I was just showing in that YAML, essentially. So we have the command over here, we have the policy, we have the use of an isolated network namespace, the resources that we wanted to set up, the container that we were setting up–and you can see how this makes it much simpler than just writing YAML to be able to develop and represent a workflow.

Jupyter Notebook Component [57:56]

So, really quick, because we're starting to run out of time, I'm going to hop into the terminal that I prepared for this to show the Jupyter notebook component–so just one moment to hop into that. I have this running, as you can see; it's been running for some time, so this is fairly stable. I'll be able to hop into this without any issues and get right back in here where I was working before. So let's do fuzzball workflow log to get the access token that we need in order to be able to get into this interface. So it's fuzzball workflow log, and then just getting what was printed when that Jupyter notebook command was run. Then I do fuzzball workflow port-forward to indicate that I want to port forward this workflow.

And as I mentioned, I'm not connected to the public cloud provider this is running on at all–this is literally my home modem and router. But fuzzball is about to bring this workflow directly to my laptop here and give me an interface into this public cloud provider's data center to do my work on. So we'll specify the name of the workflow, the name of the job, and then the ports that we want to link–in this case, we'll do 8888:8888–and then we'll go ahead and Ctrl-click this to bring up that interface. I'll go ahead and switch my sharing to that so we can see that Jupyter notebook there.

Gregory Kurtzer:

And the format, Forrest, of 8888:8888–is that remote container port to local port, is that correct?

Forrest Burt:

Give me just one moment–fuzzball workflow port-forward. That is local port, then remote port.

Gregory Kurtzer:

Local port to remote port. Okay.

Forrest Burt:

I had to cancel that port forward, so we'll refresh this. There we go. Okay. Can everyone see the Jupyter notebook? Cool. So you can see I'm here on this interface: because this is port forwarded to port 8888 here on my local machine, I can access this interface being run on the public cloud provider. If I go over here to the terminal, I can open a terminal and run a quick nvidia-smi, showing that we have a GPU available here, this Tesla T4. We have no processes running on it at the moment and none of its memory being used. We'll go ahead and head into this TensorFlow tutorials directory that we have here. I will go ahead and show, kind of like we did in the last PyTorch demo, that there's no problem with me being able to upload stuff into this–I'm not sure if you can see the file upload window–but when I click upload, it basically just brings up the file picker on my system. I'll select a TensorFlow-with-GPU notebook, upload, and you can see that's instantly out there on this public cloud provider for me to be able to use. I'll open this classification one first, however, so we can run through it real quick.

This is basically just a Fashion-MNIST example in Jupyter, so this will be pretty quick. We're just going to import TensorFlow and some of the other code that we need, and we'll go ahead and download the Fashion-MNIST dataset. So I think that this is the download step–yes, there we go. So you can see that data downloads; as I mentioned, we have a place for this to be put into so that it gets downloaded easily. We'll just run some of the other bits of code that this needs to set this all up: show one of the images from the training data that we just downloaded, normalize and show, I believe, a few more samples, set up the layers for our network, and then we'll actually go ahead and run the model. So we'll notice that this will start printing out some information about some training [inaudible]. And if we go over here to nvidia-smi, we should be able to see a little bit different output. Yes–you'll notice that we're now using almost 14 GB of GPU memory on this node. And because of quirks of how our containers and their hosts work, we don't see the name of the running process there, but with that memory in use we can be fairly certain that this is doing some computing work.

Here's the actual training code. So you can see that we've sent some of that data to the GPU–that's why we get that memory usage there. If we go and look at it now, we can see this GPU having utilization and doing work training here. So we're going to do 10 epochs really quickly. You'll notice that the accuracy of our model is increasing over here–we're at about 89%, about 90% accuracy. Once this is done, we'll move on to other things. Cool. So that's done; we got about 91% accuracy in the end. We'll go ahead and evaluate the accuracy of it across our test data–it's about 88%, so that's not too bad. We'll go ahead and make some predictions and have it print out some numbers about them; we'll get some graphs that look a little better here in a second. You can see that we have basically done some testing and inference here: a correct guess, an incorrect guess by our model, and a bunch of guesses from our model over the test data that we brought in earlier. And then we can just run the end of this, make a prediction about a single image, see that the model is pretty sure about it, and then we'll close this out.

I can go over here and, just as easily, once I'm done with this, shut it down. That will mess up my port forward a little, so I will go ahead and reset that. Once I'm done with that, I can come over here. Oops–that's because I closed that kernel. There we go.

So we should now see there's no memory in use on the GPU and no running processes, because we shut down that kernel. If we were to come over here and open another Jupyter notebook kernel–you can see that I need to delete this and this–you can see that when we run this code here in this specific GPU notebook, we find the GPU at its device node. And then we'll go ahead and run this code here to compute the speedup this GPU gives us over the CPUs that we have on this node on the public cloud provider. Give that a few moments and, in the meantime, check and see if anything is hitting the GPU–and we can see that it is. We now have all the memory on this being used once again, and then we find a 73x speedup of the GPU over the CPU. So, yes–my apologies, this has run a little long–but that is TensorFlow running on fuzzball with a GPU, orchestrated through a public cloud provider.

The Role of DOI [01:04:59]

Zane Hamilton:

That's great stuff for us. Thank you very much. We had another question come in about using DOIs, and I'm going to combine it with the next question, because they're related. If you published a set of data with one DOI, and then later published another set with a different DOI, how could you retrieve those different sets, or let somebody recreate either one of them? It goes back to what Robert was talking about earlier, correct me if I'm wrong: being able to say, cryptographically, from the application side, these are the steps that ran to produce the data, and you can go back and rerun them. On the data side, it's the same as the previous answer: something we're looking at or working on, but not necessarily something you'll find in the data itself. Unless Greg or Robert want to tell me I'm wrong, which wouldn't surprise me either.

Gregory Kurtzer:

I think I was following, and I think it's a yes. I'll go with it.

Zane Hamilton:

And if you want to dive into that more, or have a specific conversation, go ahead and send a note to and we'll set up some time to get in and talk about it one on one. Maybe we're missing some context we need to give you a better answer, and maybe we can talk about the roadmap for getting to what you're trying to accomplish. So feel free to reach out.

Gregory Kurtzer:

And containers do that as well; people have actually assigned DOIs to containers themselves. So you can publish your scientific container and/or your data, and other people can come back, leverage your work in a paper, reproduce the outcome, validate the outcome, and then use it for follow-on work and research. That is definitely something that happens.

Forrest Burt:

It sounds like that provides just about the bottom line in reproducibility. Say it's been a couple of years since a paper came out: instead of having to somehow replicate that environment, with potentially legacy software at that point, or even a complex environment for something recent that your lab wants to look at, you can do what Greg describes. People have even, I believe the term is, DOIed containers. Fuzzball being based on containers makes it super easy to reach out, get that container, put it somewhere Fuzzball can access it, and then instantly be running the same thing that whoever published that container was running, with the same software install and the same versioning. Like I said, essentially the bottom line in reproducibility.

Gregory Kurtzer:

I believe the cool people call it "DOI." Just saying. I'm not a cool person, so I'm just guessing. I think it was also 1.21 gigawatts, if I remember correctly.

Zane Hamilton:

Yes. 14 would've been way more than enough for time travel.

Forrest Burt:

I think somebody just sent me a picture of someone saying that from the movie.

Gregory Kurtzer:

So thank you for that, Forrest, that was totally cool to see.

Forrest Burt:

Absolutely. I'm glad that it all worked out and that it was interesting to everyone. So very cool.

Zane Hamilton:

So we are over time. I wanted to say thank you, Robert and Greg, as usual, for joining us. Forrest, thanks for putting in the work to get those demos done and for sharing them with us. It's exciting to see. I know we'll be seeing a lot of changes coming up, and we'll be sharing as we go. Again, if you have any questions, reach out to us at CIQ; we'd be happy to get on a call or a meeting, whatever you need, to help answer your questions. If you like what you see here, go ahead and like and subscribe, and we will see you again next week. Thanks for the time.

Gregory Kurtzer:

Thanks everybody. Thank you, Zane.

Forrest Burt:

Thank you everyone. Thanks Forrest.