Using Apptainer to Run DALL·E mini
Image-generating AI models are currently taking the internet by storm, and some of the most popular and well-known belong to the DALL·E series of models created by OpenAI. The most advanced of these, DALL·E 2, can generate extremely realistic (or surrealistic) images from simple prompts written in natural language, including art in the style of many famous artists. Prompts are thus limited essentially only by your creativity: “A cat in a spacesuit on the moon” or “A painting called ‘The Center of the Universe’ by Salvador Dalí” would both hypothetically produce valid, comprehensible results from DALL·E 2. The results are often so novel that they redefine what people thought AI was capable of. Access to DALL·E 2 is currently granted through a waitlist, but there’s a freely available open source alternative that’s a lot of fun to tinker with.
DALL·E mini is an open source AI model that produces very interesting and impressive results, despite being smaller and less complex than OpenAI’s original DALL·E model. Developed and trained on limited hardware resources and against a short deadline, DALL·E mini leverages other open source code and pre-trained models to increase its effectiveness. Much more information about the project, including reports that dive into the model’s technical details, can be found at the DALL·E mini GitHub (https://github.com/borisdayma/dalle-mini). Pre-trained versions of DALL·E mini can be downloaded from wandb.ai; at a recent company webinar, we deployed one of these through CIQ’s Fuzzball platform on public cloud GPU resources to generate images from prompts provided by our live audience.
Fuzzball, CIQ’s novel HPC 2.0 cloud/container-native high performance computing automation and orchestration stack, can get you up and running quickly with pre-trained models like those available for DALL·E mini. The code/container attached to this article is what we ran through a Fuzzball workflow at our webinar; it is adapted from the inference pipeline code and Dockerfile provided by the DALL·E mini developers at the GitHub linked above. Our version of the script writes the generated images to the directory it’s run from rather than displaying them in a notebook, and takes three prompts for the model as command line input. The Apptainer container definition file can be used to build a Rocky Linux-based container that includes both the script itself and the tooling/dependencies necessary to run the script on a GPU. Please see the included README for further instructions.
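To make the script's command line behavior concrete, here is a minimal Python sketch of the pattern described above: read exactly three prompts from the command line and write one image file per prompt into the directory the script is run from. This is not the code attached to the article. The function names (`parse_prompts`, `output_path`, `save_images`), the file naming scheme, and the `generate` callable that stands in for the actual DALL·E mini inference step are all hypothetical, chosen only for illustration.

```python
import argparse
import re
import sys
from pathlib import Path

# Example prompts taken from the webinar results; used only when no
# arguments are given on the command line.
EXAMPLE_PROMPTS = [
    "A bison on a keyboard",
    "A farmhouse, with a cow, on the sun",
    "A CIQ-branded supercomputer",
]


def parse_prompts(argv):
    """Return the three prompts given in argv (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Generate one image per prompt from three text prompts."
    )
    parser.add_argument(
        "prompts", nargs=3, metavar="PROMPT",
        help="a natural-language prompt (quote prompts that contain spaces)",
    )
    return parser.parse_args(argv).prompts


def output_path(prompt, index, directory=None):
    """Build a file name from the prompt; defaults to the current directory."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_") or "prompt"
    base = Path(directory) if directory is not None else Path.cwd()
    return base / f"{index}_{slug}.png"


def save_images(prompts, generate, directory=None):
    """Write generate(prompt) to disk for each prompt and return the paths.

    `generate` is an assumed stand-in for model inference: it takes a prompt
    string and must return the encoded image as bytes.
    """
    paths = []
    for index, prompt in enumerate(prompts, start=1):
        path = output_path(prompt, index, directory)
        path.write_bytes(generate(prompt))
        paths.append(path)
    return paths


def main(argv):
    """Print where each prompt's image would be written (no inference here)."""
    for index, prompt in enumerate(parse_prompts(argv), start=1):
        print(f"{prompt!r} -> {output_path(prompt, index)}")


if __name__ == "__main__":
    main(sys.argv[1:] or EXAMPLE_PROMPTS)
```

In a real run, `save_images` would be called with a function that wraps the model's inference pipeline and returns PNG bytes; inside a container, the "current directory" is whichever working directory the workflow or host bind mount provides.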
A web version of the model that you can use to generate your own images is also available at https://huggingface.co/spaces/dalle-mini/dalle-mini.
Here are the results from our webinar demo. Again, all of them were generated from live audience prompts by deploying the DALL·E mini model on public cloud GPU resources orchestrated with Fuzzball:
Prompt: “A squirrel wearing a sombrero in the jungle”
Prompt: “A rainbow colored cow fortune teller” (based on the classic Apptainer/Singularity lolcow demo)
Prompt: “A farmhouse, with a cow, on the sun”
Prompt: “A saddle on a cow in a car race”
Prompt: “A screaming gopher caught in a spiderweb”
Prompt: “Richard Nixon in a tutu throwing confetti”
Prompt: “A diver drinking coffee in a mountaintop cafe”
Prompt: “A screaming spider, caught in a gopher web”
Prompt: “Three dogs playing with an elephant on mars”
Prompt: “A bison on a keyboard”
Prompt: “A space shuttle backpack riding on an elephant”
Prompt: “A CIQ-branded supercomputer”
And if you’ve made it this far, we’d like to also share some renderings of our dear founder, Gregory Kurtzer, that we generated in the style of Picasso during testing of the same Fuzzball workflow that generated the above images:
Prompt: “A Picasso-style painting of Gregory Kurtzer”
If you would like to play with DALL·E mini yourself, contact us for the starting files you will need.
Meet the Author:
Forrest Burt
Forrest Burt is an HPC systems engineer at CIQ, where he works in-depth with containerized HPC and the Fuzzball platform. He was previously an HPC system administrator while a student at Boise State University, supporting campus and national lab researchers on the R2 and Borah clusters while obtaining a B.S. in computer science.