The Philosopher’s OS Platform
The philosopher's stone was a legendary substance believed to have the power to transmute base metals into gold or silver and to grant immortality to those who possessed it. The modern quest for the philosopher's stone is rooted in the belief that all data can be transformed into something useful through the power of artificial intelligence.
The engineering alchemy of pulling it all together can be intimidating. Building infrastructure for AI, machine learning (ML), and natural language processing (NLP) requires thoughtful planning. Many CIQ customers ask for guidance on how to approach this task, and one of the key factors to consider (regardless of hardware) is the base operating system. To support resource-intensive and complex AI workloads, the infrastructure needs to be robust, reliable, secure, highly performant, and supported by people who understand the peculiarities of High Performance Computing (HPC) workloads.
An enterprise-class Linux distribution like Rocky Linux offers stability, security, and support, which are critical for running complex and demanding AI workloads. Rocky Linux is built from the same upstream sources as Red Hat Enterprise Linux (RHEL), a collection of packages renowned for stability, reliability, and security. Rocky Linux is also designed to comply with (and is currently in the process of being certified against) various security standards, including Common Criteria, the Federal Information Processing Standard (FIPS 140-3), and the Payment Card Industry Data Security Standard (PCI-DSS), helping ensure that AI workloads are secure and meet compliance requirements.
Rocky Linux offers a stable base for running a wide variety of workloads, including large language models in the style of ChatGPT. To run such models, you will need to install Python along with several supporting packages and binaries, such as PyTorch and TensorFlow, the frameworks most commonly used with pre-trained language models. For even better performance, it is advisable to install NVIDIA's CUDA toolkit and cuDNN library, which accelerate deep learning operations on the GPU. These requirements can easily be met by setting up a powerful GPU-enabled machine running Rocky Linux. In the past, similar High Performance Computing builds were used for engineering simulations in large compute clusters with low-latency interconnects.
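As a rough sketch, the setup on a fresh Rocky Linux 9 host might look like the following. Package names and the NVIDIA repository URL should be verified against NVIDIA's current installation guide for your release, and the framework versions you pin will depend on your GPU and driver:

```shell
# Base Python tooling from the Rocky Linux repositories
sudo dnf install -y python3 python3-pip python3-devel

# Add NVIDIA's CUDA repository for RHEL 9-compatible distributions
# (URL per NVIDIA's Linux install guide), then the CUDA toolkit and cuDNN
sudo dnf config-manager --add-repo \
    https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf install -y cuda-toolkit cudnn

# Install the deep learning frameworks into a per-user virtual environment
python3 -m venv ~/ml-env
source ~/ml-env/bin/activate
pip install torch tensorflow
```

After installation, `nvidia-smi` should list your GPUs, and a quick `python -c "import torch; print(torch.cuda.is_available())"` confirms that PyTorch can see them.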
Rocky Linux also provides enterprises with scalability and flexibility. It is designed to run on a wide range of hardware architectures and supports sophisticated cloud High Performance Computing shapes. For example, Oracle Cloud Infrastructure offers strong price/performance for bare metal and virtual machine compute instances augmented by NVIDIA GPUs (A100, A10, V100, and P100). This allows enterprises to affordably choose the hardware that best suits their needs. Additionally, Rocky Linux supports containerization, which allows enterprises to easily deploy and scale their AI workloads using secure, HPC-friendly container platforms like Apptainer.
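To illustrate, deploying a GPU-enabled framework with Apptainer might look like the sketch below. The image tag and the `train.py` script are illustrative stand-ins for your own choices; the `--nv` flag passes the host's NVIDIA driver and devices into the container:

```shell
# Pull a PyTorch image from a registry and convert it to a SIF file
apptainer pull pytorch.sif docker://pytorch/pytorch:latest

# Run a training script (hypothetical train.py) with host GPUs
# exposed inside the container via --nv
apptainer exec --nv pytorch.sif python3 train.py
```

Because Apptainer images are single files and containers run unprivileged by default, this model fits shared HPC clusters where users cannot be granted root.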
Deploying AI workloads on an enterprise-class Linux distribution like Rocky Linux can help enterprises reduce their total cost of ownership (TCO). Rocky Linux is a free and open source distribution, which means that enterprises can avoid paying licensing fees for the operating system. Additionally, CIQ provides long-term support at very attractive price points for up to ten years, ensuring that the AI workloads remain supported and up-to-date for an extended period, reducing the need for costly upgrades or migrations.
An important and final point I’ll touch on is the exponential value of community. The Rocky Linux community has established a Special Interest Group (SIG) dedicated to supporting the Rocky Linux ecosystem for AI, ML, data science, and big data. This SIG is focused on ensuring that Rocky Linux is optimized for AI/ML workloads and can serve as a reliable, high-performance platform for data scientists, machine learning engineers, and other professionals working with large datasets. It’s made up of volunteers from the community who have a strong interest in these fields and are passionate about advancing the use of Rocky for AI, ML, and NLP workloads. They collaborate to develop and maintain packages and tools that are essential for carrying out this mission on the Rocky Linux platform. The SIG welcomes new contributors and encourages anyone with an interest in AI/ML, data science, and HPC to get involved. By joining the SIG, contributors can help shape the future of the Rocky platform and make it an even more compelling choice for those working in these exciting fields.
In conclusion, using Rocky Linux as a platform for AI workloads is widely recognized as a best practice for multiple reasons. By adopting this approach, organizations can ensure that their AI workloads are running on a dependable, secure, and auditable infrastructure that complies with regulations. Additionally, the platform has a significant, vibrant, and constantly expanding open source community, which allows for efficient collaboration, development, and deployment of AI applications.
CIQ is the founding sponsor of the Rocky Linux project and provides enterprise-class support and services. We offer two levels of support, standard and advanced, with different hours of coverage and escalation commitments. Our promise is to support people, not count sockets or cores, and to empower individuals to achieve amazing things, so you can maintain compliance and expand your capacity as needed without worrying about additional support expenses. Connect with CIQ: our engineering wizards and support alchemists can show you how to instrument a secure AI performance computing environment and offer better support, at a lower price point, than any other vendor on the market.