AI developers have long been held back by the time it takes to train AI models. Training complex models requires enormous amounts of computation, which directly limits the pace of innovation.
To tackle this problem, G42, an Abu Dhabi-based cloud computing firm, has partnered with Cerebras, a US-based AI firm, to launch Condor Galaxy, which the companies describe as the world’s largest supercomputer for AI training. Condor Galaxy is a planned network of nine interconnected supercomputers intended to transform AI computation and significantly reduce training time.
Let’s dive in and uncover the details behind the transformative supercomputer network.
The companies have already launched the first of the nine planned supercomputers, Condor Galaxy 1 (CG-1). G42 officials said that CG-1 has a capacity of 4 exaFLOPs (a unit used to measure the speed and performance of supercomputers) and 54 million cores. For context, Frontier, the next-largest system, is rated at 1.1 exaFLOPs.
Cerebras and G42 offer CG-1 as a cloud service, allowing customers to use the supercomputer without having to manage or distribute models across physical systems. The two have announced plans to launch CG-2 and CG-3 in the US by early 2024.
This is the first time Cerebras has partnered to build, manage, and operate an AI supercomputer. Talking about the collaboration, Talal Alkaissi, CEO of G42 Cloud, said: “Collaborating with Cerebras to rapidly deliver the world’s fastest AI training supercomputer and laying the foundation for interconnecting a constellation of these supercomputers across the world has been enormously exciting. This partnership brings together Cerebras’ extraordinary compute capabilities, together with G42’s multi-industry AI expertise”.
Training large AI models involves massive datasets, extensive computing, and specialised AI expertise. Condor Galaxy 1 is designed to deliver on all three fronts, and the wider supercomputer network aims to democratise and simplify AI computing.
Based in Santa Clara, California, CG-1 has already been used to advance AI models in Arabic bilingual chat, healthcare and climate studies. The supercomputer links 64 CS-2 systems (Cerebras’ flagship product) together in a single platform.
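To put the headline numbers in perspective, the figures quoted in this article can be divided across the 64 CS-2 systems. The short Python sketch below is only back-of-the-envelope arithmetic: it assumes compute and cores are split evenly across systems, the constant and function names are illustrative, and the ratio to Frontier compares headline ratings that may not have been measured under the same conditions.

```python
# Figures quoted in the article for CG-1 and Frontier.
CG1_EXAFLOPS = 4.0          # stated capacity of CG-1
CG1_CORES = 54_000_000      # stated core count of CG-1
CS2_SYSTEMS = 64            # number of linked CS-2 systems
FRONTIER_EXAFLOPS = 1.1     # stated rating of Frontier


def per_system_share(total, systems):
    """Divide a cluster-wide total evenly across systems (an assumption)."""
    return total / systems


# 1 exaFLOP = 1,000 petaFLOPs
petaflops_per_system = per_system_share(CG1_EXAFLOPS * 1000, CS2_SYSTEMS)
cores_per_system = per_system_share(CG1_CORES, CS2_SYSTEMS)
ratio_to_frontier = CG1_EXAFLOPS / FRONTIER_EXAFLOPS

print(f"PetaFLOPs per CS-2 system: {petaflops_per_system}")
print(f"Cores per CS-2 system: {cores_per_system:,.0f}")
print(f"CG-1 headline rating relative to Frontier: {ratio_to_frontier:.1f}x")
```

On these assumptions, each CS-2 system accounts for roughly 62.5 petaFLOPs and about 840,000 cores.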
The CG supercomputer network focuses on simplicity and speed. CG-1 offers support for training with long sequence lengths of up to 50,000 tokens out of the box, without requiring any special software libraries. The supercomputers have the capacity to run the largest AI models without distributing work over thousands of GPUs.
Forrest Norrod, executive vice president and general manager of Data Centre Solutions Business Group of AMD, commented: “Driven by more than 70,000 AMD EPYC processor cores, Cerebras’ Condor Galaxy 1 will make accessible vast computational resources for researchers and enterprises as they push AI forward”.
G42 is taking active steps to drive AI-led digital transformation across industries. Its work with diverse datasets will allow the supercomputer’s users to train new foundation models. Offered through the cloud, CG-1 is intended to cut training timelines dramatically and accelerate innovation.
Alkaissi says the system will be used to address and alleviate society’s most pressing challenges across such areas as energy, healthcare and climate change. He said: “In the UAE, there's a national AI plan, part of which is to use AI to improve productivity at all levels across the government and economy, and we believe the compute we are providing through this partnership will play a large role in the ability to create disruptive change in various different sectors”.
Both companies share a vision to accelerate innovation and drive productive digital transformation. With CG-1, they have taken a major step towards solving industry challenges and expanding the reach of AI.
Andrew Feldman, CEO of Cerebras Systems, said: “Many cloud companies have announced massive GPU clusters that cost billions of dollars to build, but are extremely difficult to use. Distributing a single model over thousands of tiny GPUs takes months of time from dozens of people with rare expertise. CG-1 eliminates this challenge. Setting up a generative AI model takes minutes, not months and can be done by a single person”.
Cerebras and G42’s ambitious partnership breaks new ground on several fronts. The two companies are delivering a full-service AI training solution by combining the strengths of hardware engineers, data engineers, AI scientists, and industry specialists.
Their first product, CG-1, is set to accelerate innovation and support hundreds of AI projects across the globe. As it grows, the supercomputer network has considerable potential to simplify AI model training and help solve pressing industry challenges.