Cerebras CS-1: the smallest supercomputer for artificial intelligence, built on the largest processor

At Supercomputing 2019, held recently in Denver, USA, Cerebras Systems introduced a supercomputer designed for artificial intelligence workloads and based on the largest processor built to date. According to the company, the computing power of this supercomputer, the CS-1, is equivalent to that of a system made up of hundreds of racks filled with GPUs and drawing hundreds of kilowatts of power. The CS-1 itself occupies one third of a standard rack and consumes only 17 kW.

“The CS-1 system is the fastest computer for artificial intelligence,” says Andrew Feldman, founder and CEO of Cerebras Systems. “A previous-generation system built on Google's TPU tensor processors takes up ten racks and consumes 100 kilowatts, while providing only one third of the processing power of the CS-1 system.”
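The comparison in the quote can be turned into a rough performance-per-kilowatt figure. The sketch below uses only the numbers given in this article (17 kW for the CS-1, 100 kW and one third of the performance for the TPU-based system) and normalizes the CS-1's performance to 1.0. The function and constant names are illustrative, and the result is only as reliable as the vendor's own claims.

```python
# Figures quoted in the article; performance is normalized so that CS-1 = 1.0.
CS1_POWER_KW = 17.0            # CS-1 power draw stated in the article
TPU_POWER_KW = 100.0           # TPU-based system power draw from the quote
TPU_RELATIVE_PERF = 1.0 / 3.0  # "one third of the processing power of the CS-1"


def perf_per_kw(relative_perf: float, power_kw: float) -> float:
    """Normalized performance delivered per kilowatt of power."""
    return relative_perf / power_kw


def efficiency_ratio() -> float:
    """How many times more performance per kW the CS-1 delivers, per the quoted figures."""
    cs1 = perf_per_kw(1.0, CS1_POWER_KW)
    tpu = perf_per_kw(TPU_RELATIVE_PERF, TPU_POWER_KW)
    return cs1 / tpu


if __name__ == "__main__":
    print(f"CS-1 advantage in performance per kW: {efficiency_ratio():.1f}x")
```

On these figures the advantage works out to 300/17, or roughly 17.6 times, assuming both power numbers describe comparable sustained loads.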

The CS-1 supercomputer is designed to accelerate the training of large neural networks, a process that can normally take weeks. Powered by a processor with 400 thousand cores, the CS-1 is expected to handle such tasks in minutes or even seconds. However, the system's architecture makes it impossible to demonstrate its performance with standard benchmarks such as MLPerf. Instead, Cerebras Systems lets potential customers see for themselves by training their own neural network models on a dedicated CS-1 computer.

This approach appears to have paid off: the first customer to receive its own CS-1 supercomputer was Argonne National Laboratory, and the next system will be sent to Lawrence Livermore National Laboratory.

The CS-1 system runs the standard PyTorch and TensorFlow frameworks, which have been optimized as far as possible for its architecture. Moreover, this architecture makes it easy to increase computing power by connecting several systems in parallel. In one experiment at Cerebras Systems, 32 CS-1 systems were connected in parallel and delivered a 32-fold increase in computing power.
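A 32-fold gain from 32 systems corresponds to perfectly linear scaling. As a small illustration, the sketch below computes parallel scaling efficiency, defined here as the measured speedup divided by the number of systems. The function name is illustrative, and the only data used is the 32-system figure reported above.

```python
def scaling_efficiency(speedup: float, num_systems: int) -> float:
    """Fraction of ideal (linear) speedup achieved: 1.0 means perfect scaling."""
    if num_systems <= 0:
        raise ValueError("num_systems must be positive")
    return speedup / num_systems


if __name__ == "__main__":
    # Figures reported in the article: 32 systems, 32-fold increase.
    efficiency = scaling_efficiency(speedup=32.0, num_systems=32)
    print(f"Scaling efficiency: {efficiency:.0%}")
```

An efficiency of 1.0 means no computing power is lost to coordination between systems; clusters typically fall below this as more nodes are added.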

Structure of the CS-1 computer

“This distinguishes our architecture from architectures based on GPUs and central processors,” says Feldman. “When you group GPUs in a cluster, its behavior is different from that of a single computing system. Despite all efforts, the behavior of the cluster remains the combined behavior of a lot of small computers.”

The first use of CS-1 supercomputers will be research aimed at finding a cure for cancer, carried out by specialists at the National Cancer Institute. After that, astronomers will receive a supercomputer to simulate the behavior of colliding black holes and the gravitational waves produced by these collisions. A task similar to the first of these has been tackled before, and it tied up 1,024 of the 4,392 nodes of the Theta supercomputer for a long time.