The design can run a large neural network more efficiently than banks of GPUs wired together. But manufacturing and operating the chip is a challenge, requiring new methods for etching features into silicon, a design that includes redundancies to account for manufacturing flaws, and a new water-cooling system to keep the giant chip cool.
To build a cluster of WSE-2 chips capable of running record-sized AI models, Cerebras had to solve another engineering challenge: how to move data in and out of the chip efficiently. Regular chips have their own on-board memory, but Cerebras developed an off-chip memory box called MemoryX. The company also created software that allows a neural network to be partially stored in that off-chip memory, with only the computations shuttled over to the silicon chip. And it built a hardware and software system called SwarmX that ties everything together.
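The idea of keeping a model's parameters in external memory and sending only what the chip needs at any moment can be illustrated with a minimal sketch. This is not Cerebras's actual software; the names (`external_memory`, `stream_forward`) and the layer-by-layer streaming scheme are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for off-chip memory (MemoryX in Cerebras's design): it holds
# the weights for every layer of the network.
external_memory = [rng.standard_normal((64, 64)) for _ in range(8)]

def stream_forward(x, memory):
    """Run a forward pass, fetching one layer's weights at a time.

    The accelerator never holds the whole network: each layer's weights
    are streamed in, used for the computation, then discarded.
    """
    for weights in memory:                 # fetch one layer from off-chip memory
        x = np.maximum(x @ weights, 0.0)   # compute on the "chip" (matmul + ReLU)
    return x

activations = stream_forward(rng.standard_normal(64), external_memory)
print(activations.shape)  # (64,)
```

In this toy version the model's memory footprint on the device is one layer's weights rather than all eight, which is the basic reason such a scheme lets a fixed-size chip train models far larger than its on-board memory.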
“They can improve the scalability of training to huge dimensions, beyond what anyone else is doing today,” says Mike Demler, senior analyst at The Linley Group and editor-in-chief of The Microprocessor Report.
Demler says it’s not yet clear what market there will be for the cluster, especially since some potential customers are already designing their own more specialized chips in-house. He adds that the actual performance of the chip, in terms of speed, efficiency and cost, is not yet clear. Cerebras has not published any benchmark results so far.
“There is a lot of awesome engineering going into the new MemoryX and SwarmX technologies,” Demler says. “But, just like the processor, these are highly specialized things; it only makes sense for training the very largest models.”
Cerebras chips have so far been adopted by labs that need intensive computing power. Early clients include Argonne National Labs, Lawrence Livermore National Lab, pharmaceutical companies such as GlaxoSmithKline and AstraZeneca, and what Feldman describes as “military intelligence” organizations.
This shows that the Cerebras chip can be used for more than just powering neural networks; the calculations these labs run involve similarly massive parallel mathematical operations. “And they’re always hungry for more computing power,” says Demler, who adds that the chip could conceivably become important to the future of supercomputing.
David Kanter, an analyst at Real World Technologies and executive director of MLCommons, an organization that measures the performance of different AI algorithms and hardware, says he sees a future market for much larger AI models in general. “I generally tend to believe in data-centric ML, so we want larger datasets that allow us to build larger models with more parameters,” Kanter says.