From TechCrunch: There’s a shortage of GPUs as demand for generative AI, which is often trained and run on GPUs, grows. Nvidia’s best-performing chips are reportedly sold out until 2024. The CEO of chipmaker TSMC was even less optimistic recently, suggesting that the shortage of GPUs from Nvidia, and from Nvidia’s rivals, could extend into 2025.
To lessen their reliance on GPUs, firms that can afford it (that is, tech giants) are developing — and in some cases making available to customers — custom chips tailored for creating, iterating and productizing AI models. One of those firms is Amazon, which today at its annual re:Invent conference unveiled the latest generation of its chips for model training and inferencing (i.e. running trained models).
The first of the two, AWS Trainium2, is designed to deliver up to 4x better performance and 2x better energy efficiency than the first-generation Trainium, unveiled in December 2020, Amazon says. Set to be available in EC2 Trn2 instances in clusters of 16 chips in the AWS cloud, Trainium2 can scale up to 100,000 chips in AWS’ EC2 UltraCluster product.
One hundred thousand Trainium2 chips deliver 65 exaflops of compute, Amazon says, which works out to 650 teraflops per chip. (An exaflop is a quintillion, or 10^18, compute operations per second; a teraflop is a trillion, or 10^12.) There are likely complicating factors that make that back-of-the-napkin math less than exact. But assuming a single Trainium2 chip can indeed deliver ~650 teraflops of performance, that puts it well above the capacity of Google’s custom AI training chips circa 2017.
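As a sanity check on that arithmetic, here is a minimal sketch in Python. The unit conversion and the ~45-teraflop TPU v2 comparison figure (Google’s published per-chip spec for its 2017 chip) are added here for illustration and are not from Amazon’s announcement.

```python
# Napkin math behind the article's figures. These are Amazon's headline
# numbers, not measured benchmarks; the TPU v2 figure is Google's
# published per-chip spec, included only for the comparison.

EXAFLOP_IN_TERAFLOPS = 1_000_000  # 1 exaflop = 10**18 FLOP/s = 10**6 teraflops

cluster_exaflops = 65    # Amazon's claim for a 100,000-chip UltraCluster
cluster_chips = 100_000

per_chip_teraflops = cluster_exaflops * EXAFLOP_IN_TERAFLOPS / cluster_chips
print(f"~{per_chip_teraflops:,.0f} teraflops per Trainium2 chip")  # ~650

# Comparison point: Google's 2017-era TPU v2 was rated at ~45 teraflops per chip.
tpu_v2_teraflops = 45
print(f"~{per_chip_teraflops / tpu_v2_teraflops:.0f}x a 2017 TPU v2 chip")  # ~14x
```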