From Tom's Hardware: In a bid to offer unbeatable performance, Nvidia had planned to use four GPU chiplets in its Rubin Ultra AI accelerator, due in 2027. However, due to concerns about the manufacturability of such a solution, the company decided to cancel that configuration in favor of a dual-GPU design that is easier to produce, according to SemiAnalysis.
The Rubin Ultra GPU with four compute chiplets was arguably one of Nvidia's most ambitious projects in recent years: it promised not only to double performance compared to the original Rubin (which uses two compute chiplets), but also to raise the complexity of Nvidia's data center GPUs to levels never seen before. However, connecting four near-reticle-sized dies using existing advanced packaging technologies is a tremendous engineering challenge, and cooling four complex dies and 16 HBM4E modules is hard and costly. As a result, due to 'manufacturing execution concerns,' Nvidia reportedly canceled the four-die version of Rubin Ultra in favor of a design with two compute chiplets. Note that the information is unofficial, so take it with a grain of salt. We've reached out to Nvidia for comment.
As a consequence, Nvidia's 'new' Rubin Ultra would be around half as powerful as the originally planned four-chiplet version, which would certainly make it less competitive against rival offerings, namely AMD's Instinct MI500-series. Of course, Nvidia will still likely optimize its Rubin Ultra design to squeeze some additional performance out of the AI accelerator to justify the upgrade.
Also, keep in mind that Nvidia's Rubin Ultra uses HBM4E memory instead of the HBM4 used by the original Rubin. Furthermore, starting with Rubin GPUs, Nvidia plans to offer liquid-cooled Kyber rack-scale systems that increase the GPU count per scale-up domain to at least 144 packages, which will increase the compute performance Nvidia can sell to its customers.