The AI arms race has entered a new phase: one where speed alone no longer defines leadership.
The real contest spans physics, power grids, and global supply chains, all straining under the weight of ever-larger models and rising compute demands. Nvidia, already the dominant force in AI processors, is pressing its advantage, pushing deeper into the infrastructure layer that determines how far and how fast artificial intelligence can scale.
Enter Vera Rubin, a sprawling, liquid-cooled super system that Nvidia says will deliver 10 times more performance per watt than its predecessor, Grace Blackwell. In a world where AI models are ballooning in size, data centers are straining power infrastructure, and memory shortages loom like gathering storm clouds, efficiency has become the new currency of compute.
Scheduled to debut later this year, Vera Rubin is not just another hardware refresh. It is Nvidia’s latest bid to define the next phase of AI infrastructure, where raw power, modular design, and energy economics collide.
A look inside Vera Rubin
Here are some quick facts about Vera Rubin:
- It comprises 72 Rubin graphics processing units (GPUs) and 36 Vera central processing units (CPUs), mainly sourced from Taiwan Semiconductor Manufacturing Co. (TSMC).
- Its other parts, such as liquid cooling elements, power systems, and compute trays, come from more than 80 suppliers in at least 20 countries, the company told CNBC.
- The racks are manufactured in the US and other countries. They weigh close to two tons and have about 1,300 total microchips, compared with Grace Blackwell’s 864.
CNBC characterized Vera Rubin as “a simpler, modular system intended to ease installs and repairs.” Each superchip slides out of one of the rack’s 18 compute trays in seconds. In contrast, in the Blackwell system, those components are soldered to the board.
When Grace Blackwell went into production in 2024, it was considered a game-changer because of how much compute was possible with a single system.
So far, Meta has said it will use Vera Rubin in its data centers by 2027. Other customers include OpenAI, Anthropic, Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure, according to Nvidia.
The power conundrum
Even as it touts Vera Rubin’s greater efficiency, one challenge Nvidia faces is rising memory costs, fueled by the seemingly insatiable appetite for AI. That demand has created an unprecedented memory shortage, which is not expected to ease anytime soon.
And while Nvidia is the market leader in AI chips, it faces growing competition from AMD, Broadcom, and Google.
Nvidia said that while the new system will consume about twice as much power as its predecessor, the 10-times gain in performance per watt makes it significantly more efficient overall.
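The arithmetic behind those two claims is straightforward: doubling the power draw while delivering ten times the performance per watt implies roughly a 20-fold jump in total throughput. A minimal sketch, using normalized placeholder values rather than any published rack specs:

```python
# Back-of-the-envelope check of Nvidia's stated figures (illustrative only;
# the baseline values below are normalized placeholders, not real specs).
blackwell_power = 1.0   # normalized rack power for Grace Blackwell
blackwell_perf = 1.0    # normalized rack performance

rubin_power = 2 * blackwell_power   # "about twice as much power"
perf_per_watt_gain = 10             # "10 times more performance per watt"

# performance = (performance per watt) * (watts)
rubin_perf = (blackwell_perf / blackwell_power) * perf_per_watt_gain * rubin_power

print(rubin_perf / blackwell_perf)  # → 20.0
```

In other words, the efficiency claim and the power claim compound rather than offset each other, which is why Nvidia frames the higher consumption as a net win.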
Dion Harris, head of AI infrastructure at Nvidia, told CNBC that the company is “aligning to make sure that everything we’re shipping will be met by our supply chain. We’re in good shape.”
Harris also said that Vera Rubin is the first Nvidia system that is 100% liquid cooled, which helps data centers consume “much less water” than traditional evaporative cooling.
In Nvidia’s Q4 earnings report released Wednesday, CEO Jensen Huang said the “agentic AI inflection point has arrived,” and praised both AI systems. “Grace Blackwell with NVLink is the king of inference today — delivering an order-of-magnitude lower cost per token — and Vera Rubin will extend that leadership even further.”
Noting that enterprise adoption of agents is “skyrocketing,” Huang also said that customers are clamoring to invest in AI compute.
Still, as significant data center growth is planned to meet that demand, many of those projects face scrutiny and environmental backlash over the amount of power they would consume, with some industry analysts cautioning that many may not ultimately get built.