Nvidia CEO Jensen Huang took the stage at San Jose's SAP Center on Monday for GTC 2026, the company's flagship annual conference, formally launching the Vera Rubin GPU platform to an audience of over 30,000 attendees from 190 countries.

A Generational GPU Leap

The Rubin GPU, built on TSMC's 3nm process, delivers a dramatic step up from Blackwell across every metric. With 336 billion transistors, up from Blackwell's 208 billion, the chip packs 288GB of HBM4 memory at 22 TB/s bandwidth, nearly triple Blackwell's 8 TB/s on HBM3e. Peak FP4 inference performance reaches 50 petaflops, a 2.5x to 5x improvement, while FP4 training lands at 35 petaflops, 3.5x faster than the prior generation.
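For readers who want to sanity-check the quoted multipliers, here is a quick back-of-the-envelope sketch. The Rubin figures come from the announcement; the Blackwell FP4 baselines are inferred from the stated ratios rather than taken from a spec sheet, so treat them as assumptions.

```python
# Sanity check of the generational ratios quoted in the article.
# Blackwell FP4 baselines are back-derived from the stated 2.5x/3.5x
# multipliers (assumptions, not official spec-sheet numbers).

rubin = {
    "transistors_b": 336,   # billions of transistors
    "hbm_bw_tbs": 22,       # HBM4 bandwidth, TB/s
    "fp4_infer_pf": 50,     # peak FP4 inference, petaflops
    "fp4_train_pf": 35,     # FP4 training, petaflops
}
blackwell = {
    "transistors_b": 208,
    "hbm_bw_tbs": 8,        # HBM3e bandwidth, TB/s
    "fp4_infer_pf": 20,     # implied by the quoted 2.5x
    "fp4_train_pf": 10,     # implied by the quoted 3.5x
}

for key in rubin:
    ratio = rubin[key] / blackwell[key]
    print(f"{key}: {ratio:.2f}x")
```

The transistor count works out to about a 1.6x increase and the memory bandwidth to 2.75x, consistent with the "nearly triple" characterization.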

The platform also introduces the Vera CPU, 88 custom Olympus ARM cores connected to Rubin GPUs via NVLink-C2C at 1.8 TB/s, specifically designed for orchestrating agentic AI workloads where CPUs have become the bottleneck.

Built for Agentic AI

Nvidia's framing at GTC 2026 reflects a broader industry shift. As AI moves from chatbots to multi-step agentic workflows, the demand profile is changing: more sequential general-purpose compute, heavier data movement between agents, and far higher token generation rates.

"These agentic systems are spawning off different agents working as a team," Huang said on Nvidia's earnings call last month. The Vera Rubin platform, with its tight CPU-GPU coupling and rack-scale NVL72/NVL144/NVL576 configurations, is engineered for exactly that workload.

What's Next

Vera Rubin entered full production in early 2026. Nvidia's roadmap points to Vera Ultra for the second half of 2027. The N1 and N1X consumer laptop chips, ARM-based SoCs co-developed with MediaTek, are also expected to bring Nvidia's AI capabilities to thin-and-light Windows laptops later this year.