Nvidia's Vera Rubin appears to be focused on improving higher-end / "higher value" AI inference workloads. Where Blackwell improved "35X over Hopper" for the "Free and Medium tiers", Vera Rubin improves only 2-3X at those lower tiers (still good!), but brings a "35X improvement" at the high end.
A later slide showed the effect of adding Groq-3 chips (heavy on SRAM, optimized more for latency than bandwidth) to Vera Rubin arrays, which pushed the speed further "to the right", enabling even higher-end tiers (more guaranteed tokens/second for customers).
Mr. Huang unveiled a product incorporating technology from a start-up called Groq. The product pairs Nvidia's chips, which excel at receiving an A.I. request, with Groq's chips, whose components can put a charge into how Nvidia's chips operate.
Over the past year, A.I. companies have shifted their work. The A.I. systems they built using Nvidia’s chips have improved at creating software code, doing research and making images and videos. These capabilities, the result of a process known as inference, have put more value on chips that can generate data as inexpensively and quickly as possible.
......................
Nvidia’s deal with Groq also helps it with manufacturing problems that are constraining how fast its sales can grow, said Umesh Padval, a managing partner at the investment firm Seligman Ventures. Groq’s chips are made by Samsung Electronics, not Taiwan Semiconductor Manufacturing Company, which makes most of Nvidia’s chips and is struggling to meet the company’s demand, Mr. Padval said. And unlike Nvidia’s chips, Groq’s don’t require high-bandwidth memory chips, whose manufacturers have also been swamped with orders.
“It’s a brilliant supply-chain move,” Mr. Padval said.