
Jensen Huang on strategy and vision: Nvidia's Future, Physical AI, Rise of the Agent, Inference Explosion, AI PR Crisis

user nl

Interesting podcast with Jensen Huang at GTC 2026:


Mar 19, 2026 All-In Podcast

(0:00) Jensen Huang joins the show!
(1:00) Acquiring Groq and the inference explosion
(9:27) Decision making at the world's most valuable company
(11:22) Physical AI's $50T market, OpenClaw's future, the new operating system for modern AI computing
(17:12) AI's PR crisis, refuting doomer narratives, Anthropic's comms mistakes
(21:22) Revenue capacity, token allocation for employees, Karpathy's autoresearch, agentic future
(31:24) Open source, global diffusion, Iran/Taiwan supply chain impact
(40:19) Self-driving platform, facing competition from active customers, responding to growth slowdown predictions
(48:06) Datacenters in space, AI healthcare, Robotics
(56:44) OpenAI/Anthropic revenue potential, how to build an AI moat
(59:38) Advice to young people on excelling in the AI era
 
(1:00) Acquiring Groq and the inference explosion

Thought this section was great. Some very interesting key points:

Disaggregated Inference and Groq's Role
* AI data centers are turning into huge "computers" built from masses of heterogeneous processors, connectivity, and storage racks
* All in service of disaggregated inference at the lowest possible cost per token
* Managed by a data-center (token factory) OS, Dynamo
* Groq plays a key role as agent usage grows, providing fast, low-latency decode and token generation. Jensen advocates adding 25% Groq capacity to existing and future AI data centers to better support agents
* Fast tokens (low TTFT, high TPS) are more valuable than slower ones. Pareto curves of cost/power per token vs. interactivity (TTFT, TPS) are critical to understanding NVIDIA's advantages and finding the best cost curves.
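The Pareto-curve idea above can be made concrete with a small sketch. This is purely illustrative: the config names and dollar/throughput numbers are hypothetical, not real hardware pricing. A deployment sits on the frontier if no other option is both cheaper per token and more interactive (higher tokens/sec per user):

```python
# Illustrative sketch with hypothetical numbers: which serving configs sit on
# the Pareto frontier of cost per token vs. interactivity (tok/s per user)?

# Each config: (name, cost_per_token_usd, tokens_per_sec_per_user)
configs = [
    ("batch-heavy GPU",  0.000002,  20),   # cheap but slow per user
    ("balanced GPU",     0.000004,  60),
    ("low-latency ASIC", 0.000008, 200),   # pricey but very interactive
    ("dominated setup",  0.000009,  50),   # worse on both axes than "balanced GPU"
]

def pareto_frontier(points):
    """Keep configs that are not dominated: no other config is at least as
    cheap AND at least as fast, with a strict improvement on one axis."""
    frontier = []
    for name, cost, tps in points:
        dominated = any(
            c <= cost and t >= tps and (c < cost or t > tps)
            for _, c, t in points
        )
        if not dominated:
            frontier.append((name, cost, tps))
    return frontier

for name, cost, tps in pareto_frontier(configs):
    print(f"{name}: ${cost:.6f}/token at {tps} tok/s per user")
```

The point of the exercise: "cheapest per token" and "most interactive" are different winners, which is why Jensen frames the Groq-plus-GPU mix as moving along the curve rather than picking a single best box.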

Three New Basic Flavors of Compute
* Data center training (and inference)
* Physical (physics-based AI)
* Edge AI

One caveat: Jensen says NVIDIA "invented" disaggregated inference, but I think the timeline was more complicated.
* 2023 - Researchers at Peking University and UCSD develop DistServe, the first disaggregated inference system
* Late 2024 - DeepSeek delivers fast commercial disaggregated inference that also uses multi-head latent attention
* March 2025 - NVIDIA rolls out a more general open-source disaggregation data center "OS" called Dynamo, which works with the whole NVIDIA ecosystem and beyond.
 