
Jensen Huang on strategy and vision: Nvidia's Future, Physical AI, Rise of the Agent, Inference Explosion, AI PR Crisis

user nl

Interesting podcast with Jensen Huang at GTC 2026:

Special episode this week. We've preempted the weekly show and there's only three people we preempt the show for. President Trump, Jesus, and Jensen. And uh I'll let you pick which order we do that. Uh but what an amazing run you've had and a great event. Uh every industry is here. Every tech company is here. Every AI company is here. Incredible.


Mar 19, 2026 All-In Podcast

(0:00) Jensen Huang joins the show!
(1:00) Acquiring Groq and the inference explosion
(9:27) Decision making at the world's most valuable company
(11:22) Physical AI's $50T market, OpenClaw's future, the new operating system for modern AI computing
(17:12) AI's PR crisis, refuting doomer narratives, Anthropic's comms mistakes
(21:22) Revenue capacity, token allocation for employees, Karpathy's autoresearch, agentic future
(31:24) Open source, global diffusion, Iran/Taiwan supply chain impact
(40:19) Self-driving platform, facing competition from active customers, responding to growth slowdown predictions
(48:06) Datacenters in space, AI healthcare, Robotics
(56:44) OpenAI/Anthropic revenue potential, how to build an AI moat
(59:38) Advice to young people on excelling in the AI era
 
(1:00) Acquiring Groq and the inference explosion

Thought this section was great - very interesting key bullet points:

Disaggregated Inference and Groq's Role
* AI data centers are turning into huge "computers" with masses of heterogeneous processors, connectivity and storage racks
* All serving disaggregated inference at the lowest possible cost per token
* Managed by a data-center (token factory) OS, Dynamo
* Groq plays a key role as agent usage grows, providing fast, low-latency decode tokens - Jensen advocates adding 25% Groq capacity to existing and future AI data centers to better support agents
* Fast TTFT/TPS tokens are more valuable than slower ones. Pareto curves of cost/power per token vs interactivity (TTFT, TPS) are critical to understanding NVIDIA's benefits and finding the best cost curves.
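The Pareto-curve point in the last bullet can be made concrete with a toy sketch: given a set of serving configurations, keep only those that are not beaten on both cost per token and tokens per second at once. All configuration names and numbers below are made up for illustration; real curves come from benchmarking actual hardware.

```python
def pareto_frontier(configs):
    """Keep configs not dominated by any other config.
    A config is dominated if another one has lower-or-equal cost
    AND higher-or-equal tokens/sec (i.e. it is strictly no worse)."""
    frontier = []
    for c in configs:
        dominated = any(
            o["cost_per_mtok"] <= c["cost_per_mtok"]
            and o["tps"] >= c["tps"]
            and o is not c
            for o in configs
        )
        if not dominated:
            frontier.append(c)
    return sorted(frontier, key=lambda c: c["tps"])

# Hypothetical serving configurations: bigger batches cut cost per token
# but slow down each individual user's stream.
configs = [
    {"name": "batch=256", "cost_per_mtok": 0.10, "tps": 20},
    {"name": "batch=64",  "cost_per_mtok": 0.25, "tps": 60},
    {"name": "batch=8",   "cost_per_mtok": 0.90, "tps": 150},
    {"name": "batch=32",  "cost_per_mtok": 0.60, "tps": 50},  # dominated by batch=64
]

for c in pareto_frontier(configs):
    print(c["name"], c["cost_per_mtok"], c["tps"])
```

Agents shift demand toward the high-interactivity end of this frontier, which is the argument for adding low-latency decode hardware alongside throughput-optimized racks.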

Three New Basic Flavors of Compute
* Data center training (and inference)
* Physical (physics-based AI)
* Edge AI

One caveat - Jensen says NVIDIA "invented" disaggregated inference, but I think the timeline was more complicated.
* 2023 - Researchers at Peking University and UCSD develop DistServe, the first disaggregated inference system
* Late 2024 - DeepSeek delivers fast commercial disaggregated inference that also uses multi-head latent attention
* March 2025 - NVIDIA rolls out Dynamo, a more general open-source disaggregation data center "OS" that works with the whole NVIDIA ecosystem and beyond.
 
Regarding OpenClaw, I saw it blow up on x.com where everyone is talking about it. It's a good first step toward Jarvis from Iron Man. We went from a simple chatbot to an agent.

We will need more chips if this next phase of AI pops off. A golden age for the semiconductor industry?
 
We will need more chips if this next phase of AI pops off. A golden age for the semiconductor industry?

If you listen to Jensen, Cerebras, and SambaNova, agents are going to demand a lot more low-latency, high-memory-bandwidth (decode) processors so agents can use models that "move beyond conversation-speed interaction toward speed-of-thought computing."
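A rough back-of-envelope shows why decode speed hinges on memory bandwidth: each generated token effectively streams the model weights through the memory system once, so bandwidth divided by weight bytes caps per-user tokens per second. The bandwidth and model-size numbers below are hypothetical round figures, not vendor specs.

```python
# Decode is typically memory-bandwidth bound: generating one token reads
# (roughly) all model weights once, so the ceiling is
#   tokens/sec per user ≈ memory bandwidth / bytes read per token.

def decode_tps(mem_bw_gbps, params_billions, bytes_per_param):
    bytes_per_token = params_billions * 1e9 * bytes_per_param
    return mem_bw_gbps * 1e9 / bytes_per_token

# A 70B-parameter model with 8-bit weights on a hypothetical 3 TB/s part:
print(round(decode_tps(3000, 70, 1)))  # ~43 tokens/sec upper bound
```

This is why decode-specialized chips chase extreme memory bandwidth rather than raw FLOPS: batching, speculative decoding, and KV cache reads move the real number around, but the bandwidth ceiling is the first-order constraint.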

 
While there is no question NV has a dominant position, and a lot of that is due to vision and execution, Jensen does have a tendency to rewrite parts of history. He is not alone. As they say, winners get to write history.

Would be exciting to see a viable ecosystem compete with NV at scale across the stack. Customized silicon for different applications might be the key to driving a lot more design work down the road.
 
Would be exciting to see a viable ecosystem compete with NV at scale across the stack. Customized silicon for different applications might be the key to driving a lot more design work down the road.
I see two interesting developments here.
* Even though NVIDIA is doing vertical development of rack-level data center AI products, they are also building out a hardware and software ecosystem to fill out their data center product offerings. On the hardware side, a bunch of storage companies, including Vast and HPE, are building KV mass storage products for KV cache tiering. And pretty much all the open source model serving frameworks, like llm-d, SGLang, vLLM, and LMCache, have been enhanced and optimized to leverage all the benefits of NVIDIA’s AI factory open source “operating system”, Dynamo. NVIDIA has really created an offering that is far different from the closed AI rack-level environments at Amazon, Google, etc.
* I’m wondering if/when we will see other rack-level AI data center hardware system providers who are leveraging disaggregated inference, like Amazon/Cerebras or some of the Chinese CSPs, adopt Dynamo, kind of like every server environment uses Linux today.

If that happened, it could usher in a new era of hardware diversity and interoperability for AI data centers.
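For anyone curious what the KV cache tiering mentioned above looks like mechanically, here is a minimal sketch: a small "HBM" tier holds hot KV blocks, and LRU evictions spill into a larger storage tier instead of being thrown away, so a later hit avoids recomputing prefill. Class names, tier sizes, and keys are all made up for illustration.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: hot tier (HBM) with LRU eviction,
    evicted blocks spill to a cold tier (e.g. NVMe / KV store)."""

    def __init__(self, hbm_capacity):
        self.hbm = OrderedDict()   # hot tier, maintained in LRU order
        self.storage = {}          # cold tier
        self.hbm_capacity = hbm_capacity

    def put(self, prefix_hash, kv_blocks):
        self.hbm[prefix_hash] = kv_blocks
        self.hbm.move_to_end(prefix_hash)
        while len(self.hbm) > self.hbm_capacity:
            victim, blocks = self.hbm.popitem(last=False)  # evict LRU entry
            self.storage[victim] = blocks                  # spill, don't drop

    def get(self, prefix_hash):
        if prefix_hash in self.hbm:                    # hot hit
            self.hbm.move_to_end(prefix_hash)
            return self.hbm[prefix_hash]
        if prefix_hash in self.storage:                # promote from cold tier
            self.put(prefix_hash, self.storage.pop(prefix_hash))
            return self.hbm[prefix_hash]
        return None                                    # miss: recompute prefill

cache = TieredKVCache(hbm_capacity=2)
cache.put("prompt-A", "kv-A")
cache.put("prompt-B", "kv-B")
cache.put("prompt-C", "kv-C")     # LRU eviction spills prompt-A to storage
print(cache.get("prompt-A"))      # kv-A, served via promotion from storage
```

The interesting interoperability question is exactly the one above: if the interface between the serving layer and this tiering layer is an open one (as with Dynamo), multiple hardware vendors can slot in at each tier.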
 