
Jensen Huang – Will Nvidia’s moat persist? Podcast with Dwarkesh Patel

user nl

Well-known member
712,805 views Apr 15, 2026 Dwarkesh Podcast

I asked Jensen about TPU competition, Nvidia’s lock on the ever more bottlenecked supply chain needed to make advanced chips, whether we should be selling AI chips to China, why Nvidia doesn’t just become a hyperscaler, how it makes its investments, and much more. Enjoy!

𝐄𝐏𝐈𝐒𝐎𝐃𝐄 𝐋𝐈𝐍𝐊𝐒
𝐒𝐏𝐎𝐍𝐒𝐎𝐑𝐒
  • Crusoe's cloud runs on state-of-the-art Blackwell GPUs, with Vera Rubin deployment scheduled for later this year. But hardware is only part of the story—for inference, Crusoe's MemoryAlloy tech implements a cluster-wide KV cache, delivering up to 10x faster TTFT and 5x better throughput than vLLM. Learn more at https://crusoe.ai/dwarkesh
  • Cursor helped me build an AI co-researcher over the course of a weekend. Now I have an AI agent that I can collaborate with in Google Docs via inline comment threads! And while other agentic coding tools feel like a total black-box, Cursor let me stay on top of the full implementation. You can try my co-researcher out at https://github.com/dwarkeshsp/ai_cowo..., or get started on your own Cursor project today at https://cursor.com/dwarkesh
  • Jane Street spent ~20,000 GPU hours training backdoors into 3 different language models, then challenged my audience to find the triggers. They received some clever solutions—like comparing the base and fine-tuned versions and extrapolating any differences to reveal the hidden backdoor—but no one was able to solve all 3. So if open problems like this excite you, Jane Street is hiring. Learn more at https://janestreet.com/dwarkesh
To sponsor a future episode, visit https://dwarkesh.com/advertise.

𝐓𝐈𝐌𝐄𝐒𝐓𝐀𝐌𝐏𝐒
00:00:00 – Is Nvidia’s biggest moat its grip on scarce supply chains?
00:16:25 – Will TPUs break Nvidia’s hold on AI compute?
00:41:06 – Why doesn’t Nvidia become a hyperscaler?
00:57:36 – Should we be selling AI chips to China?
01:35:06 – Why doesn’t Nvidia make multiple different chip architectures?

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Nvidia CEO Jensen Huang clarified in an April 2026 interview with podcast host Dwarkesh Patel that the company allocates GPUs on a first-come, first-served principle rather than a highest-bidder-wins approach.

Huang explained that Nvidia prioritizes GPU distribution by evaluating customers' demand forecasts and purchase orders (POs), then considers whether data center infrastructure is ready before allocating units according to order timing. Customers without completed infrastructure may receive lower priority to maximize overall production efficiency.
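The allocation logic Huang describes can be sketched in a few lines of Python. This is a minimal illustration, not Nvidia's actual process; the `PurchaseOrder` fields and the `allocate` function are assumptions made up for the example:

```python
from dataclasses import dataclass

@dataclass
class PurchaseOrder:
    customer: str
    order_time: int   # e.g. day the PO was placed (earlier = higher priority)
    units: int        # GPUs requested
    dc_ready: bool    # is the customer's data center infrastructure ready?

def allocate(pos, supply):
    """First-come, first-served allocation that deprioritizes customers
    whose infrastructure isn't ready yet (per Huang's description)."""
    # Ready customers first; within each group, order by PO timing.
    queue = sorted(pos, key=lambda p: (not p.dc_ready, p.order_time))
    allocation = {}
    for po in queue:
        if supply <= 0:
            break
        granted = min(po.units, supply)
        allocation[po.customer] = granted
        supply -= granted
    return allocation

orders = [
    PurchaseOrder("A", order_time=10, units=500, dc_ready=True),
    PurchaseOrder("B", order_time=5,  units=800, dc_ready=False),
    PurchaseOrder("C", order_time=12, units=400, dc_ready=True),
]
result = allocate(orders, supply=1000)
```

Note that B ordered earliest but lands last in the queue because its data center isn't ready, so it only receives the leftover units, which matches the "lower priority to maximize overall production efficiency" point above.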

Addressing rumors of price-based allocation, Huang firmly denied any such practice, stating Nvidia's quoted prices are final and do not increase due to rising demand.

He stated that the company aims to be a reliable foundational supplier for the industry, capable of fulfilling massive AI infrastructure orders with stable commitments.
 
The back-and-forth regarding selling chips to China has generated a lot of discussion. I think Dwarkesh, who I believe is 25 years old, did a great job of calmly challenging Jensen's spin.
 
My take on the current moats:
* Gaming / Professional Graphics - limited chip capacity - AMD would rather spend limited allocation on AI and x86, Intel on-chip still no substitute for discrete hardware.
* HPC/other scientific, etc. - CUDA, associated libraries and ecosystem
* AI Training - market growing more slowly than inference, entrenched market share.
* AI Inference - rack/pod-level co-optimization of GPU/LPU/memory/storage/interconnect/rack HW/models/software stack and associated open source ecosystem, plus supply chain

Would love to hear other people's thoughts.
 

Professional - generally agreed, though I think this moat is slowly drying up for Nvidia. x86 iGPUs continue to grow in capability, and Apple's unified memory architecture is pulling a number of traditional professional-graphics users over to Mac.

For Gaming - I don't see this as a real moat anymore. The total # of discrete GPUs sold has been steadily decreasing for a long time, and there are far more people playing mobile games than PC games. Nvidia has only Nintendo as a console customer, while AMD has Sony, Microsoft, the SteamDeck, and various other handhelds. Both Intel and AMD also have pretty strong iGPUs now that compete up to the 50 and even 60 series Nvidia GPUs in some cases.

HPC/Scientific - agreed fully.

AI Training and Inference - My gut tells me the moat here is only short-term. The Chinese are building their own entire stack, and hyperscale cloud providers and even some smaller cloud providers are building their own full-stack solutions - Amazon and Tesla/SpaceX, for example. Nvidia will probably remain the gold standard for a long time, but I don't think they'll have a monopoly-style moat in 2030.
 
A lot will depend on cost per token vs interactivity profile of different suppliers plus effectiveness of agents. If NVIDIA gets to a place where they offer the best Pareto curves for all of the leading models, even with their margins, thanks to economies of scale (5-10 different specialized custom chip/systems per generation at a 1 year cadence, leveraging the most efficient rack & interconnect, plus the most efficient model/software stack), they become like TSMC. They may already be there - Nobody is showing better Pareto curves in data center scale benchmarks.

Amazon's Pareto frontier with Trainium 3 doesn't come close (big latency issues, hence Cerebras), and their interconnect has been too focused on standard CPU connectivity. Google TPU is probably closer, especially with their split this next generation between 8t and 8i. We'll see if Tesla/SpaceX ever becomes serious, but AI6 isn't anywhere near a solution for the data center. Who knows with China - the usual economics don't apply.
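The Pareto-curve framing above can be made concrete with a small sketch: given suppliers scored on cost per million tokens (lower is better) and tokens/sec per user (higher is better), keep only the non-dominated ones. The supplier names and numbers below are made up for illustration, not real benchmark data:

```python
def pareto_frontier(systems):
    """Return the names of systems that are not dominated on
    (cost per Mtoken: lower better, tokens/sec per user: higher better).
    A system is dominated if some other system is at least as cheap AND
    at least as interactive, and strictly better on one of the two axes."""
    frontier = []
    for name, cost, tps in systems:
        dominated = any(
            c <= cost and t >= tps and (c < cost or t > tps)
            for n, c, t in systems if n != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

# Illustrative, hypothetical numbers: (name, $/Mtoken, tokens/sec per user)
systems = [
    ("vendor_A_rack", 2.0, 250),   # pricier but very interactive
    ("vendor_B_rack", 1.5, 120),   # cheapest, slower
    ("vendor_C_rack", 3.0, 200),   # dominated by vendor_A on both axes
]
frontier = pareto_frontier(systems)
```

The "become like TSMC" scenario in the post is the case where one supplier's points dominate everyone else's across the whole curve, so every trade-off a buyer might want still lands on that supplier.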
 

The other curveball is that what's needed for peak AI inference performance keeps changing. FP vs. integer, precision levels, and even techniques for 'thinking' and MoE have somewhat different compute requirements. Nvidia definitely has a leg up, but like any technology, there are only so many novel techniques before diminishing returns. The same fate as CPUs (whose exponential growth story was overtaken by GPUs) could befall Nvidia as AI's needs crystallize.
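To see why the FP-vs-integer question matters for hardware, here is a toy sketch of symmetric int8 quantization, the kind of low-precision trick inference stacks lean on: values that land on a quantization step survive the round trip, values between steps pick up error bounded by half a step, and values outside the representable range get clamped. This is a generic textbook scheme, not any specific vendor's implementation:

```python
def quantize_int8(x, scale):
    """Symmetric int8 quantization: round to the nearest step of
    size `scale`, then clamp to the int8 range [-127, 127]."""
    q = round(x / scale)
    return max(-127, min(127, q))

def dequantize(q, scale):
    """Map the int8 code back to a real value."""
    return q * scale

scale = 0.01
vals = [0.52, 0.523, -1.3]  # on-step, between steps, out of range
roundtrip = [dequantize(quantize_int8(v, scale), scale) for v in vals]
```

Here 0.52 survives exactly, 0.523 rounds to 0.52 (error under scale/2), and -1.3 clamps to -1.27 (the most negative representable value at this scale). How much of this precision loss a given model tolerates is exactly the moving target the paragraph above describes.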

Another point on Amazon, the Chinese, and SpaceX/Tesla is that they're chipping away at Nvidia's TAM by making their own products. That reduces Nvidia's economies of scale, and when combined with the above, the moat begins to dry.
 