
The Great AI Silicon Shortage

user nl

Well-known member

The Compute Shortage

Token demand is skyrocketing, and the need for AI compute continues to accelerate. Improving model capabilities, combined with the rapid emergence of agentic workflows, have driven a surge in user adoption and aggregate token demand. Anthropic added a staggering $6B of ARR in the month of February alone, driven by broad adoption of its agentic coding platform, Claude Code; if Anthropic had more compute, it would have added more. Despite a huge AI infrastructure buildout over the past few years, available compute is scarce. On-demand GPU prices continue to rise, even for Hopper GPUs, which are almost two generations old.

From our own experience, we have reached out to every neocloud we know asking whether they have small clusters available, but everything is already firmly locked up. This tight supply environment explains the sharp reset in hyperscaler capex plans. Consensus estimates have moved materially higher across the board, with Google standing out as the most extreme example: its 2026 capex expectations have roughly doubled versus prior estimates, primarily driven by datacenter and server spend.

 
I find the two charts on TSMC N3 wafer capacity and N3 wafers shipped curious: they show N3 wafer capacity decreasing QoQ from Q4 2024 through Q3 2025, yet N3 wafers shipped somehow increases every quarter.


That aside, I found it insightful that switching DRAM output from conventional memory to HBM has a 3:1 to 4:1 impact on wafers used per bit.
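That 3:1/4:1 ratio implies a sharp wafer-demand multiplier as bit demand shifts to HBM. A back-of-envelope sketch: the penalty ratios come from the charts discussed above, while the HBM bit-share values are hypothetical inputs for illustration.

```python
# Illustrative only: the 3:1 and 4:1 penalties are from the thread;
# the HBM bit-shares are assumed values, not data from the article.
def wafer_demand(hbm_share: float, penalty: float) -> float:
    """Total wafer demand, normalized so all-conventional-DRAM = 1.0."""
    return (1 - hbm_share) * 1.0 + hbm_share * penalty

for penalty in (3.0, 4.0):
    for share in (0.1, 0.2):
        d = wafer_demand(share, penalty)
        print(f"{share:.0%} of bits on HBM at {penalty:.0f}:1 -> {d:.2f}x wafers")
```

Even a modest shift of bit demand to HBM inflates total wafer consumption well before HBM dominates the mix.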
 
The capex forecast tells it all. That is why we have a shortage.

Reminder... HBM isn't the only thing in short supply: DDR5 prices and margins have exploded. Most of the memory companies' revenue increase is coming from DDR5 price increases, not just HBM.
 
There is also today a very long YouTube interview (Dwarkesh Podcast) with the report's main author, Dylan Patel, on this topic.
Dylan Patel seems to be suggesting that the availability of ASML EUV tools is becoming the next bottleneck for the AI buildout (starts around 34 minutes):


Mar 13, 2026 Dwarkesh Podcast

Dylan Patel, founder of SemiAnalysis, provides a deep dive into the 3 big bottlenecks to scaling AI compute: logic, memory, and power. And walks through the economics of labs, hyperscalers, foundries, and fab equipment manufacturers. Learned a ton about every single level of the stack. Enjoy!

 
Here is a 500 word AI summary:

The Global Race for AI Compute: Infrastructure, Semiconductors, and the Bottlenecks Ahead (≈500 words)

The rapid expansion of artificial intelligence is driving an unprecedented surge in global demand for computing infrastructure. As discussed in a recent conversation between Dwarkesh Patel and Dylan Patel, the scale of investment in AI infrastructure has reached historic levels. Major technology companies—including Amazon, Meta, Google, and Microsoft—are projected to spend hundreds of billions of dollars on capital expenditures related to AI data centers, chips, and power infrastructure. These investments highlight a central reality of the modern AI race: the primary constraint is no longer software innovation alone, but the physical infrastructure required to run large-scale AI systems.

One of the key insights from the discussion is that AI compute capacity scales on timelines much longer than software development cycles. Large technology firms are not simply purchasing servers or GPUs for immediate use. Instead, a significant portion of their capital expenditures is allocated toward long-term infrastructure projects, such as building data centers, securing power generation capacity, and pre-ordering semiconductor manufacturing capacity years in advance. For example, companies often place deposits on gas turbines or long-term power purchasing agreements several years before the corresponding compute infrastructure becomes operational. As a result, the massive spending figures seen today reflect investments that will come online gradually throughout the decade.

The compute demands of leading AI laboratories further illustrate the scale of the challenge. Companies such as OpenAI and Anthropic already operate clusters measured in gigawatts of power consumption. A single gigawatt-scale AI data center can require tens of billions of dollars in infrastructure and hardware investment. As AI models grow larger and more widely deployed, these labs must continuously expand their compute capacity not only to train new models but also to serve inference workloads for millions of users. Consequently, much of the capital raised by AI labs is dedicated to securing long-term compute access rather than immediate operational costs.
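The "tens of billions of dollars" per gigawatt claim is easy to sanity-check. In the sketch below, both the per-accelerator power draw and the all-in cost per accelerator are assumed round numbers, not figures from the podcast.

```python
# Rough sanity check on "a gigawatt-scale AI data center can require
# tens of billions of dollars". All inputs are assumptions.
watts_per_accelerator = 1_400        # chip + server + cooling/network overhead
cost_per_accelerator_usd = 50_000    # hardware plus its share of the buildout

accelerators = 1e9 / watts_per_accelerator           # accelerators per GW
total_usd = accelerators * cost_per_accelerator_usd  # all-in capex per GW
print(f"~{accelerators:,.0f} accelerators, ~${total_usd / 1e9:.0f}B per GW")
```

Under these assumptions a single gigawatt absorbs on the order of 700K accelerators and $30B+ of capex, consistent with the scale described in the summary.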

However, expanding compute infrastructure is constrained by several bottlenecks across the semiconductor supply chain. The most critical components include advanced logic chips, high-bandwidth memory (HBM), and the manufacturing equipment used to produce them. Companies such as Nvidia dominate the market for AI accelerators, while the fabrication of advanced chips is concentrated at manufacturers like TSMC. Even further upstream, the production of lithography equipment by ASML ultimately determines the maximum number of advanced chips that can be produced globally.

The supply chain complexity is immense. For instance, producing a gigawatt of cutting-edge AI chips requires tens of thousands of advanced semiconductor wafers and millions of lithography process steps. Each step depends on specialized equipment with long production lead times, making rapid scaling extremely difficult. As a result, even if data centers and power generation can expand quickly, the semiconductor manufacturing ecosystem may still limit overall compute growth.
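The "tens of thousands of advanced semiconductor wafers" per gigawatt figure can likewise be sanity-checked. The chip power and good-dies-per-wafer values below are assumptions for illustration, not numbers from the podcast.

```python
# Order-of-magnitude check on "tens of thousands of wafers per GW".
# Every input here is an assumed round number.
watts_per_chip = 1_000       # accelerator power draw, incl. overhead
good_chips_per_wafer = 30    # reticle-sized dies per 300mm wafer after yield loss

chips_for_1gw = 1e9 / watts_per_chip
wafers = chips_for_1gw / good_chips_per_wafer
print(f"~{chips_for_1gw:,.0f} chips -> ~{wafers:,.0f} wafers per GW")
```

With roughly a million chips per gigawatt and a few dozen good dies per wafer, the wafer count lands in the tens of thousands, matching the text's claim.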

In addition to chip manufacturing, memory production has emerged as another major constraint. High-bandwidth memory, which enables AI accelerators to process massive datasets efficiently, is significantly more resource-intensive to manufacture than conventional memory. As AI demand rises, memory manufacturers are redirecting production capacity away from consumer electronics and toward AI hardware, potentially increasing prices for devices such as smartphones and laptops.

Ultimately, the expansion of AI infrastructure depends on a complex interplay between technological innovation, supply chain capacity, and global economic investment. While software breakthroughs remain essential, the next phase of AI development will increasingly be determined by the ability to scale physical infrastructure—from semiconductor fabs to power generation—to support the immense computational demands of advanced AI systems.
 

Nice general AI-summary of the 2-hour podcast. Still, I found it very interesting to listen to all the reasoning, arguments and details Dylan Patel gave (and sometimes not gave, as you have to pay for them!) about all the supply line issues and manufacturing issues in the semi Foundry world, in relation to scaling AI. Dylan seems to love numbers!

One of them is that Apple seems to be becoming a much less important customer for TSMC relative to the AI/HPC customers, and with that diminished position it may get fewer privileges with TSMC regarding capacity. The phone market seems to be preparing for a significant reduction in units shipped over the coming two years, especially in the low and medium segments. And the price of the iPhone will go up.

Also interesting is how he discusses Elon's plans for building his own fabs at "1 million wspm", and that Patel doesn't believe AI datacentres will move to the sky in the foreseeable future (the next decade).

Let's see when the new fab shells are ready in 2028/2029, whether ASML can/will ship some 100 EUV tools (or more) around 2030, and whether China will have its first EUV alpha tool by 2030. Interesting times and predictions.

 

Let me ask you this: how do you view Dylan Patel's credibility as a semiconductor industry expert?
 