Analysis: AI will require $2T in annual revenue to support $500B in planned CapEx

Xebec

Well-known member
Note: I am not familiar with this firm, but I thought the analysis of future revenue shortfalls for AI companies, and the continued impact on the semiconductor supply chain, would be an interesting topic for this forum.

The article implies that significant breakthroughs in energy efficiency for AI will also be required to support currently projected demand and planned CapEx spending in this space.


Take-aways:
  • AI's computational needs are growing more than twice as fast as Moore's Law, pushing toward 100 gigawatts of new demand in the US by 2030.
  • Meeting this demand could require $500 billion in annual spending on new data centers.

The economics become unaffordable. Bain’s research suggests that building the data centers with the computing power needed to meet that anticipated demand would require about $500 billion of capital investment each year, a staggering sum that far exceeds any anticipated or imagined government subsidies. This suggests that the private sector would need to generate enough new revenue to fund the power upgrade. How much is that? Bain’s analysis of sustainable ratios of capex to revenue for cloud service providers suggests that $500 billion of annual capex corresponds to $2 trillion in annual revenue.

What could fund this $2 trillion every year? If companies shifted all of their on-premise IT budgets to cloud and also reinvested the savings anticipated from applying AI in sales, marketing, customer support, and R&D (estimated at about 20% of those budgets) into capital spending on new data centers, the amount would still fall $800 billion short of the revenue needed to fund the full investment (see Figure 2).
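The arithmetic behind those figures is easy to sanity-check. A rough sketch in Python, using only the numbers quoted above (the ~25% capex-to-revenue ratio and the ~$1.2 trillion of identified funding are implied by those numbers, not taken from Bain's underlying model):

[CODE=python]
# Rough sanity check of the Bain figures quoted above.
# All values are billions of USD per year.

annual_capex = 500          # projected annual data-center capex
required_revenue = 2_000    # revenue Bain says is needed to sustain that capex

# Implied sustainable capex-to-revenue ratio for cloud providers (~25%).
capex_to_revenue = annual_capex / required_revenue
print(f"Implied capex/revenue ratio: {capex_to_revenue:.0%}")

# Per the article: shifting on-prem IT budgets to cloud and reinvesting ~20% of
# sales/marketing/support/R&D savings still leaves an $800B annual shortfall.
shortfall = 800
identified_funding = required_revenue - shortfall
print(f"Funding identified by those sources: ${identified_funding}B of ${required_revenue}B needed")
[/CODE]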


 
I love how Bain just assumes 20% of the budget for personnel in sales, marketing, customer support, and R&D can be forked over as AI budget. Not only is the head count gone, but the savings--the whole point of AI--are entirely consumed in the bonfire of costs scaling up 4.5x per year. I guess I can make the easiest prediction ever--that won't happen.

This makes China appear to be on the right path, at least with DeepSeek providing a hardware-light way forward.
 
Agree re: DeepSeek

I also get the sense that ChatGPT (and probably others) is focused mainly on token cost and efficiency with recent releases.

This is only anecdotal, but I've seen a lot of different places sharing the same experience -- GPT-5 seems to be only slightly better than GPT-4 overall, but it also seems to vary a lot in 'smartness' from day to day.

I suspect OpenAI has more knobs to tune on the back end to keep GPT working "at speed" regardless of demand, i.e. with more demand it becomes "dumber" so it can respond just as quickly as when there is less demand.
 
Bain & Company is one of the top three management consulting firms. Its founder, Bill Bain, was a Boston Consulting Group (BCG) consultant assigned to Texas Instruments in the early 1970s, where he promoted the Learning Curve Theory.

At TI, Bill Bain's main point of contact was a young vice president, Morris Chang. Bill Bain left BCG in 1973 to start his own consulting firm, Bain & Company. Morris Chang applied the Learning Curve Theory at TI's semiconductor division and, eventually, at TSMC, the company he went on to found.
 
GPUs will get replaced with dedicated AI accelerators. Inference will be the first thing to go. You already see Google doing this, but the trend will accelerate.

It is normal for application-specific accelerators to be an order of magnitude more efficient than general-purpose hardware.
 
People have been saying this for some years now, and it seems to make obvious sense. But that hasn't stopped Nvidia's dominance. Or is that just on the training side? So why hasn't it happened? Is it simply the ecosystem (CUDA, etc.) constraints?
 
Good luck with that

I use multiple AI tools for research. There are ups and downs, but right now Grok does the best job. I do not pay, though; hopefully it stays free for personal use. I do see money to be made with company-specific LLMs. I know a bank that is using one for employee empowerment. Answers that took several minutes to find are now available in seconds. Some answers that could not be found at all are found in seconds. It is a serious productivity tool, and it helps fight fraud, which is running rampant in banking.

And when a customer is waiting in front of you, minutes seem like hours! And saying you will have to get back to them tomorrow with the answer will get you a bad survey. Apparently anything under 8 out of 10 is bad.
 

Analysis-OpenAI, under pressure to meet demand, widens scope of Stargate and eyes debt to finance chips


"We cannot fall behind in the need to put the infrastructure together to make this revolution happen," Altman said on Tuesday at a briefing with reporters, tech executives and politicians, including U.S. Senator Ted Cruz, and newly named Oracle co-CEO Clay Magouyrk. The briefing was held at a massive data center in Abilene, Texas, where OpenAI and its partners are rapidly building a data center.

Alibaba’s $53 Billion AI Blitz Ignites Stock Rally With Nvidia As the Secret Sauce

At the heart of the excitement lies Alibaba’s aggressive AI blueprint. The original $53 billion commitment targeted data centers, compute power, and model development to rival OpenAI and Baidu (NASDAQ:BIDU). Now, Wu vows to supersize it, funneling extra billions into global expansion: new data centers in Southeast Asia, Europe, and the U.S., plus a trillion-parameter Qwen3-Max language model that rivals GPT-4 in benchmarks. This isn’t pocket change, either. It’s a war chest to capture the $1 trillion AI market by 2030, according to McKinsey.

Trouble, trouble, boil and bubble..... :eek:
 
Datacenter GPUs are dedicated AI accelerators. They have tensor cores, HBM, and scale-up clustering interconnects. The Google (TPU) and AWS (Trainium and Inferentia) chips are no more specialized ASICs than Nvidia and AMD datacenter GPUs.
 
The AWS and Google AI chips are targeted at being more cost-effective than Nvidia and AMD chips, and they are. Part of the AWS and Google efficiency also results from owning the entire hardware and software stack, just like Nvidia does, but to get the best efficiency you end up with apps that are, to some degree, hardware-specific.

This is a comparison between AI chips from Nvidia, Google, and AWS that I bookmarked in the spring. It's very good, IMO:

 
The application-specific accelerators you're referring to, the ones that are often an order of magnitude more efficient, are those based on state-machine logic. Usually this strategy is used for specific algorithms and calculations; the most common examples in modern datacenters are encryption/decryption logic, the bitcoin mining algorithm, compression/decompression, some networking algorithms and protocols, and networking MACs. All of these examples are "little stuff", as a friend of mine used to say, and have very specific verification and testing parameters -- nothing as big as a generalized AI chip. State-machine logic is probably the biggest pain in the butt in modern chip design, and if you get it wrong you're looking at a relatively expensive and time-consuming base-layer stepping.
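For anyone who hasn't worked with this kind of logic, here is a minimal, purely illustrative sketch in Python (a toy packet framer I made up, not any real accelerator block) of why fixed-function state-machine hardware is both efficient and inflexible: the entire behavior is frozen into the states and transitions, so there is nothing to fetch or schedule at run time, and also nothing you can change after tape-out.

[CODE=python]
# Illustrative only: a toy fixed-function state machine, roughly how a small
# hardware block (e.g., a framing or decompression front end) is structured.
# The behavior lives entirely in the states/transitions -- fast and cheap in
# silicon, but any change to it means a new base-layer stepping.

IDLE, LENGTH, PAYLOAD = range(3)   # the three states of the toy framer
START_BYTE = 0x7E                  # frames look like [0x7E, length, payload...]

def frame_payloads(stream):
    """Yield payload bytes from a byte stream of [START_BYTE, len, payload...] frames."""
    state, remaining = IDLE, 0
    for byte in stream:
        if state == IDLE:
            if byte == START_BYTE:
                state = LENGTH
        elif state == LENGTH:
            remaining = byte
            state = PAYLOAD if byte else IDLE
        else:  # PAYLOAD
            yield byte
            remaining -= 1
            if remaining == 0:
                state = IDLE

# Two frames back to back: payloads AA BB and CC.
data = [0x7E, 2, 0xAA, 0xBB, 0x7E, 1, 0xCC]
print([hex(b) for b in frame_payloads(data)])   # ['0xaa', '0xbb', '0xcc']
[/CODE]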
 

FWIW - using Grok in the car while (supervised self-)driving is an interesting experience these days. I'm able to research some things (then verify when I get home), saving a lot of time, especially because you can ask questions while the thoughts are flowing rather than following up later when you might be in a different frame of mind.

Also for Grok - if you start running into limits, there's an unofficial tier -- buy the X / Twitter blue check and you get more usage. That's cheaper (or at least it was last time I looked) than paying for the first tier of access on Grok alone.
 

What's more likely than GPUs being replaced outright is that they will evolve into accelerators. One of the big problems with building a 10x/100x accelerator right now is that the models keep improving and the model-serving architecture for LLMs keeps changing and getting more optimized. It is very hard to create hyper-optimized hardware when the underlying model and model-serving structure keep morphing. And anybody who suggests that Blackwell or Rubin is "just a GPU" clearly isn't paying attention to improvements in representation/math, like FP4 and NVFP4, plus optimizations in the core transformer/attention engines and disaggregated inference. If people had built fixed hardware 4 years ago based on the then-current transformers and data representation, they would have delivered hardware missing out on these improvements.

And then there is the sharability problem: it might be possible to build something fast that serves a single user at a time, but that's not so cost-effective when one is trying to serve hundreds of users. Groq, Etched, and Cerebras are all AI accelerators of sorts. All tout super-fast token rates, but all do best when serving a single user at a time. Cerebras does reasonably well in multi-user mode as well, but I'm more skeptical of the other two.
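On the representation/math point, here's a minimal sketch in plain Python of block-scaled 4-bit quantization. It shows the general idea behind low-precision formats like FP4/NVFP4 (small integers plus a shared per-block scale), not the actual NVFP4 spec; the block size and scaling rule here are just illustrative choices.

[CODE=python]
# Illustrative sketch of block-scaled 4-bit quantization: the general idea
# behind low-precision formats like FP4/NVFP4, not the actual NVFP4 spec.
# Each block of weights is stored as small signed integers plus one shared
# scale, cutting memory and bandwidth roughly 4x versus FP16.

def quantize_block(weights, levels=7):
    """Map a block of floats to ints in [-7, 7] (fits in 4 bits) plus one scale."""
    scale = max(abs(w) for w in weights) / levels or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_block(q, scale):
    return [v * scale for v in q]

block = [0.12, -0.53, 0.07, 0.31, -0.18, 0.44, -0.02, 0.26]
q, s = quantize_block(block)
print(q)                                              # [2, -7, 1, 4, -2, 6, 0, 3]
print([round(v, 3) for v in dequantize_block(q, s)])  # close to the original weights
[/CODE]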
 