What will be the cost-effective lifespan of AI chips?

Arthur Hanson

Well-known member
Any thoughts or comments on this are appreciated. Will these chips/boards be recycled for lower-end uses? Will there be an active market for chips as they fall from the leading edge? Will they be rendered uneconomic for other purposes? Will their effective lifespan shrink faster or slower based on reuse?
 
The unit of replacement for servers is typically a rack chassis.

The lifespan of conventional servers in large data centers is typically 4-6 years. AI servers will probably have longer service lives, because they are much more expensive: higher-capacity power supplies, often costly liquid cooling systems, and usually more expensive networking. I find it difficult to believe these very expensive AI servers aren't going to be repurposed and amortized over a longer period than conventional air-cooled rack servers with general-purpose CPUs.

If this theory is correct, I think it also extends the potential for Nvidia's architectural market-share leadership, and makes it more difficult for companies with alternative AI processor architectures to enter the market. The differences between AI processors and their software stacks are far greater than the differences between general-purpose CPUs.
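As a back-of-the-envelope illustration of the amortization argument above, here is a minimal Python sketch. Every price and lifespan in it is a hypothetical placeholder, not a figure from this thread:

```python
def annualized_cost(purchase_price, service_life_years, annual_opex):
    """Straight-line depreciation plus yearly operating cost."""
    return purchase_price / service_life_years + annual_opex

# Hypothetical conventional 2U server: $25K, 5-year life, $3K/yr to run.
conventional = annualized_cost(25_000, 5, 3_000)

# Hypothetical liquid-cooled AI server: $300K, $30K/yr to run,
# amortized over 5 years vs. a stretched 8-year service life.
ai_5yr = annualized_cost(300_000, 5, 30_000)
ai_8yr = annualized_cost(300_000, 8, 30_000)

print(f"conventional:           ${conventional:>8,.0f}/yr")
print(f"AI server, 5-year life: ${ai_5yr:>8,.0f}/yr")
print(f"AI server, 8-year life: ${ai_8yr:>8,.0f}/yr")  # ~25% lower annualized cost
```

Stretching the assumed service life from 5 to 8 years cuts the annualized cost from $90K to $67.5K in this toy model, which is why repurposing expensive AI servers is so attractive.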
 
Thanks to one of my lurker friends here for pointing out that with liquid-cooled racks, replacing an individual server in a rack is a much more involved and specialized procedure than in air-cooled racks. He thinks that might make an entire rack the unit of replacement and potential refurbishment. If he's right (he usually is about data center deployment issues), the big cloud companies might be able to handle that sort of thing internally, but for conventional enterprise data centers it sounds like a big future customer services business for companies like Dell and SMC. I think that smell I'm sniffing is old-style mainframe-class service agreements coming back.
 
Thanks for the valuable insights. I come from the distant past when IBM was the safe choice; they used to own a large piece of the market at the time. It shows even a titan can fall behind.
 
99% of the "outsourced computing" industry ("cloud", datacentre rental, storage, CDN, hosting, etc.) is plain web hosting. Perhaps 1% of the remaining 1% is actual computing. It's very rare that a long-term computation consumer can rent CPU time more cheaply than he can buy it.
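That rent-versus-buy claim is really a utilization break-even. A minimal sketch, with every rate a made-up assumption rather than a real cloud price:

```python
# Hypothetical numbers: a $40K server with a 5-year life, $0.50/hr to
# operate when running, vs. renting equivalent capacity at $3.00/hr.
server_price = 40_000
service_life_hours = 5 * 8760
owned_opex_per_hour = 0.50
cloud_rate_per_hour = 3.00

# If you own, capex is spread only over the hours you actually use:
#   cost_per_used_hour(u) = server_price / (service_life_hours * u) + opex
# Setting that equal to the rental rate and solving for utilization u:
breakeven_u = server_price / (service_life_hours * (cloud_rate_per_hour - owned_opex_per_hour))
print(f"break-even at ~{breakeven_u:.0%} utilization")  # ~37% here
```

Under these assumptions, anyone keeping the machine busier than about 37% of the time buys more cheaply than they rent, which is the long-term-consumer point above.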
 
Where did you get these statistics?
 
I think it's self-evident that the vast majority of the server market is for the most trivial tasks, and that real HPC is extremely tiny.
This response is silly; a few internet searches will show that AWS, Azure, and Google all have numerous customers running HPC applications in the cloud. HPC systems have historically been greater than 10% of the revenue of the entire server market. I don't know what you mean by "real HPC"... government lab supercomputers? HPC applications are used by pharma companies, oil and gas extraction companies, chemical companies, meteorologists, genomics companies, and CAD teams designing things like airplanes and weapons systems, to name a few. The well-known TOP500 supercomputer list represents many millions of CPU chips, memory chips, and storage devices. That list alone is not "extremely tiny", and the TOP500 is only a fraction of the entire supercomputing market.

As for servers being used for mostly trivial tasks, why do you think database applications in Oracle Cloud, AWS (which has the widest selection of databases I'm aware of), Snowflake, Teradata Cloud, and Google Cloud are so popular? Including in-memory databases? What kinds of applications do you think such a large universe of expensive database systems are used for? And then there are NFS systems. Are cloud-based email services like Google Mail and Outlook, supporting hundreds of millions of users, trivial?

I won't even start with the worldwide market for streaming services, and how complex those applications are. I could go on and on, but I'm getting tired of responding to a silly notion.

The only thing that I think is self-evident in your post is that you don't know what you're talking about.
 
The well-known TOP500 supercomputer list represents many millions of CPU chips, memory chips, and storage devices.

Yes, many millions; now count how many million chips AMD and Intel sell to the general-purpose server market every month.
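To put rough numbers on that comparison, here is a sketch using the roughly four million Xeons per quarter cited later in this thread, plus a hedged assumption that the TOP500 adds up to a few million CPU chips:

```python
# Assumption: the TOP500 installed base totals ~3M CPU chips ("many
# millions" per the earlier post; the exact figure is not in this thread).
top500_cpu_chips = 3_000_000

# Intel alone shipped ~4M Xeons in 3Q24 (Mercury Research data via
# NextPlatform, cited later in this thread), i.e. ~1.3M units per month.
xeons_per_month = 4_000_000 / 3

months_equivalent = top500_cpu_chips / xeons_per_month
print(f"TOP500 installed base ~ {months_equivalent:.1f} months of Intel Xeon shipments")
```

On these assumptions the entire TOP500 installed base equals only a couple of months of Intel's server CPU shipments: large in absolute terms, small against the general-purpose market, which is what the two posters are arguing past each other about.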
 
The unit of replacement for servers is typically a rack chassis.

The lifespan of conventional servers in large data centers is typically 4-6 years.

Peanut Gallery here -- agree with "typically", though occasionally a super disruptor comes along. Nehalem was good enough that a few corporate datacenters I knew of used it to replace two-year-old equipment.

Though I think there's a small chance Nvidia or someone else could release similarly disruptive hardware in the future, it's less likely to trigger mass replacement or retirement, for the reasons you outlined. I'd also add that higher-end AI equipment has been getting pricier, not cheaper, unlike some periods in the x86 datacenter past.
 
Peanut Gallery here -- agree with "typically", though occasionally a super disruptor comes along. Nehalem was good enough that a few corporate datacenters I knew of used it to replace two-year-old equipment.
Averages are like that. ;) There are always outliers.
Though I think there's a small chance Nvidia or someone else could release similarly disruptive hardware in the future, it's less likely to trigger mass replacement or retirement, for the reasons you outlined. I'd also add that higher-end AI equipment has been getting pricier, not cheaper, unlike some periods in the x86 datacenter past.
Agree, that's why I'm skeptical of estimated lifetimes for liquid-cooled GPU servers.
 
Yes, many millions; now count how many million chips AMD and Intel sell to the general-purpose server market every month.
Isn't it great when someone who doesn't know what they're talking about doubles down on their fallacies?

I suspect you're thinking about client CPU volumes, which are hundreds of millions per year.
 
And there are probably 100,000-150,000 Arm server CPUs deployed per month from Ampere, AWS Gravitons, and whatever Azure is deploying, but these are very rough guesses.
Unfortunately, I think I've overestimated data center Arm CPU volumes, since Ampere's 2024 revenue was only $16.4 million.


I'd read that AWS has deployed over two million Gravitons, so perhaps I've underestimated their unit volume, but 100K units per month still sounds high. I was also thinking Oracle had deployed more Ampere CPUs than is apparently the reality.

SoftBank seems to be paying a lot for a very small customer base.
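A quick sanity check of the revised estimate, using the "over two million Gravitons" figure above and an assumed deployment window (Graviton2 ramped from roughly 2020):

```python
# Figures: 2M+ Gravitons deployed (from the post above); the 5-year
# deployment window is an assumption, not a number from this thread.
gravitons_deployed = 2_000_000
deployment_months = 5 * 12

avg_per_month = gravitons_deployed / deployment_months
print(f"AWS alone averages ~{avg_per_month:,.0f} Gravitons/month")  # ~33K/month
```

Roughly 33K units per month for AWS alone does suggest the original 100K-150K/month guess for all Arm vendors combined was on the high side, consistent with the correction above.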
 
Intel sold about 800K Xeons in 2024
I don't have access to the raw data from Mercury Research, but NextPlatform published a chart in November 2024, based on Mercury Research data, showing Intel shipped about four million Xeons in 3Q24, or about 1.3M units per month.


The x86 server CPU market still looks lucrative. The biggest threat to x86 server volumes still looks like custom Arm CPUs designed for internal use by the big cloud companies. And that's where most of the growth is too, so I continue to be pessimistic about x86 server CPU volumes in the long run.
 
The only way to undercut them is to sell a better CPU at a lower price than a hyperscaler can build one for itself.

Thanks for the data. I remember reading 800K somewhere; maybe I was wrong.
 
The only way to undercut them is to sell a better CPU at a lower price than a hyperscaler can build one for itself.
I don't think that's possible. A merchant CPU product will always include a broader feature set to attract more customers, which adds to R&D cost and die area. Figuring out the right feature set takes expensive product managers, and then there's marketing and sales. The cloud companies get their requirements straight from their own system architects and software engineers: no product managers, no marketing, no sales. I came to the conclusion a long time ago that merchant CPU vendors can't compete with in-house CPU designs, so long as the in-house design generates enough unit volume to get reasonably low per-unit fabrication, packaging, and testing prices.
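A toy model of that volume argument follows. All dollar figures are invented to show the shape of the trade-off, not real design or fab costs:

```python
# Hypothetical one-time cost to design a custom server CPU in-house,
# vs. buying a merchant part. None of these figures are real.
inhouse_design_cost = 300_000_000   # R&D, masks, validation
inhouse_unit_cost = 900             # per-chip fab + package + test
merchant_price = 2_500              # merchant CPU price per chip

# In-house wins once amortized design cost per unit drops below the
# price gap: design_cost / volume + unit_cost < merchant_price
breakeven_volume = inhouse_design_cost / (merchant_price - inhouse_unit_cost)
print(f"in-house pays off above ~{breakeven_volume:,.0f} units")  # ~188K units
```

At cloud scale, hundreds of thousands of sockets a year, an in-house design clears that bar easily, which is the conclusion above.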
Thanks for the data. I remember reading 800K somewhere; maybe I was wrong.
I'd bet a dollar or two that you were looking at this chart and misinterpreted the 0.8 value:

(attached chart: server CPU shipments, Mercury Research data via NextPlatform)
 
But in the end the cloud companies will design based on what their customers need, and I think custom x86 chips will become more popular with hyperscalers.
You are right about the image; I mistook that value as 800K.

This is true for AMD, but for Intel I don't understand: they have fabs to fill, so they could either ask hyperscalers to buy x86 CPUs or offer to fabricate the hyperscalers' custom designs on Intel's process.
 