Microsoft CTO says he wants to swap most AMD and Nvidia GPUs for homemade chips

soAsian

Well-known member


Pivot will hinge on success of next-gen Maia accelerator

Thu 2 Oct 2025 // 20:41 UTC

Microsoft buys a lot of GPUs from both Nvidia and AMD. But moving forward, Redmond's leaders want to shift the majority of its AI workloads from GPUs to its own homegrown accelerators.

The software titan is rather late to the custom silicon party. While Amazon and Google have been building custom CPUs and AI accelerators for years, Microsoft only revealed its Maia AI accelerators in late 2023.

Driving the transition is a focus on performance per dollar, which for a hyperscale cloud provider is arguably the only metric that really matters. Speaking during a fireside chat moderated by CNBC on Wednesday, Microsoft CTO Kevin Scott said that up to this point, Nvidia has offered the best price-performance, but he's willing to entertain anything in order to meet demand.

Going forward, Scott suggested Microsoft hopes to use its homegrown chips for the majority of its datacenter workloads.

When asked, "Is the longer term idea to have mainly Microsoft silicon in the data center?" Scott responded, "Yeah, absolutely."

Later, he told CNBC, "It's about the entire system design. It's the networks and cooling, and you want to be able to have the freedom to make decisions that you need to make in order to really optimize your compute for the workload."

With its first in-house AI accelerator, the Maia 100, Microsoft was able to free up GPU capacity by shifting OpenAI's GPT-3.5 to its own silicon back in 2023. However, with just 800 teraFLOPS of BF16 performance, 64GB of HBM2e, and 1.8TB/s of memory bandwidth, the chip fell well short of competing GPUs from Nvidia and AMD.
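
For a concrete (if oversimplified) sense of the performance-per-dollar math Scott is alluding to, here is a minimal sketch in Python. The Maia 100 numbers are the ones quoted above; the competitor specs and both unit costs are placeholder assumptions for illustration, not figures from the article.

```python
# Rough BF16-throughput-per-dollar comparison, as a sketch of the
# "performance per dollar" metric discussed above.
# Maia 100 figures come from the article; the "competitor_gpu" specs and
# BOTH price tags are hypothetical placeholders, not reported numbers.

accelerators = {
    "maia_100": {
        "bf16_tflops": 800,       # per the article
        "hbm_gb": 64,             # per the article (HBM2e)
        "mem_bw_tbps": 1.8,       # per the article
        "unit_cost_usd": 10_000,  # hypothetical in-house cost
    },
    "competitor_gpu": {
        "bf16_tflops": 1000,      # assumed, roughly H100-class dense BF16
        "hbm_gb": 80,             # assumed
        "mem_bw_tbps": 3.35,      # assumed
        "unit_cost_usd": 30_000,  # hypothetical street price
    },
}

def tflops_per_dollar(spec: dict) -> float:
    """BF16 teraFLOPS per dollar of (assumed) hardware cost."""
    return spec["bf16_tflops"] / spec["unit_cost_usd"]

for name, spec in accelerators.items():
    print(f"{name}: {tflops_per_dollar(spec):.3f} BF16 TFLOPS per dollar")
```

The point of the metric is that an in-house part can trail badly on raw throughput and still come out ahead if its cost is low enough; a real comparison would also have to fold in power, networking, utilization, and software maturity.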

Microsoft is reportedly in the process of bringing a second-generation Maia accelerator to market next year that will no doubt offer more competitive compute, memory, and interconnect performance.

But while we may see a change in the mix of GPUs to AI ASICs in Microsoft data centers moving forward, they're unlikely to replace Nvidia and AMD's chips entirely.

Over the past few years, Google and Amazon have deployed tens of thousands of their TPUs and Trainium accelerators. While those chips have helped them secure some high-profile customer wins (Anthropic, for example), they are more often used to accelerate each company's own in-house workloads.

As such, we continue to see large-scale Nvidia and AMD GPU deployments on these cloud platforms, in part because customers still want them.

It should be noted that AI accelerators aren't the only custom chips Microsoft has been working on. Redmond also has its own CPU called Cobalt and a whole host of platform security silicon designed to accelerate cryptography and safeguard key exchanges across its vast datacenter domains. ®

 
Last edited by a moderator:
I think it's a worthwhile strategy, and Microsoft is trailing Google and Amazon in implementing it. If your real objective is best price-performance, not necessarily best performance, and your volumes are high enough to offset the costs, the results can be very profitable, especially for in-house applications. Meta just bought Rivos for exactly this reason.
 
He can dream 🤣 🤣
I think there are two very fundamental problems with their thinking per his statement in the article:

Later, he told CNBC, "It's about the entire system design. It's the networks and cooling, and you want to be able to have the freedom to make decisions that you need to make in order to really optimize your compute for the workload."

* Microsoft isn't in a position to build customized chips for all of the optimized elements the entire system needs - some parts, yes, but nowhere near what would likely be required.
* How much of their data center workload pool is actually theirs, versus OpenAI's or other customers'?

We'll see how things play out, but I'm skeptical about in-house chips for every hyperscaler / CSP, especially when they have huge pools of "legacy workloads" and data centers.
 
Azure is the #2 cloud computing company in the world. Its proprietary cloud-native applications include six database management systems (plus Oracle on Azure, which uses Azure storage), four distributed storage systems I'm aware of (including block, file, and multiple blob stores), and all of the Microsoft applications in cloud-native form: the entire Outlook suite, Teams, Office, and so on. Microsoft's internal cloud-native application environment is huge. I would guess millions of server CPUs are consumed running all of these applications, and the CPUs and their entire software stacks are completely invisible to the application users. Just like AWS. Just like Google Cloud, except Azure is bigger, and Microsoft apps are considered more enterprise-level than Google's.

Unlike Amazon and Google, Microsoft has resisted getting into the chip design business. When they wanted more control over, and offloading of, their network processing, they tried a custom FPGA-based superNIC to avoid chip design:


Microsoft likes to brag about the FPGA NIC, but then they acquired Fungible in 2023 for $190M. That looks like a course correction to me. Fungible designs its own NIC chip with MIPS cores and other supporting accelerators, so the FPGA approach obviously didn't cut it on cost (likely), performance (FPGAs are slow compared to ASICs), or power (FPGAs consume more power than ASICs, and they're expensive on top of that). Even so, the expensive FPGA NIC went into more than 1M servers.

IMO, Microsoft is just behind the curves of AWS and Google on cloud server chips, and needs in-house designs to compete on cost.

Even Oracle invests in its own cloud CPU designs; they just did it through Ampere. Larry Ellison arguably saw the value of in-house chip design earliest, when Oracle acquired Sun in 2009. That didn't work out as well as he anticipated, but Ellison is often ahead of the curve, and that entails more risk.

I think bringing cloud server chip design in-house for CPUs and networking is probably the biggest threat to Intel's product groups' gross margin, and one big reason why I'm a fan of Intel getting its foundry act together. In foundry they're really one of two, maybe one of three if you count Samsung, which looks like a stretch. I think growth in the highly profitable merchant server chip market will be slower in the future, unless you're Nvidia. Very likely considerably slower. There's a loud sucking sound from enterprise data centers to the cloud.
 
Last edited:

I think you nailed it. Microsoft's in-house workloads are historically CPU-centric workloads, not AI, which requires an entire rethink of what a data center is. They, like Oracle, Nebius, and CoreWeave, are also building completely new AI data centers. So chips and optimized systems to accelerate Microsoft workloads really threaten Intel/AMD more than NVIDIA. AI data centers look more like this:




 
Last edited:
I suspect Nvidia GPUs and NVLink are the solution for the new datacenter because Azure's highly touted AI chip, Maia, isn't ready yet, or its software stack isn't. Whatever the reason, that new datacenter in your link is all Nvidia. So I wonder how Maia fits into the AI strategy? Only for Azure applications, but not for world domination? More than a bit confusing to me.
 