The question, to me, is whether any start-up chip designer has succeeded in the AI market without partnering with a major player (a cloud computing company or a top-tier chip maker). I can't think of one. A couple of my friends point to Cerebras; they are a remarkable technical success, but revenue-wise they are still well under $500M, and losing money.
I would put Cerebras in the path-to-success category, but they may still need a major partner tie-in, like Amazon, to go the distance. Other than that, I think you are right, with two caveats:
* There's likely still opportunity on the client side. It's not clear that Apple, Intel, and AMD have gotten their XPUs right for real-world client-side applications, and there are more specialized applications, like autonomous driving, where chips optimized for something completely different (à la AI5, AI6) are required.
* China - who knows what will emerge as the home-grown data-center inference winner there. It probably still ties in with Huawei or a hyperscaler/CSP like Alibaba. I do find it interesting that Alibaba has been making major contributions, via Mooncake, to Dynamo, NVIDIA's open-source "OS" for racks/pods.
This article shows how SGLang RBG + Mooncake enable production-grade, cloud-native LLM inference with PD (prefill/decode) disaggregation.
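For readers unfamiliar with PD disaggregation, here is a toy sketch of the pattern, in plain Python. This is not the real SGLang or Mooncake API; all class and function names here are hypothetical, and the "KV cache" is just a dict. The point it illustrates: the compute-bound prefill stage and the bandwidth-bound decode stage run on separate workers that can be scaled independently, with only the KV cache handed between them.

```python
from dataclasses import dataclass, field

@dataclass
class KVCache:
    """Stand-in for attention key/value tensors, keyed by request id."""
    entries: dict = field(default_factory=dict)

class PrefillWorker:
    """Compute-bound stage: one pass over the whole prompt."""
    def run(self, request_id: str, prompt: list, store: KVCache) -> None:
        # Real systems write per-layer K/V tensors; we just store tokens.
        store.entries[request_id] = list(prompt)

class DecodeWorker:
    """Bandwidth-bound stage: one token per step, reusing the cache."""
    def step(self, request_id: str, store: KVCache) -> str:
        context = store.entries[request_id]
        token = f"tok{len(context)}"  # placeholder "model" output
        context.append(token)         # each decode step extends the cache
        return token

# Disaggregation: the two workers can live on different machines,
# sized and scaled separately; only the KV cache crosses the wire.
store = KVCache()
PrefillWorker().run("req-1", ["The", "question", "is"], store)
decoder = DecodeWorker()
out = [decoder.step("req-1", store) for _ in range(3)]
print(out)  # ['tok3', 'tok4', 'tok5']
```

In production, the hard part this sketch hides is exactly what Mooncake addresses: moving the (large) KV cache between prefill and decode pools efficiently, via a tiered, RDMA-backed cache store.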
But inference is where Groq has been aiming all along, correct? Except for Cerebras, are any start-ups competing in training anymore? (I can't think of any.)
I can't either, though again, China is the unknown. There isn't much room left, given that TPU, Trainium 3, and AMD are already filling up the rest of the training space.
Building an inference solution on Groq, or any start-up, is a risky proposition: they could fold from lack of funding, and you'll have wasted precious time to market. Now that Groq has the backing of Nvidia, though, I think they could become the frontrunner - not because they're so awesome, but because of Nvidia. Then there's SambaNova and Intel; I think there's hope for "Samba-tel" too.
My guess is that there will be fallout and specialization in data-center rack/pod-level AI inference hardware over the next couple of years. There's just so much that has to come together for the co-optimized general solution: processor systems (two or more different types), connectivity, storage tiers, an orchestration OS, all the different model-serving environments, plus the physical racks with extreme power and cooling requirements. And all the hardware components need new chips that are not general purpose and have to be co-optimized together - so we're back to a vertical focus, like IBM mainframes, at least for a while.