
Samsung Foundry nabs Nvidia

The dramatic reduction in projected revenue suggests Groq may be facing challenges in securing data center space as it works to sell its hardware to large companies and foreign governments.
The question, to me, is: is any start-up chip designer successful in the AI market who is not partnered with a major player (a cloud computing company or a top-tier chip maker)? I can't think of any. A couple of my friends point to Cerebras, and they are a remarkable technical success, but revenue-wise they are still well under $500M, and losing money.
My take is that Groq only had a partial solution for data-center-scale inference. The Groq chips are great at simple, fast, low-latency decode (MoE execution), but they don't have enough memory or the right memory management (KV store and caches) for optimized long-context prefill, where raw memory bandwidth isn't as important. Plus they lacked the rack/pod-level resource management, routing, and memory-tiering orchestration to make it all work efficiently.
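To put rough numbers on why long-context prefill is a memory-capacity problem, here's a back-of-the-envelope KV-cache sizing sketch. The model shape is illustrative (Llama-3-70B-style: 80 layers, 8 KV heads via GQA, head dim 128, fp16), not Groq-specific:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Size of the K and V caches for one sequence.

    The leading factor of 2 counts both the K and the V tensors;
    bytes_per_elem=2 assumes fp16/bf16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative 70B-class shape at a 128k-token context
size = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128, seq_len=128 * 1024)
print(f"{size / 2**30:.0f} GiB per 128k-token sequence")  # -> 40 GiB
```

At roughly 40 GiB of KV cache per long-context sequence, a design with only a few hundred MB of on-chip SRAM per chip has to shard the cache across many chips or spill to external memory tiers, which is exactly the orchestration problem described above.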
But inference is where Groq has been aiming all along, correct? Except for Cerebras, are any start-ups competing in training anymore? (I can't think of any.)
Add in that they had to spin up and operate their own data centers to sell their solutions, since there weren't any CSP / hyperscaler takers for their raw boards / racks without trials on real hardware. I don't think they directly ran into a data center power / capacity issue, but they did hit a cost-of-operations vs. revenue wall.
Building an inference solution on Groq, or any start-up, is a risky proposition: they may fold from lack of funding, and you'll have wasted precious time to market. Now that Groq has the backing of Nvidia, I think they could become the frontrunner. Not because they're so awesome, but because of Nvidia. Then there's SambaNova and Intel. I think there's hope for Samba-tel too.
 
> The question, to me, is: is any start-up chip designer successful in the AI market who is not partnered with a major player (a cloud computing company or a top-tier chip maker)? I can't think of any. A couple of my friends point to Cerebras, and they are a remarkable technical success, but revenue-wise they are still well under $500M, and losing money.

I would put Cerebras in the path-to-success category, but they still might need a major partner tie-in, like Amazon, to go the distance. Other than that, I think you are right, with two caveats:
* There's likely still opportunity on the client side. It's not clear that Apple, Intel, and AMD have gotten their XPUs right for real-world client-side applications, and there are more specialized apps like autonomous driving, etc., where chips optimized for something completely different (à la AI5, AI6) are required.
* China - who knows what's going to evolve as the home-grown data center inference winner there. It probably still ties in with Huawei or a hyperscaler/CSP like Alibaba. I do find it interesting that Alibaba has been making major Mooncake-related contributions to Dynamo, NVIDIA's open-source "OS" for racks/pods.



> But inference is where Groq has been aiming all along, correct? Except for Cerebras, are any start-ups competing in training anymore? (I can't think of any.)

I can't either, though again, China is the unknown. There isn't much room, given that TPU, Trainium 3, and AMD are already filling up the rest of the training space.

> Building an inference solution on Groq, or any start-up, is a risky proposition: they may fold from lack of funding, and you'll have wasted precious time to market. Now that Groq has the backing of Nvidia, I think they could become the frontrunner. Not because they're so awesome, but because of Nvidia. Then there's SambaNova and Intel. I think there's hope for Samba-tel too.

My guess is that there will be fallout and specialization in rack/pod-level AI inference hardware in the data center over the next couple of years. There's just so much that has to come together for the co-optimized general solution: processor systems (two or more different types), connectivity, storage tiers, an orchestration OS, all the different model-serving environments, plus the physical racks with extreme power and cooling considerations. And all the hardware components require new chips that are not general purpose and have to be co-optimized together, so we're back to a vertical focus like IBM mainframes, at least for a while.
 
> I would put Cerebras in the path-to-success category, but they still might need a major partner tie-in, like Amazon, to go the distance. Other than that, I think you are right, with two caveats:
> * There's likely still opportunity on the client side. It's not clear that Apple, Intel, and AMD have gotten their XPUs right for real-world client-side applications, and there are more specialized apps like autonomous driving, etc., where chips optimized for something completely different (à la AI5, AI6) are required.
Agreed on Cerebras and Amazon. Cerebras really gets a win if Google and Azure feel they need to follow suit.

As for clients... I'm confused by what I see happening on Windows PCs. Intel, AMD, and Qualcomm have incompatible NPUs. Microsoft supports all three, but this looks like a silly plan, and it gives Apple an advantage. If I were Microsoft, I'd be pushing for a common instruction set NPU spec.
> * China - who knows what's going to evolve as the home-grown data center inference winner there. It probably still ties in with Huawei or a hyperscaler/CSP like Alibaba. I do find it interesting that Alibaba has been making major Mooncake-related contributions to Dynamo, NVIDIA's open-source "OS" for racks/pods.
China remains a mystery to me.
> I can't either, though again, China is the unknown. There isn't much room, given that TPU, Trainium 3, and AMD are already filling up the rest of the training space.
Agreed.
> My guess is that there will be fallout and specialization in rack/pod-level AI inference hardware in the data center over the next couple of years. There's just so much that has to come together for the co-optimized general solution: processor systems (two or more different types), connectivity, storage tiers, an orchestration OS, all the different model-serving environments, plus the physical racks with extreme power and cooling considerations. And all the hardware components require new chips that are not general purpose and have to be co-optimized together, so we're back to a vertical focus like IBM mainframes, at least for a while.
And this is the case for Nvidia continuing to win big. They already have the co-optimization, even in software. Only Google has a comparable solution, but I suspect it only works in Google Cloud (regardless of the Broadcom/Anthropic stories we continue to read).
 