Nvidia reportedly delays its next AI chip due to a design flaw

News stories stemming from a SemiAnalysis report are pointing to teething pains with CoWoS-L

“The main issue behind these delays to GPU shipments is related to Nvidia's physical design of the Blackwell family, according to a report from semiconductor research firm SemiAnalysis. Specifically, Blackwell is the first high volume design to use the CoWoS-L packaging technology from TSMC, Nvidia's chip manufacturer.”


Not a reliable source. I hope he eats this one.
 
Three months late does not sound like a big deal. The only companies with competitive GPU parts are AMD and Intel, and I wonder... even if these two can get the extra chips, can they get them packaged? And then, can the customers' software run on anything but Nvidia GPUs without modifications, and will the software ports take more time to resolve than waiting three extra months?
H100 still sells like hot cakes, so even if the next product came online tomorrow, competitors still wouldn't take any share away.
 
Not a reliable source. I hope he eats this one
The Register is definitely a little iffy. I think the underlying report by SemiAnalysis is much more credible and in-depth.


Throw in the in-depth "shade" YouTube video by Cerebras and I mark this one as highly likely true.
 
According to The Information, the design flaw is in the processor die that connects the Blackwell GPUs on a single NVIDIA GB200 Superchip. Nvidia is now working on redesigning the die and will likely need a few months before it can move to production testing with TSMC. :unsure::unsure::unsure::unsure::unsure:
Illustration by Alex Castro / The Verge

Nvidia has reportedly told Microsoft and at least one other cloud provider that its “Blackwell” B200 AI chips will take at least three months longer to produce than was planned, according to The Information. The delay is the result of a design flaw discovered “unusually late in the production process,” according to two unnamed sources, including a Microsoft employee, cited by the outlet.

B200 chips are the follow-up to the supremely popular and hard-to-get H100 chips that power vast swaths of the artificial intelligence cloud landscape (and helped make Nvidia one of the most valuable companies in the world). Nvidia expects production of the chip “to ramp in 2H,” according to a statement that Nvidia spokesperson John Rizzo shared with The Verge. “Beyond that, we don’t comment on rumors.”

Nvidia is now reportedly working through a fresh set of test runs with chip producer Taiwan Semiconductor Manufacturing Company, and won’t ship large numbers of Blackwell chips until the first quarter. The Information writes that Microsoft, Google, and Meta have ordered “tens of billions of dollars” worth of the chips.

The report comes just months after Nvidia said that “Blackwell-based products will be available from partners” starting in 2024. The new chips are supposed to kick off a new yearly cadence of AI chips from the company as several other tech firms, such as AMD, work to spin up their own AI chip competitors.

 
Cerebras does support PyTorch according to their website, and they describe how to port a trained model to Hugging Face (see the sketch after this post), but I haven't seen any testimonials from groups that have successfully transitioned from a GPU-based system to a Cerebras system. Yet.
Do we (ie the NDA-free public) even know where Cerebras is being used? Presumably they are selling and making money, but no-one ever talks about them in the context of the usual hyperscalers or academia.
So where? Military? Finance? Very specialist AI business stuff like oil exploration, or drug discovery?
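
Not a testimonial, but the Hugging Face side of that claim is at least easy to poke at: Cerebras published the weights of its Cerebras-GPT family on the Hugging Face Hub, so a model trained on their hardware loads through the stock transformers API. A minimal sketch (Cerebras-GPT-111M is one of their public checkpoints; the prompt and generation settings are just illustrative):

```python
# Minimal sketch: loading a Cerebras-trained checkpoint with the standard
# Hugging Face transformers API. The model name below is one of the
# publicly released Cerebras-GPT sizes; swap in a larger one if you have the RAM.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "cerebras/Cerebras-GPT-111M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Illustrative prompt, not anything from Cerebras' docs.
prompt = "Wafer-scale computing is"
inputs = tokenizer(prompt, return_tensors="pt")

# Plain greedy decoding; generation settings here are arbitrary choices.
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

That only exercises inference with Cerebras-trained weights, of course; whether a group's own training workloads port as cleanly is exactly the part the missing testimonials would need to cover.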
 
Do we (ie the NDA-free public) even know where Cerebras is being used? Presumably they are selling and making money, but no-one ever talks about them in the context of the usual hyperscalers or academia.
So where? Military? Finance? Very specialist AI business stuff like oil exploration, or drug discovery?
Cerebras has various customer / deployment articles on their website, including one from the Mayo Clinic.

 
So where? Military? Finance? Very specialist AI business stuff like oil exploration, or drug discovery?
Via Perplexity AI (though it ignores the biggest deal, with cloud service supplier G42):

Cerebras Systems has established a diverse customer base across various sectors, including pharmaceuticals, energy, supercomputing, and healthcare. Key customers include:

- **Pharmaceutical and Biotech Companies**: Notable clients such as **AstraZeneca** and **GSK** utilize Cerebras technology to accelerate AI model training, significantly reducing computation time compared to traditional GPU clusters. For instance, AstraZeneca reported that a task taking two weeks on GPUs was completed in just a few days using Cerebras' systems[2][5].

- **Energy Sector**: **TotalEnergies** is one of the first publicly disclosed customers in this sector, employing Cerebras' CS-2 system for advanced computing tasks[1].

- **Supercomputing and Research Institutions**: Cerebras has partnered with several high-performance computing centers, including the **National Center for Supercomputing Applications (NCSA)**, **Argonne National Laboratory**, and the **Leibniz Supercomputing Centre**. These institutions leverage Cerebras' technology for complex simulations and AI model training[1][2][3].

- **Healthcare Institutions**: Recently, the **Mayo Clinic** selected Cerebras as a lead partner for developing large language models tailored for medical applications, utilizing their extensive health data sets[5].

- **Universities and Research Organizations**: The **University of Edinburgh** and other academic institutions are also among Cerebras' customers, focusing on innovative research and AI applications[3].

Overall, Cerebras' technology is being adopted globally, with clients in North America, Europe, Asia, and the Middle East, reflecting its growing influence in AI and deep learning applications[2].

Citations:
[1] https://en.wikipedia.org/wiki/Cerebras
[2] https://venturebeat.com/data-infras...largest-ai-models-ever-trained-on-one-device/
[3] https://www.cbinsights.com/company/cerebras-systems/customers
[4] https://6sense.com/tech/artificial-intelligence/cerebras-market-share
[5] https://www.forbes.com/sites/karlfr...tems-wins-a-major-new-client-the-mayo-clinic/
[6] https://www.forbes.com/sites/karlfr...ith-qualcomm-launches-3rd-gen-wafer-scale-ai/
[7] https://www.cerebras.net/cerebras-customer-spotlight-overview/
[8] https://www.cerebras.net
 