Will the WSE-3 chip from Cerebras change everything

Arthur Hanson

Well-known member
With the power of four trillion transistors and power saved by not having to communicate with other chips in a rack, will this change the data center game and in how many ways?
 
I think there was a similar thread here a few weeks ago -- some good discussion there.

WSE-3 is an impressive (historic) technical achievement.

It appears to have the ability to change the game for AI data centers, though it's definitely not a general-purpose chip, so it won't change data centers overall. The biggest advantage I see is the potential for stellar power efficiency: a lot less energy is wasted on inter-chip communication, and power efficiency (density) is what limits data center compute these days.
 
How much is Cerebras revenue? How many "Chips/wafers/systems" do they sell per year?

I consider it a supercomputer with trivial volume, but I thought the same about the initial AI GPUs (oops)
 
How much is Cerebras revenue?
Currently in the range of $70M/quarter.
How many "Chips/wafers/systems" do they sell per year?
Unknown.
I consider it a supercomputer with trivial volume, but I thought the same about the initial AI GPUs (oops)
I agree, it is a very specialized AI supercomputer. They seem to currently put a lot of effort into sovereign AI system development.

Just pure speculation on my part, but given Google's corporate fascination with leadership technology (e.g. having the world's first and only fully optical production data center interconnect network, called Jupiter, which includes MEMS optical switches, and their investments in quantum computing, like fabricating their own QC chips in-house in Santa Barbara), it wouldn't surprise me at all if they wanted to bring the world's only wafer-scale chip design group in-house and acquired Cerebras.
 
The architecture and design concepts of the WSE series would be very familiar to Google or any other hyperscaler. Redundancy, a little overprovisioning, repairability: Google sees that in their own data centers all day, every day. A heterogeneous mix of compute and memory? Yep, they understand that too. Could be a match made in heaven.

They will be interesting to watch, whether they get acquired or not.
 
With the power of four trillion transistors and power saved by not having to communicate with other chips in a rack, will this change the data center game and in how many ways?
They have a theoretical advantage for dense, low-power compute, but they also have some disadvantages. One is the cadence of new co-optimized rack/data center level hardware and software (taking advantage of the newest breakthroughs and optimizations). The other is leveraging heterogeneous chipsets: logic, dense memory stacks and photonics all want different processes that can't be done on a single wafer.

I'm watching real rack-level and larger benchmarks like this one, with realistic user data center loading and a full Pareto curve for the key parameters: tokens per MW vs. per-user interactivity (token rate per user), and cost per million tokens vs. per-user interactivity (token rate per user).


A Cerebras rack filled with two CS-3s uses only 50 kW, vs. a GB200 NVL72 rack that runs at about 150 kW peak, but I haven't seen the Pareto curves for Cerebras with hundreds of users and queries.
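
For anyone who wants to build that kind of curve from their own measurements, here is a rough Python sketch of the idea. It normalizes rack throughput to tokens/s per MW and then keeps only the operating points that are not beaten on both axes. The function names and every number in the sample list are made up for illustration; they are not Cerebras or NVIDIA results.

```python
def tokens_per_mw(tokens_per_s, rack_kw):
    """Normalize a rack's aggregate token rate to tokens/s per megawatt."""
    return tokens_per_s * 1000 / rack_kw


def pareto_frontier(points):
    """Return the non-dominated (interactivity, throughput) points.

    Each point is (tokens/s per user, tokens/s per MW); higher is better
    on both axes. A point is dropped if another point is at least as good
    on both axes and strictly better on one.
    """
    # Walk from the highest interactivity down; a point survives only if
    # it beats the best throughput seen at any higher interactivity.
    ordered = sorted(points, key=lambda p: (-p[0], -p[1]))
    frontier = []
    best_throughput = float("-inf")
    for interactivity, throughput in ordered:
        if throughput > best_throughput:
            frontier.append((interactivity, throughput))
            best_throughput = throughput
    return sorted(frontier)


# Hypothetical operating points (NOT measured data): each one is a
# different batch size / concurrency setting on the same rack.
sample = [(10, 900), (20, 800), (20, 700), (40, 500), (30, 400), (80, 100)]

if __name__ == "__main__":
    print(pareto_frontier(sample))
    # A rack drawing 50 kW and serving a hypothetical 1000 tokens/s in total:
    print(tokens_per_mw(1000, 50))
```

Plotting the frontier for each system on the same axes is what makes the comparison fair: a low-power rack only wins if its curve sits above the other one at the interactivity level you actually need.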
 