Cisco launched its Silicon One G300 AI networking chip in a move that aims to compete with Nvidia and Broadcom.

And then there is Cerebras, where scale‑up is essentially "inside one wafer" (one CS system), and scale‑out connects multiple CS systems through the SwarmX interconnect plus MemoryX servers in a broadcast‑reduce topology over Ethernet. SwarmX broadcasts weights to the wafers and reduces gradients back into MemoryX, so that many CS‑3s train one large model in data‑parallel fashion. CS‑3 supports scale‑out clusters of up to 2,048 systems, with low‑latency RDMA‑over‑Ethernet links carrying only activations/gradients between wafers while the bulk of the traffic stays on‑wafer.
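
Roughly, that broadcast‑reduce data‑parallel pattern looks like the toy NumPy sketch below. This is not Cerebras's actual API; the system count, quadratic loss, and variable names (memoryx‑style master weights, per‑wafer shards) are made up purely to illustrate the broadcast/reduce flow:

import numpy as np

N_SYSTEMS = 4          # stand-in for CS-3 wafers (real clusters scale to 2,048)
LR = 0.01

def local_gradient(weights, shard):
    """Each wafer computes gradients on its own data shard (toy linear model)."""
    x, y = shard
    pred = x @ weights
    return x.T @ (pred - y) / len(x)

# The master copy of the weights lives off-wafer (MemoryX in Cerebras's scheme).
weights = np.zeros(8)
rng = np.random.default_rng(0)
shards = [(rng.standard_normal((32, 8)), rng.standard_normal(32))
          for _ in range(N_SYSTEMS)]

for step in range(100):
    # "Broadcast": every wafer receives the same current weights.
    grads = [local_gradient(weights.copy(), shard) for shard in shards]
    # "Reduce": gradients are averaged on the way back to the master store.
    mean_grad = np.mean(grads, axis=0)
    # The update is applied centrally; the wafers never own the master weights.
    weights -= LR * mean_grad

The point of the topology is that only weights flow out and only gradients flow back, which is why the fabric between wafers can be commodity Ethernet rather than something exotic.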
 
Curious how Cerebras handles large memory access. No matter how much SRAM they have on chip, it's nowhere near what HBM provides.
 
Cerebras uses dedicated servers, called MemoryX servers, which are connected to the WSE-3 nodes over the SwarmX fabric. A MemoryX configuration can include up to 1.2 PB of shared memory, built from DDR5 and flash tiers. Each WSE-3 carries 44 GB of on-wafer SRAM, and that SRAM has far lower access latency than any HBM.
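
To make the tiering concrete, here is a toy sketch of the idea: the full set of layer weights sits in a big off‑wafer store and is streamed in one layer at a time, so only the current layer's weights plus the activations need to fit in on‑wafer SRAM. The tier names and sizes are illustrative only, not Cerebras's software:

import numpy as np

rng = np.random.default_rng(0)
# "memoryx" stands in for the DDR5/flash store holding every layer's weights.
memoryx = [rng.standard_normal((64, 64)) for _ in range(24)]

def stream_forward(x):
    for layer_weights in memoryx:        # stream one layer's weights at a time
        sram = layer_weights             # only this layer resides "on wafer"
        x = np.maximum(x @ sram, 0.0)    # ReLU layer; activations stay on wafer
    return x

out = stream_forward(rng.standard_normal(64))

So the SRAM is not trying to hold the whole model the way HBM on a GPU does; it only has to hold the working set for the layer currently executing.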
 