
Cisco launched its Silicon One G300 AI networking chip in a move that aims to compete with Nvidia and Broadcom.

Daniel Nenni

Admin
Staff member
The Cisco Silicon One G300 was announced at Cisco Live EMEA, where it headlined a bevy of product announcements. Silicon One G300 powers Cisco's new N9000 and Cisco 8000 switches and offers 102.4 Tbps of switching capacity in liquid-cooled systems for hyperscalers, neoclouds, sovereign clouds and enterprises.

Cisco is surrounding its new AI networking chip and switches with Nexus One, a management platform that provides a unified fabric as well as native Splunk integration and AI job observability.

Jeetu Patel, President and Chief Product Officer at Cisco, said the company was focused on "innovating across the full stack - from silicon to systems and software."

According to Cisco, the network is critical to getting the most out of GPU investments in both AI training and inference. Silicon One G300 and the Cisco switches it powers are likely to compete with Nvidia's Mellanox InfiniBand and Spectrum-X Ethernet Photonics switch systems, as well as Broadcom offerings such as its XPUs and Tomahawk switches.

If you zoom out, Cisco's AI networking chips are another data point showing that Nvidia is facing a lot more competition. Custom silicon from the likes of AWS and Google Cloud, along with AMD's offerings, also points to more choices beyond Nvidia.

Key points about Silicon One G300 and the new switches from Cisco:

- Silicon One G300 has Intelligent Collective Networking, which features a shared packet buffer, path-based load balancing and proactive network telemetry (a conceptual sketch of the load-balancing idea follows this list).
- The AI networking chip responds to bursts, link failures and packet drops to deliver a 33% increase in network utilization.
- Silicon One G300 is programmable, so it can be upgraded with new network functionality, and uses a unified architecture.
- Cisco N9000 and Cisco 8000 have 1.6T OSFP (Octal Small Form-factor Pluggable) Optics and 800G Linear Pluggable Optics (LPO) to halve power consumption.
- The company expanded its Silicon One P200-based systems for hyperscale deployments.
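
To make the load-balancing idea concrete, here is a minimal Python sketch of per-flow, telemetry-driven path selection. This is not Cisco's implementation; the path names, the normalized queue-depth telemetry and the rebalancing rule are assumptions for illustration only.

```python
import hashlib

class PathLoadBalancer:
    def __init__(self, paths):
        self.paths = list(paths)        # candidate equal-cost paths (e.g. spine switches)
        self.congested = set()          # paths currently flagged by telemetry

    def report_telemetry(self, path, queue_depth, threshold=0.8):
        """Flag or clear a path based on a normalized queue-depth reading."""
        if queue_depth >= threshold:
            self.congested.add(path)
        else:
            self.congested.discard(path)

    def select_path(self, flow_key: str) -> str:
        """Hash a flow onto a path, steering around congested or failed paths."""
        healthy = [p for p in self.paths if p not in self.congested] or self.paths
        digest = hashlib.sha256(flow_key.encode()).digest()
        return healthy[int.from_bytes(digest[:4], "big") % len(healthy)]

# Example: a burst on spine2 moves new flows onto the remaining healthy paths.
lb = PathLoadBalancer(["spine1", "spine2", "spine3", "spine4"])
lb.report_telemetry("spine2", queue_depth=0.95)
print(lb.select_path("10.0.0.1:5001->10.0.1.9:443/tcp"))
```

In a real ASIC this decision is made in hardware per packet or per flowlet, but the principle is the same: telemetry feeds the path-selection function so traffic shifts away from hot or failed links.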

In addition to its AI silicon and new switches, Cisco announced the following at Cisco Live EMEA:

Cisco Silicon One G300


Cisco upgraded AgenticOps, which the company launched a year ago, with new platform additions for autonomous actions and oversight.

AgenticOps is upgraded with autonomous troubleshooting for campus, branch and industrial networks, context-aware optimization, trusted validation, experience metrics and workflow creation.

Cisco AI Defense, also launched a year ago, was upgraded with tools to secure the AI supply chain as well as AI agents.

The company said new features for Cisco AI Defense include AI Bill of Materials, which provides visibility into MCP servers and third parties; MCP Catalog, which inventories and manages risk across MCP servers; testing of models and agents in multiple languages; and real-time guardrails.

 
I suspect the future success of merchant datacenter network switch ASICs based on Ethernet, like this example, will decline, just like what's happening with merchant CPUs from Intel and AMD. Ethernet is a legacy network, based on technologies mostly from 20-30 years ago. It is being almost completely displaced in client deployments by 802.11 Wi-Fi. While it is still the only practical solution for enterprise datacenters, as cloud computing continues to suck growth out of enterprise computing, the big cloud companies will likely be drawn to Google's and Nvidia's proprietary silicon path. The Ethernet industry is making a last stand for technology leadership with the Ultra Ethernet and UALink specifications, but they are hampered by the IEEE 802.1 and 802.3 committees, which are mired in tradition and compatibility with features from decades ago. And UE and UALink are still defined by multi-company committees, rather than tight, one-company internal teams with strong visionary leaders. Focused teams usually beat industry committees hampered by corporate politics.

I wonder, will an Arm-style IP company emerge for datacenter networking as Arm has for server CPUs? I can't believe it won't. Chiplets and improved chip design tools seem to make the problem more easily solvable.
 
Ethernet has been like Moore's Law: declared to be dying every year, yet still being pushed forward. I wonder what percentage of cloud networking deployments are on Ethernet. We have seen the latest scale-up/scale-across efforts add Ethernet support.

Those IEEE committees have been more a source of problems than of help. It feels like every successful innovation that drives the initial massive market deployment is done outside the IEEE, or at least before it's fully endorsed. One typical example is PAM4.

As for networking IP, it's an interesting idea/angle. I would think one major issue there is that, unlike server chips, networking has to be hard IP, with close interaction with upstream and downstream vendors to make the system work.
 
Ethernet has been like Moore's Law: declared to be dying every year, yet still being pushed forward. I wonder what percentage of cloud networking deployments are on Ethernet. We have seen the latest scale-up/scale-across efforts add Ethernet support.
Anybody who has predicted Ethernet's death must have their nose in Jack Daniel's.

At this time, I'd estimate over 90% of cloud networking is still on Ethernet, in one form or another. The network interface cards/chips have become proprietary, but the switches and links (outside of Google) are still Ethernet. Nvidia has two essentially proprietary networks, NVLink and InfiniBand (IB is no longer an open industry spec, and Nvidia is the only commercial implementation, so I count it as proprietary), but they are also a substantial provider of Ethernet adapters and switches.

Nonetheless, if you read the description of the Cisco switch ASIC, aside from support for the legacy Ethernet specifications (e.g. Spanning Tree Protocol), most of the top-line features are proprietary and only work in networks with other Cisco switches of the same generation. The primary Ethernet features the Cisco ASIC supports are the PHY and Link layers. The current PHYs are great, but the 802.3 Link spec is a Jurassic mess. This is why Ultra Ethernet and UALink were created: to work around the antiquated Ethernet specs.
Those IEEE committees have been more a source of problems than of help. It feels like every successful innovation that drives the initial massive market deployment is done outside the IEEE, or at least before it's fully endorsed. One typical example is PAM4.
Agreed.
As for networking IP, it's an interesting idea/angle. I would think one major issue there is that, unlike server chips, networking has to be hard IP, with close interaction with upstream and downstream vendors to make the system work.
I'm not so sure hard IP is a requirement for the big cloud companies. I don't know enough about their design flows to agree or disagree. I think they use soft IP for CPU cores and associated blocks (like memory controllers), so why not for network switches?
 
I'm not so sure hard IP is a requirement for the big cloud companies. I don't know enough about their design flows to agree or disagree. I think they use soft IP for CPU cores and associated blocks (like memory controllers), so why not for network switches?
I believe it has a lot to do with PHYs. The control and algorithmic parts of networking have had soft IP. But to meet the overall system's ever-increasing speed and bandwidth needs, you need to co-design more with the PHY.
 
I believe it has a lot to do with PHYs. The control and algorithmic parts of networking have had soft IP. But to meet the overall system's ever-increasing speed and bandwidth needs, you need to co-design more with the PHY.
I agree about PHYs; they are difficult at 200 Gbps, but I think datacenter switch chips are now chiplet designs. I can't find a diagram for the G300, but I'd bet on it. I think the PHYs are a partitioned problem. I also think the G300 is one of the next-generation designs that uses in-package HBM for context storage.

One thing I'm convinced of: the G300 has a lot of features (and therefore wasted die area) a cloud company may not want. And it's a merchant chip, so there's marketing and product management involved, just like with merchant CPUs. This looks like a fertile field for the cloud vendors to do in-house designs in the future.
 
I just rewatched this Hot Chips tutorial on rack-based design. It's not clear where a switch like this would be used. Worth watching both part 1 and part 2.


 
Ethernet/Ultra Ethernet are scale-out networks. UALink is trying to do an open scale-up network, but it may have only one major deployer: AMD. One thing to keep in mind: the videos focus on GPUs and various ASICs like TPUs. There are no high-port-count scale-up networks for CPUs. CPUs use cache coherency; GPUs don't, relying instead on large-granularity memory sharing. Millions of cloud servers and enterprise servers still use CPUs, so scale-out networks are critical for CPU-based applications.
 
Scale-out isn't where all the focus is going any more, scale-across is.

And scale-up, obviously -- where the big shift to CPO is coming Real Soon Now... ;-)
 
Scale-out isn't where all the focus is going any more, scale-across is.
Not correct. Scale-across is just a distance-aware variation on existing interconnects, especially Ethernet. The biggest industry consortium, Ultra Ethernet, is targeted at scale-out, though I have no doubt they will also cover scale-across. Nonetheless, scale-out has never been where "all the focus is". Scale-up for GPUs has been important for years, and has now spawned UALink, though scale-up for CPUs has almost completely been proprietary, due to sensitivity to cache coherency schemes (which are also proprietary).
And scale-up, obviously -- where the big shift to CPO is coming Real Soon Now... ;-)
I assume you're talking about co-packaged optics, which is starting to happen now. Broadcom has delivered them in Tomahawk switches. So has Nvidia.
 
I did say "as well as scale-up"... ;-)

In terms of volume, scale-up is the largest (and lowest cost/latency/power), scale-out is the smallest (and highest cost/latency/power), and the newly-named scale-across (for distributed tightly-coupled DCs) is somewhere in between -- I meant that scale-across is now a more attractive market than scale-out, because it's going to be many times larger -- in the same way that DC scale-out is more attractive than traditional telecoms networking, which is smaller volume and much slower to roll out; it's kind of become the ugly sister... ;-)

Scale-across also has different tradeoffs because low latency and power consumption are more important, it's really a distinct market segment -- and one that didn't exist even a couple of years ago... ;-)

(speaking as someone involved in developing products for all these markets, not just a commentator... ;-)

The comment about CPO was intended to point out that though there will undoubtedly be a rapidly expanding market for CPO, pluggable modules are going to be huge for many years yet, because alongside the obvious big advantages there are many less-obvious commercial and system-level disadvantages to CPO (including lock-in and inflexibility) compared to pluggables. Some customers may shift to CPO quickly, others will hang on to pluggables as long as possible -- it all depends where your priorities lie... :-)
 
I did say "as well as scale-up"... ;-)

In terms of volume, scale-up is the largest (and lowest cost/latency/power), scale-out is the smallest (and highest cost/latency/power), and the newly-named scale-across (for distributed tightly-coupled DCs) is somewhere in between -- I meant that scale-across is now a more attractive market than scale-out, because it's going to be many times larger -- in the same way that DC scale-out is more attractive than traditional telecoms networking, which is smaller volume and much slower to roll out; it's kind of become the ugly sister... ;-)
In terms of scale-up, the market is far and away dominated by Nvidia NVLink. Nvidia doesn't break out NVLink revenue in detail, but it's thought to be about 25% or less of Nvidia's total networking revenue, which was $8.6B in the last fiscal quarter. ($8.6B is amazing, actually.) As for scale-across being "larger" than scale-out, that seems highly unlikely. The entire Ethernet industry is currently scale-out.
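
For rough scale, here is my own back-of-the-envelope arithmetic using only the figures quoted above; it is an upper-bound estimate, not a reported number.

```python
# Back-of-the-envelope only; Nvidia does not report NVLink revenue separately.
nvidia_networking_rev_b = 8.6        # $B, last fiscal quarter, from the post above
nvlink_share_upper = 0.25            # "about 25% or less"
print(nvidia_networking_rev_b * nvlink_share_upper)   # ~2.15, i.e. roughly $2.2B per quarter at most
```
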
Scale-across also has different tradeoffs because low latency and power consumption are more important, it's really a distinct market segment -- and one that didn't exist even a couple of years ago... ;-)
Scale-across has higher latency than high-performance scale-out networks.
(speaking as someone involved in developing products for all these markets, not just a commentator... ;-)
I worked in computer networking for part of my career, but I didn't like it. Industry networking hardware and software specifications are mostly not very good, and I learned to despise the spec development processes. (PHYs are an exception, but that's not my field.) Proprietary networking projects are still rare, but IMO always better.
The comment about CPO was intended to point out that though there will undoubtedly be a rapidly expanding market for CPO, pluggable modules are going to be huge for many years yet, because alongside the obvious big advantages there are many less-obvious commercial and system-level disadvantages to CPO (including lock-in and inflexibility) compared to pluggables. Some customers may shift to CPO quickly, others will hang on to pluggables as long as possible -- it all depends where your priorities lie... :-)
In my discussions, the power savings from CPO are so compelling that CPO is considered unavoidable.
 
I didn't say scale-across was bigger *today*, but it will be by the time the products being developed today or in the near future come to market -- that's what the forecasts from the hyperscalers say.

You're confusing networks as built today with the way they'll be built in the near future, with low-latency products specifically targeted at the scale-across market.

The power savings from CPO are indeed attractive -- typically 50% or so on the networking component -- but though these power saving numbers look large at the rack level (kilowatts!) they're not completely compelling in the big picture -- this might reduce total DC power consumption by at most 10% (NPU/CPU/routing/switching power dominates), but the cost is that the power in the central hard-to-cool AI/NPU/switch component goes up so it's even harder to cool, plus you realistically have to fully equip all channels at manufacture, plus there are build/reliability/maintenance issues, plus you're locked in to one optics vendor so no price competition, plus you can't mix and match optical pluggables (e.g. DD/coherent-lite/coherent) -- there are a lot of real-life/business issues, not just the "50% power saving".
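
To spell out the arithmetic linking those two numbers, here is a minimal sketch; the networking share of total DC power below is my own assumption, chosen only to show how a ~50% saving on optics can translate into roughly 10% at the DC level.

```python
# Illustrative only; the 20% networking share is an assumed figure, not from the post.
networking_share_of_dc_power = 0.20   # assume optics/networking draw ~20% of DC power
cpo_saving_on_networking = 0.50       # "typically 50% or so", per the post
total_dc_saving = networking_share_of_dc_power * cpo_saving_on_networking
print(f"{total_dc_saving:.0%} of total DC power")   # 10%, matching the "at most 10%" figure
```
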

So for some customers CPO is indeed attractive -- especially ones who are 100% responsible for their entire HW/SW/system stack like Nvidia -- but others who see more of the downsides are playing wait-and-see.

As time goes on and the technology matures there's also no doubt that CPO market penetration will increase especially for scale-up, but the projections are that it'll be several years before 50% of ports are CPO -- and by then these might look nothing like the CPO ports of today, which is another risk with going CPO... ;-)
 
I didn't say scale-out was bigger *today*, but it will be by the time the products being developed today or in the near future come to market -- that's what the forecasts from the hyperscalers say.
I don't believe them. Scale-out networks are still a large component of AI and non-AI systems, and will be for the foreseeable future.
You're confusing networks as built today with the way they'll be built in the near future, with low-latency products specifically targeted at the scale-across market.
Not at all. Scale-across networks have inherently higher latencies than some proprietary (Google Jupiter) and industry-spec scale-out networks (InfiniBand) due to distance latency and switching latency. (Ethernet has inherently higher port-to-port switching latency due to Ethernet's inefficient addressing structure.) Not to mention that deep frame buffers in Ethernet networks (to minimize frame drops on long port-to-port links) also cause higher latency. It probably is possible to design a specialized low-latency scale-across network, but probably not with Ethernet as currently defined.
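
To put rough numbers on the distance argument, here is a minimal latency-budget sketch; every figure is an assumption for illustration, not a measurement of any real network.

```python
# Illustrative latency budget for a cross-site ("scale-across") path.
FIBER_DELAY_US_PER_KM = 5.0     # ~5 us/km propagation in fiber (light at ~2/3 c)
site_distance_km = 20           # assumed distance between the two sites
switch_hops = 5                 # assumed number of switch hops end to end
per_hop_latency_us = 1.0        # assumed per-hop Ethernet switching latency
queuing_us = 10.0               # assumed queuing delay in deep-buffered switches

one_way_us = (site_distance_km * FIBER_DELAY_US_PER_KM
              + switch_hops * per_hop_latency_us
              + queuing_us)
print(f"~{one_way_us:.0f} us one way")  # ~115 us; distance alone contributes ~100 us
```

Even with ideal switches, the propagation term dominates once sites are kilometers apart, which is the point about scale-across latency being inherently higher.
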
The power savings from CPO are indeed attractive -- typically 50% or so on the networking component -- but though these power saving numbers look large at the rack level (kilowatts!) they're not completely compelling in the big picture -- this might reduce total DC power consumption by at most 10% (NPU/CPU/routing/switching power dominates), but the cost is that the power in the central hard-to-cool AI/NPU/switch component goes up so it's even harder to cool, plus you realistically have to fully equip all channels at manufacture, plus there are reliability/maintenance issues, plus you're locked in to one optics vendor so no price competition, plus you can't mix and match optical pluggables (e.g. DD/coherent-lite/coherent) -- there are a lot of real-life/business issues, not just the "50% power saving".
Good points.
So for some customers CPO is indeed attractive -- especially ones who are 100% responsible for their entire HW/SW/system stack like Nvidia -- but others who see more of the downsides are playing wait-and-see.

As time goes on and the technology matures there's also no doubt that CPO market penetration will increase especially for scale-up, but the projections are that it'll be several years before 50% of ports are CPO -- and by then these might look nothing like the CPO ports of today, which is another risk with going CPO... ;-)
I'm not qualified to judge this topic.
 
Sorry, my error/misprint -- I meant to say "scale-across" not "scale-out", now corrected... :-)

I have seen the forecasts for the next few years about where optical networking volumes and priorities are going to go, and apart from the known and obvious scale-up problem in/between racks (which CPO is aimed at addressing, which it will over time but not instantly) "scale-across" is the next big and urgent priority, even more so than "scale-out".

This is driven by the fact that the hyperscalers simply can't build those absolutely enormous power-greedy data centers (huge issues with power, planning, cooling...) big enough and fast enough to keep up with the rapidly rising size needed for AI models, so they have no choice but to build smaller (but still massive!) virtual data centers using disaggregation across sites -- and the "scale-across" bandwidth needed to connect these together is *many* times higher than the traditional "scale-out" bandwidth for DCI.
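
As a purely hypothetical illustration of why that cross-site bandwidth gets so large, here is a sketch; the GPU count, per-GPU bandwidth and cross-site traffic fraction are my own assumptions, not forecasts.

```python
# Hypothetical sizing only; all three inputs are assumptions for illustration.
gpus_per_site = 100_000            # assumed accelerators in one site of the "virtual DC"
per_gpu_bw_gbps = 400              # assumed scale-out NIC bandwidth per accelerator
cross_site_fraction = 0.10         # assume 10% of that traffic must cross sites

cross_site_tbps = gpus_per_site * per_gpu_bw_gbps * cross_site_fraction / 1_000
print(f"~{cross_site_tbps:,.0f} Tbps between sites")  # ~4,000 Tbps in this sketch
```

Even with conservative assumptions, the inter-site requirement lands orders of magnitude above traditional DCI links, which is the scale-across problem described above.
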

They also need low latency (and, if possible, lower power), which is why things like "coherent-lite" are being proposed -- and this probably means "scale-across" has to be integrated into the system differently than "scale-out" (see below).

Exactly what network protocol is built on top of this "scale-across" optical hardware doesn't really matter, because all it does is transport raw data streams -- it could be Ethernet or InfiniBand or Jupiter or anything else; that's down to the ASICs like NPUs and switches that it connects to. But it's unlikely to be traditional Ethernet due to the issues you mentioned... ;-)
 