August 30, 2023September 10, 2023 by Bernard Murphy

Arm Inches Up the Infrastructure Value Chain

by Bernard Murphy on 08-30-2023 at 6:00 am
Categories: AI, Arm, IP, TSMC

At Hot Chips, Arm just revealed its compute subsystems (CSS) direction, led by CSS N2. The intent behind CSS is to provide pre-integrated, optimized and validated subsystems to accelerate time to market for infrastructure system builders. Think HPC servers, wireless infrastructure, and big edge systems for industrial, city and enterprise automation. For me, this answers how Arm can add more value for system developers without becoming a chip company. They know their technology better than anyone else; by providing pre-designed, optimized and validated subsystems (cores, coherent interconnect, interrupt, memory management and I/O interfaces, together with SystemReady validation), they can chop a big chunk out of the total system development cycle.

Accelerating Custom Silicon

A completely custom design around core, interconnect, and other IPs obviously provides maximum flexibility and ability to differentiate, but at a cost. That cost isn’t only in development but also in time to deployment. Time is becoming a critical factor in fast-moving markets; just look at AI and the changes it is driving in hyperscaler datacenters. I have to believe current economic uncertainties compound these concerns.

Those pressures are likely forcing an emphasis on differentiating only where essential and standardizing everywhere else, especially when proven experts can take care of a big core component. CSS provides a standard yet configurable subsystem for many-core compute, including N2 cores (in this case), the coherent mesh network between those cores, interrupt and memory management, the cache hierarchy, chiplet support through UCIe or custom interfaces, a DDR5/LPDDR5 external memory interface, PCIe/CXL Gen5 for fast I/O and/or coherent I/O, expansion I/O, and system management.

All of this is PPA-optimized for an advanced 5nm TSMC process and proven SystemReady® with a reference software stack. The system developer still has plenty of scope for differentiation through added accelerators, specialized compute, their own power management, etc.

Neoverse V2

Arm also announced the next step in the Neoverse V-series. Unsurprisingly, V2 improves on V1, with better integer performance and fewer system-level cache misses. It also improves on a variety of other benchmarks.

Also noteworthy is its performance in the NVIDIA Grace-Hopper combo (based on Neoverse V2). NVIDIA shared real hardware data with Arm on performance versus Intel Sapphire Rapids and AMD Genoa. In raw performance the Grace CPU was mostly on par with AMD Genoa and generally 30-40% faster than Sapphire Rapids.

Most striking for me was their calculation for a datacenter limited to 5MW, important because all datacenters are ultimately power limited. In this case Grace bested AMD in performance by between 70% and 150% and was far ahead of Intel.
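To see why a power cap changes the picture, here is a minimal sketch of the arithmetic, not NVIDIA's or Arm's actual calculation. The function name `datacenter_throughput` and all per-server numbers are hypothetical placeholders chosen for illustration; only the 5MW budget comes from the comparison above. The idea is simply that a fixed power budget caps how many servers you can run, so two CPUs with equal per-server performance can differ widely in total throughput.

```python
def datacenter_throughput(power_budget_w, server_power_w, server_perf):
    """Total throughput when the server count is capped by the power budget."""
    servers = int(power_budget_w // server_power_w)  # whole servers only
    return servers * server_perf


BUDGET_W = 5_000_000  # the 5MW datacenter limit discussed above

# Hypothetical figures, NOT measured data: both servers deliver the same
# performance, but server B draws twice the power of server A.
cpu_a = datacenter_throughput(BUDGET_W, server_power_w=500, server_perf=100.0)
cpu_b = datacenter_throughput(BUDGET_W, server_power_w=1000, server_perf=100.0)

advantage = cpu_a / cpu_b - 1.0
print(f"A delivers {advantage:.0%} more throughput than B under the cap")
```

With these made-up inputs, parity in raw per-server performance turns into a large datacenter-level gap purely through power draw, which is the shape of the result reported for Grace.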

Net value

First, on Neoverse’s contribution to Grace-Hopper: wow. That system is at the center of the tech universe right now, thanks to AI in general and large language models in particular. This is an incredible reference. Second, while I’m sure that Intel and AMD can deliver better peak performance than Arm-based systems, and Grace-Hopper workloads are somewhat specialized, (a) most workloads don’t need high-end performance and (b) AI is getting into everything now. It is becoming increasingly difficult to make a case that, for cost and sustainability over a complete datacenter, Arm-based systems shouldn’t play a much bigger role, especially as expense budgets tighten.

For CSS N2, Arm estimates from its own analysis that up to 80 engineering years of effort are required to develop this level of integration, a number that existing customers confirm is in the right ballpark. In an engineer-constrained environment, this is 80 engineering years customers can drop from their program cost and schedule without compromising whatever secret differentiation they want to add around the compute core.

These look like very logical next steps for Arm in their Neoverse product line: deliver faster performance in the V-series, and let customers take advantage of Arm’s own experience and expertise in building N2-based compute systems while leaving lots of room for adding their own special sauce. You can read the press release HERE.
