Moderating Our Open Chiplet Enthusiasm. A NoC Perspective

by Bernard Murphy on 02-14-2024 at 6:00 am

I recently talked with Frank Schirrmeister (Solutions & Business Development, Arteris) about progress toward the open chiplet ideal. You know – where a multi-die system in package can be assembled with UCIe (or other) connections seamlessly connecting data flows between dies. If artificial general intelligence and industrial-scale quantum computing are right around the corner, surely any remaining issues in open chiplet design should be a snap to resolve? According to Frank, the answer is yes and no. For a couple of privileged groups, anything is possible and is being put into practice today. For larger open markets, not so much, at least not in the near term.

Figure: Moderating Open Chiplet Enthusiasm (courtesy of Arteris)

Multi-die systems and proprietary solutions

Multi-die systems address the never-ending demand to build bigger and more complex systems (for LLM processing as one example) when constrained by a number of semiconductor limitations: you can only fit so much logic on one die; some functions like analog and DRAM work best in processes which are not optimal for logic; and even if you could somehow fit more onto a single die, yield would plummet and costs would soar.
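To make the yield argument concrete, here is a minimal sketch using the textbook Poisson yield model, Y = exp(-A * D0). The defect density, the total area, and the function names are illustrative assumptions, not figures from Frank or Arteris, and the model ignores packaging cost, assembly yield, and the area spent on die-to-die interfaces.

```python
import math


def poisson_yield(area_cm2: float, d0_per_cm2: float) -> float:
    """Expected fraction of defect-free dies under the Poisson model Y = exp(-A * D0)."""
    return math.exp(-area_cm2 * d0_per_cm2)


def silicon_per_good_system(total_area_cm2: float, n_dies: int, d0_per_cm2: float) -> float:
    """Silicon area fabricated per working system when the logic is split into
    n_dies equal dies and only known-good dies are assembled.

    Ignores packaging cost, assembly yield, and die-to-die interface area.
    """
    die_area = total_area_cm2 / n_dies
    return total_area_cm2 / poisson_yield(die_area, d0_per_cm2)


if __name__ == "__main__":
    D0 = 0.1    # assumed defects per cm^2, illustrative only
    AREA = 8.0  # assumed total logic area in cm^2, illustrative only
    for n in (1, 2, 4):
        print(n,
              round(poisson_yield(AREA / n, D0), 3),
              round(silicon_per_good_system(AREA, n, D0), 2))
```

With these assumed numbers, a single 8 cm² die yields about 45%, while each 2 cm² quarter yields about 82%, so far less silicon is scrapped per working system. Real cost models are more involved, but the direction of the effect is the point.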

Within the last year or so, Intel, AMD, and Nvidia all released processor products based on chiplet architectures. What is unique to these products in this context is that each company built all its own chiplets, together with the infrastructure and connectivity assembling them into a full multi-die system. They have no dependency on external chiplet providers or external chiplet-to-chiplet communication IP providers. By controlling everything internally, and guiding their suppliers accordingly, they can tune and validate the systems they built in-house against their own extensive suites of tests. Some other very large vertically integrated companies may also fall in this class. I am told that Meta may now be one of these, and I would be surprised if Apple were not also handling all their own multi-die design.

For anyone else wanting to build a multi-die system, this is all interesting but still amounts to a proof of concept. It works very well for Intel, AMD, and Nvidia, but more is needed for systems builders who don’t have that level of control. While UCIe (among other options) should, in principle, take care of die-to-die communication, reality suggests the challenge is not yet conquered.

By the way, there is also a parallel trend: Printed Circuit Boards (PCBs) are getting smaller. Here the industry has seen many different types of packaging approaches, and users are used to integrating multiple dies on substrates for designs that don’t challenge the die-size (reticle) limit mentioned above. Both trends converge on chiplets, albeit with different design methodology approaches – miniaturized PCBs vs. co-designed or interoperable bare pieces of silicon.

Open chiplets and interoperable communications interfaces

In theory, using standards like UCIe for inter-die communication should resolve communication problems between dies, which is essential to enabling a true open chiplet ecosystem. If this works as advertised, then chiplets should be able to communicate even if they come from different chiplet vendors, are built in different foundries, etc. Unfortunately, compliance with the standard is proving a necessary but insufficient condition to ensure interoperability between two sides of (say) a UCIe link. While the PHYs can be checked via eye-diagrams, there is still variability in ways to pack data from protocols like AXI and CHI to streaming interfaces like CXS and from there to FDI, UCIe’s streaming interface.
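A toy illustration of why packing variability matters: the sketch below invents two hypothetical vendors that both fit a write transaction into a 64-bit word but order the fields differently. These layouts are made up for illustration and are not the actual AXI, CHI, CXS, or FDI formats. Each side round-trips correctly against itself, yet the mixed pair silently swaps address and data.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Write:
    addr: int  # 32-bit address (toy field, not a real protocol field)
    data: int  # 32-bit payload (toy field, not a real protocol field)


# Hypothetical vendor A: address in the upper half, data in the lower half.
def pack_vendor_a(w: Write) -> int:
    return ((w.addr & MASK32) << 32) | (w.data & MASK32)


def unpack_vendor_a(word: int) -> Write:
    return Write(addr=(word >> 32) & MASK32, data=word & MASK32)


# Hypothetical vendor B: data in the upper half, address in the lower half.
def pack_vendor_b(w: Write) -> int:
    return ((w.data & MASK32) << 32) | (w.addr & MASK32)


def unpack_vendor_b(word: int) -> Write:
    return Write(addr=word & MASK32, data=(word >> 32) & MASK32)


if __name__ == "__main__":
    w = Write(addr=0x1000, data=0xDEADBEEF)
    print(unpack_vendor_a(pack_vendor_a(w)) == w)  # symmetric pair A-A
    print(unpack_vendor_b(pack_vendor_b(w)) == w)  # symmetric pair B-B
    print(unpack_vendor_b(pack_vendor_a(w)) == w)  # asymmetric pair A-B
```

Both packers are internally consistent, so each would pass its own vendor's tests. Only a test of the actual pair exposes the mismatch, which is the role plugfests and interoperability testing play.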

This is not a revelation. In the PC world, wired and wireless communications and other domains, standards compliance is step 1. Plugfests to prove real-world interoperability between vendors are the next step. For cellular communications, network operators require detailed interoperability testing against their requirements. It seems a similar infrastructure is needed for chiplet communications, although that may be a little more challenging because you can’t plug a connector into a chiplet. Frank tells me he hears plans are in the works but are not expected to become mainstream any time soon (it took PCIe a while too). The industry has announced early cases of just UCIe interoperability, between Intel and Synopsys for instance.

One class of systems builders has a simple answer to this problem. They are powerful enough to force their suppliers into converging compliance on their design. If something isn’t working in their use cases, the potentially guilty parties dig down and must come up with a resolution. Some big automotive OEMs are in this class, also some big HPC enterprises. Problems found here are likely to be small differences in expectations for margins, buffering, and other parameters not fully nailed down by the standard. Or just bugs not covered in chiplet/IP vendor use-case testing. Whatever the problem, the suppliers must sort it out. It’s good to be king when you want to build a chiplet-based design.

For everyone else

Getting to interoperability today depends on where each of your inter-die connections falls in the big and constantly evolving matrix of proven/covered communication pairs, considering IP/PHY sources, specification differences, and use-case differences (coherent versus non-coherent links). Symmetric pairs (everything the same on both sides) should (?) be fine, but asymmetric pairs are a gamble unless proven in production. According to Frank, this challenge is especially visible from the NoC world. He says customers ask if the Arteris NoC works with a particular UCIe Controller IP. A reasonable question, you would think.

But the NoC talks to a protocol-to-stream converter, which then talks to a PHY. That communicates through a link to a PHY on the second chiplet, then to a stream-to-protocol converter, then to the NoC on that chiplet. Everyone is fully compliant with the standard, but still the link may not work – unless it has been proven to work in production. Much tighter interoperability testing will eventually solve this problem, but that may be five years out. In the meantime, Arteris and customers are filling in cells in the interoperability matrix one (or maybe a few) at a time.
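One way to picture the interoperability matrix is as a lookup keyed by the full stack on each side of the link plus the use case. The sketch below is a hypothetical data structure, not an Arteris tool; the class names and the component labels (`noc_x`, `conv_a`, `phy_a`, and so on) are invented for illustration.

```python
from enum import Enum
from typing import NamedTuple


class LinkEnd(NamedTuple):
    """One side of a die-to-die link: NoC, protocol/stream converter, and PHY."""
    noc: str
    converter: str
    phy: str


class Status(Enum):
    PROVEN = "proven"
    UNPROVEN = "compliant but unproven"


class InteropMatrix:
    """Records which (end, end, use case) cells have been proven to work."""

    def __init__(self) -> None:
        self._proven: set = set()

    @staticmethod
    def _key(a: LinkEnd, b: LinkEnd, coherent: bool):
        # A link is unordered, so (a, b) and (b, a) map to the same cell.
        return (frozenset((a, b)), coherent)

    def mark_proven(self, a: LinkEnd, b: LinkEnd, coherent: bool) -> None:
        self._proven.add(self._key(a, b, coherent))

    def status(self, a: LinkEnd, b: LinkEnd, coherent: bool) -> Status:
        if self._key(a, b, coherent) in self._proven:
            return Status.PROVEN
        # Standards compliance alone does not move a cell out of this state.
        return Status.UNPROVEN


if __name__ == "__main__":
    left = LinkEnd("noc_x", "conv_a", "phy_a")   # hypothetical stack
    right = LinkEnd("noc_x", "conv_b", "phy_b")  # hypothetical stack
    matrix = InteropMatrix()
    matrix.mark_proven(left, right, coherent=False)
    print(matrix.status(right, left, coherent=False))
    print(matrix.status(left, right, coherent=True))
```

Keying on the whole stack rather than on the NoC alone reflects the point above: the question "does this NoC work with that controller?" only has an answer per cell, and a cell proven for non-coherent traffic says nothing about the coherent case.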

Bottom line, chiplets are real, totally under control for the vertically integrated system builder, evolving rapidly under autocratic customers, and inching forward for everyone else. You can read more HERE.
