Safety is a big deal these days, not only in automotive applications, but also in critical infrastructure and industrial applications (the power grid, nuclear reactors and spacecraft, to name just a few compelling examples). We generally understand that functional blocks like CPUs and GPUs have to be safe, but what about the interconnect? To some among us, interconnect just means wires; what’s the big deal, outside of manufacturing defects and electromigration? No joke – Kurt Shuler (VP Marketing at Arteris IP) tells me he still struggles with this perception among some people he talks to (maybe physical design teams and small block designers?).
If you’re familiar with SoC architecture, you know why interconnect is just as important as endpoint IP in functional verification, safety, security and other domains. Just as the Internet is only possible because connections are virtualized by routing through networks of traffic managers, so in modern SoCs, at least at the top-level, traffic is managed through one or more networks-on-chip (NoCs, though the implementation is quite different than the Internet). That means there’s a lot of logic in this interconnect IP, managing interfaces between endpoint IPs (such as ARM cores, hardware accelerators, memory controllers and peripherals) and managing those routing networks. Moreover, since the interconnect mediates safety-critical operations between these IPs, it is inextricably linked with system safety assurance.
Completing failure mode effects analysis (FMEA) and failure mode effects and diagnostic analysis (FMEDA) is particularly difficult for interconnect IP, first, because they are safety elements out of context(SEooC, as are all IP and generally chips), which therefore have to be validated against agreed assumptions of use. Secondly, interconnect IP are highly configurable, which makes reaching agreement on the assumptions of use, defining potential failure modes and effects, and validating safety mechanisms even more challenging than for other IP, as safety assurance must be determined on the configuration built by the integrator.
To Arteris IP, the right way to handle this complexity is to start with a qualitative, configurable FMEA analysis, which can then guide the creation of a configuration-specific FMEDA. Naturally, this requires hierarchy around modular components with safety mechanisms tied to those components, so that as you configure the IP, those safety mechanisms are automatically implemented in a manner that ensures safety for the function. For example, in the Arteris IP NoCs, the network interface unit (NIU) responsible for packetization can be duplicated. Then in operation, results from these units are continually compared, looking for variances which would signal a failure. You can also have ECC or parity checks in the transport. Following the FMEA/FMEDA philosophy, these safety mechanisms are designed to counter the various potential failure modes identified by the IP vendor. And because of the modularity, functional safety diagnostic coverage of the entire NoC can be calculated based on individual module coverage metrics. Kurt also stressed that it is important for the IP provider to work with functional safety experts (in Arteris IP’s case, ResilTech) to ensure maximum objectivity in this analysis, and naturally, to suggest opportunities to further enhance safety solutions.
Who guards the guardians? Arteris IP provides BIST functions to check their compare logic for safety mechanisms at reset and during run-time. All of this safety status rolls up into a safety controller (part of the Arteris IP deliverable), which can be monitored at the system level.
A critical part of safety analysis is deciding how to meaningfully distribute failure modes for verification. This is an area where the IP provider (with guidance from an independent safety expert) can generate a template to drive FMEDA estimation based on the specific configuration and estimated area of sub-components. Having a modular and hierarchal interconnect IP architecture is key to calculating these safety metrics required for FMEDA. In other words, the user of the IP should be able to expect a largely self-contained solution for this part of their safety validation task.
Another important advantage for an automatically generated IP when it comes to safety is that it becomes possible to automate the generation of the fault-injection campaign for verification and the rolling up of results to the FMEDA table. Rather than taking a shotgun approach to faulting, the campaign can be more precisely targeted in this manner:
- Define where a fault must be inserted to generate one of the failure modes
- Define traffic patterns that ensure faults don’t appear as safe faults
- Define which safety mechanism shall detect the fault (but ensure all possible safety mechanisms detect the failure mode)
- Define observation points as close as possible to the sub-part being tested
Again, the fault campaign is largely automated for the integrator.
Overall, this seems like a reasonable strategy, meeting requirements for functional safety and the expectation that an IP should still be a self-contained solution, even when configured as extensively as is possible with a NoC. You can learn more about the Arteris IP safety solutions HERE.