Protecting memory with ECC but leaving the rest of an SoC uncovered is like having a guard dog chained up in the back corner of your yard. If the problem happens to be in that particular spot, it’ll be dealt with; otherwise, there will be a lot of barking but little actual protection.
Similarly, adding a safety-capable processor like an ARM Cortex-R or a Synopsys DesignWare EM SEP core in a dual-core lockstep configuration is only part of the answer for protecting SoCs.
In ISO 26262, achieving better than ASIL B against single-point faults is only possible with unit duplication, but the core is only one type of unit to be considered.
Without end-to-end protection of the cores, memory, peripherals, and interconnect, faults can and will put the entire system at risk. Unit duplication of the cores is a given, but how can the rest of the SoC be protected without just cloning and managing duplicated paths? The answer is implementing a network-on-chip (NoC).
Arteris recently introduced their FlexNoC Resilience Package IP, an add-on configuration for FlexNoC implementing end-to-end support for safety-critical SoC design. It provides out-of-the-box support for ARM Cortex-R5 and Cortex-R7 processor port checking. Generation and termination of all CPU command redundancy is provided at the CPU AXI interface, supporting data ECC that can be carried through transport.
Packet transport protection is the highlight. With this resilience package, FlexNoC can pass ECC through the NoC using its socket interfaces. For more advanced checking, FlexNoC can generate custom data payload and control ECC in packet-generating units, and detect or correct errors in packet-consuming units. The amount of redundancy is configurable, so the user has control over the size and performance of the implementation to best meet requirements.
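To make the generate-at-the-source, check-at-the-sink idea concrete, here is a minimal Python sketch using a textbook Hamming(7,4) code. The code choice, the function names, and the 4-bit payload are illustrative assumptions; nothing here says which ECC scheme FlexNoC actually uses, and a real payload would be far wider.

```python
from typing import Tuple

def hamming74_encode(nibble: int) -> int:
    """Packet-generating side: wrap 4 data bits in a 7-bit codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    # Codeword positions 1..7 hold: p1 p2 d0 p4 d1 d2 d3
    p1 = d[0] ^ d[1] ^ d[3]   # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]   # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]   # covers positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))

def hamming74_decode(word: int) -> Tuple[int, bool]:
    """Packet-consuming side: return (data, corrected_flag)."""
    bits = [(word >> i) & 1 for i in range(7)]  # bits[i] is position i + 1
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    corrected = syndrome != 0
    if corrected:
        bits[syndrome - 1] ^= 1  # a single-bit error sits at position `syndrome`
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, corrected
```

Flipping any one bit of an encoded word in transit still decodes to the original nibble, with the flag set so the event can be logged. Wider codes trade more check bits for stronger coverage, which is the kind of size-versus-protection choice the configurable redundancy refers to.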
Packet checking is also implemented, beyond just ECC protection for data. Framing signal checks can determine if words are deleted or duplicated. Word count checks make sure the actual packet length is consistent with the transaction command being performed. Path checks can sense bad routing, ensuring packets reach their intended destinations. Timeout checks catch stuck arbiters and overdue transactions.
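The four checks can be pictured with a small software model. This is a sketch only: the `Packet` and `Word` layouts, the head/tail framing flags, and the error labels are hypothetical names chosen for illustration, not the FlexNoC packet format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Word:
    data: int
    head: bool = False  # framing flag: first word of the packet
    tail: bool = False  # framing flag: last word of the packet

@dataclass
class Packet:
    dest_id: int         # destination named in the header
    expected_words: int  # length implied by the transaction command
    words: List[Word]

def check_packet(pkt: Packet, local_id: int) -> List[str]:
    """Run framing, word-count, and path checks at a receiving unit."""
    errors = []
    heads = [i for i, w in enumerate(pkt.words) if w.head]
    tails = [i for i, w in enumerate(pkt.words) if w.tail]
    # Framing: exactly one head, at the start, and one tail, at the end.
    if heads != [0] or tails != [len(pkt.words) - 1]:
        errors.append("framing")
    # Word count: actual length must match what the command implies.
    if len(pkt.words) != pkt.expected_words:
        errors.append("word_count")
    # Path: the packet must have arrived where its header says it should.
    if pkt.dest_id != local_id:
        errors.append("path")
    return errors

def timed_out(issue_cycle: int, now_cycle: int, limit: int) -> bool:
    """Timeout: flag a transaction still outstanding after `limit` cycles."""
    return now_cycle - issue_cycle > limit
```

A packet that loses a middle word still has clean framing but fails the word-count check, while a duplicated tail word trips both. That overlap is why the checks are layered rather than relying on any single one.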
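Unit duplication is another feature drawing a lot of attention. It goes beyond just creating multiple paths and gets at how checkers are implemented. Duplicate logic is delayed by one cycle and uses a separate clock input, which guards against faults due to power glitches and clock branch defects. Because the two copies are working on different cycles at any given instant, a disturbance that hits both at once does not corrupt them identically, so the comparison still catches it.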
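A toy simulation shows why the one-cycle delay matters. The sketch below assumes a hypothetical `unit` function standing in for the duplicated logic and models a common-cause glitch as a bit flip applied to both copies' outputs in the same cycle. It covers only the time offset, not the separate clock input.

```python
from typing import Callable, Collection, List, Sequence

def lockstep_mismatches(unit: Callable[[int], int],
                        inputs: Sequence[int],
                        glitch_cycles: Collection[int],
                        delay: int) -> List[int]:
    """Return the cycles at which the checker sees the two copies disagree.

    delay=0 models plain lockstep; delay=1 models a duplicate running
    one cycle behind the primary.
    """
    n = len(inputs)
    primary = {}      # primary output recorded per input index
    mismatches = []
    for cycle in range(n + delay):
        glitch = 1 if cycle in glitch_cycles else 0  # same upset hits both copies
        if cycle < n:
            primary[cycle] = unit(inputs[cycle]) ^ glitch
        src = cycle - delay  # the input the duplicate is working on now
        if 0 <= src < n:
            duplicate = unit(inputs[src]) ^ glitch
            if duplicate != primary[src]:
                mismatches.append(cycle)
    return mismatches
```

With `delay=0`, a glitch corrupts both outputs for the same input identically and the comparator sees nothing. With `delay=1`, the same glitch lands on two different computations, so the stored primary result and the delayed duplicate result no longer agree.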
Adding checker logic is worthless unless the checkers themselves are fully protected from faults. Checkers are dedicated to duplicated functional units and have a separate clock gater, and are also deeply integrated with built-in self-test (BIST). The core can initiate a BIST sequence, providing a simple test of comparison logic or a more detailed assessment of mission faults.
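The "simple test of comparison logic" can be pictured as feeding the comparator known-equal and known-different patterns and confirming it reacts correctly. The `Comparator` class, its `stuck_ok` fault model, and the test patterns below are assumptions made for illustration, not a description of the Arteris BIST sequence.

```python
class Comparator:
    """Toy checker; stuck_ok=True models a broken checker that never alarms."""

    def __init__(self, stuck_ok: bool = False) -> None:
        self.stuck_ok = stuck_ok

    def mismatch(self, a: int, b: int) -> bool:
        return False if self.stuck_ok else a != b

def comparator_bist(cmp: Comparator, width: int = 8) -> bool:
    """Return True only if the comparator passes every self-test pattern."""
    for pattern in (0, (1 << width) - 1):  # all-zeros and all-ones
        # Equal inputs must not raise an alarm.
        if cmp.mismatch(pattern, pattern):
            return False
        # A difference in any single bit must raise one.
        for bit in range(width):
            if not cmp.mismatch(pattern, pattern ^ (1 << bit)):
                return False
    return True
```

A healthy comparator passes, while one whose alarm output is stuck inactive fails on the first injected difference. That second case is the dangerous one in the field, because such a checker would otherwise stay silent through a real fault.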
The most sophisticated part of the FlexNoC Resilience Package is the design partitioning capability. Rather than an all-or-nothing approach to protection, the NoC under resilience can firewall sections of a chip, keeping safety-critical functions segregated from areas of the device that carry no safety goals. This is a super feature to reduce system cost and certification cycles, and enable more IP reuse.
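One way to think about the firewall is as an address-map policy: each region is tagged as safety-critical or not, and a request is admitted only if its initiator is entitled to touch that region. The `Region` record and the single `initiator_is_safe` flag below are hypothetical simplifications, not the FlexNoC configuration model.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Region:
    base: int
    size: int
    safety_critical: bool

def allowed(regions: Sequence[Region], addr: int, initiator_is_safe: bool) -> bool:
    """Admit a request only if the initiator may access the target region."""
    for r in regions:
        if r.base <= addr < r.base + r.size:
            # Safety-critical regions accept only safety-qualified initiators.
            return initiator_is_safe or not r.safety_critical
    return False  # assumption: unmapped addresses are rejected outright
```

Under this policy, a non-safety block such as a multimedia engine can share the interconnect without being able to write into the braking controller's registers, which is what lets the safety-critical partition be assessed separately.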
A short white paper on the Arteris site describes more on the implementation:
SoC Reliability Features in the FlexNoC Resilience Package
Arteris has targeted automotive applications first, fitting ISO 26262 needs, but the FlexNoC Resilience Package has broader potential in medical, industrial, and possibly even defense designs. Using an off-the-shelf configurable NoC with complete end-to-end protection is a much faster design path than trying to implement all these features in RTL. There are also strong benefits to software when using a NoC approach: functional units are abstracted in an addressable communication framework, making software simpler.
Getting to truly safe SoCs requires end-to-end protection, beyond duplicate cores and ECC on memory. The FlexNoC Resilience Package is a significant breakthrough based on listening to customer requirements in safety-critical applications.