I was at Arm TechCon as usual this year and one of the first panels I covered was close to the kickoff, hosted by Andrew Hopkins (Dir System Technology at Arm), Kurt Shuler (VP marketing at Arteris IP) and Jens Benndorf (Managing Dir and COO at Dream Chip Technologies). The topic was implementing ISO 26262-compliant AI SoCs with Arm and Arteris IP, highly relevant since more and more of this class of SoC are appearing in cars. One thing that really stood out for me was the value of platform-based design in this area, something you might think would be old news for SoC design but which introduces some new considerations when safety becomes important.
A key aspect of platform-based design is being able to combine IP from multiple sources with differing levels of compliance to certain expectations, notably safety in this case. This is most noticeable when you want to design part of the architecture to ASIL D (safety-critical) while having enough safety diagnostic coverage to achieve ASIL B or C in other parts of the design. Designing an IP to ASIL D entails a lot of overhead, which may not be justified for safety-nominal (ASIL A/B) or even safety-indifferent (QM) components that you may want to use in your design.
How then can you get your SoC to higher ASIL compliance? The answer lies in being able to ensure that safety-nominal systems cannot corrupt safety-critical functions and can be tested or taken offline if they malfunction. This is all significantly mediated by the network between the IPs, as indicated in the figure above. Among other functions this requires health monitoring for all components and of course reporting faults to a safety controller which can channel problems upstream to decision-making functions (are we in trouble, should the driver grab the wheel, pay special attention, pull the car over to the side of the road?).
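To make the fault-reporting flow above a little more concrete, here is a minimal Python sketch of a safety controller that isolates any component reporting a fault and escalates only when the component is safety-critical. Everything in it is an assumption for illustration: the `SafetyController` class, the component names, the ASIL levels assigned to them and the "escalate at ASIL C or above" policy are hypothetical and were not described in the panel.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Asil(IntEnum):
    """Integrity levels ordered from least to most stringent."""
    QM = 0
    A = 1
    B = 2
    C = 3
    D = 4


@dataclass
class SafetyController:
    # Hypothetical model: component name -> ASIL level it was designed to.
    components: dict
    isolated: set = field(default_factory=set)
    escalations: list = field(default_factory=list)

    def report_fault(self, name: str, kind: str) -> str:
        """Handle one fault report and return the action taken."""
        level = self.components[name]
        # Every faulty component is first cut off from the interconnect
        # so that it cannot corrupt the rest of the system.
        self.isolated.add(name)
        if level >= Asil.C:
            # Assumed policy: faults in safety-critical blocks go upstream
            # to the decision-making functions.
            self.escalations.append((name, kind))
            return "isolate+escalate"
        return "isolate"


# Hypothetical components and levels, for illustration only.
ctrl = SafetyController(
    {"display_gpu": Asil.QM, "isp": Asil.B, "vision_cpu": Asil.D}
)
print(ctrl.report_fault("display_gpu", "timeout"))  # isolate
print(ctrl.report_fault("vision_cpu", "ecc"))       # isolate+escalate
```

In a real SoC this decision logic would live in hardware and safety-island firmware; the sketch only shows the shape of the policy, where containment always happens and escalation depends on how critical the failing block is.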
Monitoring functions (all provided through or together with the interconnect) include time-out checks for data requests, IP isolation through powering down the appropriate NoC socket connection to run live LBIST checks (at suitable times), and finally end-to-end ECC error detection/correction.
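As a rough illustration of what ECC error detection and correction means, the sketch below implements the textbook Hamming(7,4) code in Python: four data bits are protected by three parity bits, and any single flipped bit can be located and corrected. This is a stand-in chosen for brevity. The panel did not say which code the interconnect uses, and the function names `encode` and `decode` are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(word):
    """Return (data_bits, error_position); position 0 means no error seen."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    pos = s1 + 2 * s2 + 4 * s3  # 1-based position of the flipped bit
    if pos:
        w[pos - 1] ^= 1  # correct the single-bit error in place
    return [w[2], w[4], w[5], w[6]], pos


data = [1, 0, 1, 1]
sent = encode(data)
received = sent.copy()
received[4] ^= 1  # simulate a single bit flipped in transit (position 5)
recovered, where = decode(received)
print(recovered == data, where)  # True 5
```

"End-to-end" in the text means the check bits travel with the data from source to destination, so a fault anywhere along the path is caught at the receiving end. Note that this small code corrects single-bit errors only; a double-bit error would be miscorrected, which is why practical schemes typically add an extra parity bit to detect that case.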
This ability to monitor, check and isolate faulty IP provides the means to ensure ASIL B, C or D compliance at the system level, but it also depends on a “cannot-fail” subsystem called a safety island. This is a special function designed fully to ASIL D requirements, with lockstep CPUs, independent memories, run-time testable cache and many more mechanisms to ensure independence from the rest of the system. The safety island continuously monitors for faults and will report (presumably at programmed levels of concern) to higher-level decision-making functions in the car.
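The idea behind lockstep CPUs can be sketched in a few lines of Python: two redundant copies execute the same computation on the same inputs, and a checker flags a fault as soon as their states diverge. This is a software model only, under stated assumptions: `run_lockstep`, `accumulate` and the `flip_bit_at` fault-injection hook are hypothetical names, and real lockstep comparison is done in hardware on the cores' outputs.

```python
def run_lockstep(step, initial, inputs, flip_bit_at=None):
    """Run `step` on two redundant copies and compare them every cycle."""
    a = b = initial
    for cycle, x in enumerate(inputs):
        a = step(a, x)
        b = step(b, x)
        if cycle == flip_bit_at:
            b ^= 1  # injected fault: flip the lowest bit of copy B's state
        if a != b:
            # The copies disagree, so at least one of them is faulty.
            return ("fault", cycle)
    return ("ok", a)


def accumulate(state, x):
    """Toy workload: a 32-bit running sum."""
    return (state + x) & 0xFFFFFFFF


print(run_lockstep(accumulate, 0, [1, 2, 3]))                 # ('ok', 6)
print(run_lockstep(accumulate, 0, [1, 2, 3], flip_bit_at=1))  # ('fault', 1)
```

The comparison detects that something went wrong but not which copy is at fault, so the usual response is to report the mismatch and move to a safe state rather than to try to continue.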
Closing the loop, Jens talked about a reference platform design they have built at Dream Chip using these capabilities and how that has been spun into several production derivatives. The reference design is based on a quad-core A53, an ISP and vision processor, peripherals and memory interfaces, all connected through an Arteris IP Resilient NoC, together with the safety island. According to Kurt, they have a cool demo of this functioning in an autonomous car.
Derivatives modify this platform with different numbers of CPUs in the cluster and different IP subsystems for the vision processor (GPU, NPU or a simpler processor), targeting active mirror replacement, front-camera and radar applications. In a pre-safety platform, spinning these derivatives would be no big deal. For systems requiring a higher ASIL (B, C, or D), it is a big deal, and what makes it possible is three things: this safety modularity around functions, the ability to monitor, isolate and ECC check through the interconnect, and a carefully isolated safety island. All of these guarantee higher ASIL operation no matter what else in the SoC may go wrong.
You can learn more about this design by downloading the Arm TechCon presentation HERE.