Years before ISO 26262 (the automotive functional safety standard) existed, a few electronics engineers had to worry about radiation hardening, but not for cars. Their concerns were the same ones we have today: radiation-induced single event effects (SEE) and single event upsets (SEU). SEEs are the root-cause effects: some form of radiation, whether cosmic or generated on earth, smacks into a chip die and causes an ionization cascade. That may lead to an SEU, where a bit in the logic is flipped. SEEs can also trigger latchup, gate rupture and other damage. But most efforts on rad hardening today, that I know of, focus on SEUs.
Two factors amplify the importance of SEUs: radiation flux intensity and the sensitivity of the circuit. Radiation flux at ground level, mostly neutrons triggered by cosmic ray events in the upper atmosphere, wasn’t enough of an issue in most applications until we got to smaller fabrication geometries, where a bit can be flipped by a single ionization event.
Obviously, above the atmosphere and higher within it, cosmic ray energy and flux are less moderated by traveling through miles of air, which means that satellites and aircraft need a higher level of hardening. On the ground too, some applications such as the ITER fusion reactor in France need to use specially hardened FPGAs. The same applies to instrumentation around nuclear reactors.
Mentor recently released a white paper, “RETHINKING YOUR APPROACH TO RADIATION MITIGATION”, describing a general methodology for handling this need, directed particularly at the FPGA designs so common in these aerospace and nuclear applications. Interestingly, this paper doesn’t push any tools or even classes of tools. It’s one of those happiest finds among vendor white papers: a commercial-free information resource!
The paper starts with a common FPGA development flow for high radiation environments. This should look familiar to ISO 26262 aficionados, with a parallel flow for FMEDA, fault analysis, fault protection and fault verification. I’m thinking we may already be used to a decent level of automation in this flow in the automotive domain. There seems to be less of this in aerospace and nuclear, or perhaps less for FPGA design in general; maybe because FPGA design methodologies often follow behind those for mainstream SoC design?
Whatever the reason, it looks like designers in these domains depend mostly on expert-driven and largely manual fault analysis. The theme of the paper is to argue the benefits of moving towards a more automated, exhaustive (to some level) and scalable approach which will work not only with in-house designed logic but also with embedded 3rd-party IP.
The paper walks in some detail through the challenges in conventional approaches to fault analysis, through metrics for fault coverage and FIT, and the structural analysis that must be performed to assess these metrics, from low-level logic up to a full design. It also talks about common fault mitigation approaches: parity, CRC, ECC, TMR, duplication and lockstep checking; you know the list.
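For readers less familiar with that list, here is a minimal Python sketch of two of the mitigations named above, parity and TMR, reacting to a simulated SEU. This is my own toy illustration, not something taken from the paper; the helper names (`flip_bit`, `parity_bit`, `tmr_vote`) and the 8-bit example value are assumptions made purely for the demonstration.

```python
def flip_bit(word: int, bit: int) -> int:
    """Model an SEU by inverting one bit of a stored word."""
    return word ^ (1 << bit)


def parity_bit(word: int) -> int:
    """Parity of the word: 1 if it holds an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote across three redundant copies (TMR)."""
    return (a & b) | (a & c) | (b & c)


stored = 0b10110010              # hypothetical register contents
stored_parity = parity_bit(stored)

# A single upset changes the parity, so a parity check detects it
# (but it cannot say which bit flipped, so it cannot correct it).
single_upset = flip_bit(stored, 5)
single_detected = parity_bit(single_upset) != stored_parity

# Two upsets in the same word cancel out in the parity bit and go unnoticed.
double_upset = flip_bit(single_upset, 1)
double_detected = parity_bit(double_upset) != stored_parity

# TMR keeps three copies; the vote masks any corruption confined to one copy.
voted = tmr_vote(stored, double_upset, stored)

print(single_detected, double_detected, voted == stored)
```

The trade-off this is meant to show: parity is cheap but only detects an odd number of flipped bits, while TMR corrects errors in any one copy at roughly three times the storage plus the voter logic.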
The next topic is fault protection, with a nod to fail-operational behavior (also becoming more common in ISO 26262). The main emphasis here is on the error-prone nature of manually inserting mitigation techniques and the challenge of re-verifying that those changes did not break mission-mode functionality. This implies a need for more automated equivalence checking.
The final section is on fault verification and the challenges in intelligently faulting a sufficient set of nodes to ensure a high level of coverage while keeping that set to a manageable level (since fault simulation is going to burn a lot of compute cycles).
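To make that trade-off concrete, here is a toy Python sketch comparing an exhaustive fault campaign with a randomly sampled one. Everything in it is an assumption of mine rather than the paper's method: the "design" is a single 8-bit parity-protected word, the fault list is all single and double bit flips, and the function names are hypothetical. Real fault simulation works on a netlist and is far more expensive per fault, which is exactly why the sampling question matters.

```python
import itertools
import random

WIDTH = 8  # assumed word width for this toy model


def parity_bit(word: int) -> int:
    """1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


def is_detected(golden: int, fault_bits: tuple) -> bool:
    """Inject the fault (flip every listed bit) and see if parity catches it."""
    faulty = golden
    for bit in fault_bits:
        faulty ^= 1 << bit
    return parity_bit(faulty) != parity_bit(golden)


def build_fault_list() -> list:
    """All single-bit and double-bit upsets in one word."""
    singles = [(bit,) for bit in range(WIDTH)]
    doubles = list(itertools.combinations(range(WIDTH), 2))
    return singles + doubles


def coverage(golden: int, faults: list) -> float:
    """Fraction of the injected faults that the parity check detects."""
    detected = sum(is_detected(golden, fault) for fault in faults)
    return detected / len(faults)


def sampled_coverage(golden: int, faults: list, sample_size: int, seed: int = 0) -> float:
    """Estimate coverage from a random subset of faults (without replacement)."""
    rng = random.Random(seed)
    return coverage(golden, rng.sample(faults, sample_size))


faults = build_fault_list()
exhaustive = coverage(0xB2, faults)           # every fault simulated
estimate = sampled_coverage(0xB2, faults, 12)  # a third of the effort

print(f"{len(faults)} faults, exhaustive {exhaustive:.3f}, sampled {estimate:.3f}")
```

The sampled number is only an estimate and will wander with the seed and sample size. Choosing which nodes to fault so that the estimate stays trustworthy is the "intelligently" part of the problem.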
An interesting insight into the needs of the aerospace and nuclear electronics design communities, who should definitely find it a good backgrounder. You can read the paper HERE.