I don’t know about you, but when I think of mission-critical applications, I immediately think of space exploration or military operations. But in today’s world, mission-critical applications are all around us. Think about the cloud and how data is managed, analyzed, and shared to execute any number of tasks that have safety and security implications. Or think about home IoT-based applications, where security systems and smoke alarms must operate reliably and send alerts when something goes awry. What about your self-driving car? One failure could cause serious damage or a fatality. If you look, you’ll find that mission-critical applications exist in every aspect of our lives, from travel to medical to energy to manufacturing to connectivity.
SoCs are at the heart of these mission-critical applications, so how do we ensure that these SoCs don’t fail in the field? How do we make sure that these designs are resilient against random hardware failures? Systematic failures are often detected and fixed during IC development and verification, but random failures in the field are unexpected and difficult to plan for, and they can have serious consequences. Devices need to be not only reliable, functioning as expected, but also resilient against the random failures that can occur. They need to be able to either recover from these events or mitigate them.
Devices in the field also need to be built to last. Aging effects can be factored into the reliability of the design during the development phase using models, DFM, test, and simulation. However, random failures must be accounted for during the design phase. Designing in safety mechanisms or safety measures (SMs) is key to ensuring that mission-critical designs are not affected by random failures such as single event upsets (SEUs) during the lifespan of the device.
Adding SMs, which are generally in the form of redundancy, into a design to protect against SEUs is not a new concept; it has been around for decades. However, this effort has largely been manual. Manually inserting SMs is painstaking and error-prone, because physical placement constraints and routing considerations must be accounted for to ensure that the SMs don’t have adverse cascading effects on elements such as reset, power, or clock network signals.
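To make the idea of redundancy concrete, here is a minimal software sketch of one well-known redundancy scheme, triple modular redundancy (TMR), in which a value is stored three times and a majority voter masks a single upset. The text above does not say which redundancy scheme is used, so TMR is an assumption chosen for illustration; the function names and the 8-bit example value are hypothetical, and real SMs are implemented in hardware rather than in Python.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant copies of a value."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of a stored value."""
    return value ^ (1 << bit)


# Hypothetical 8-bit register contents, stored in three redundant copies.
stored = 0b1011_0010
copies = [stored, stored, stored]

# Simulate an SEU that corrupts bit 4 of the second copy only.
copies[1] = flip_bit(copies[1], 4)

# The voter masks the single corrupted copy and returns the original value.
voted = majority_vote(*copies)
```

The voter only masks an upset that hits one copy. If the same event corrupts the same bit in two copies, the vote is wrong, which is one way to see why the physical placement distance between redundant elements matters.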
Synopsys synthesis and implementation tools provide a fully automated approach to inserting SMs, making mission-critical designs much more resilient. Synthesis automatically inserts the elements, while the place-and-route (P&R) tools take care of the physical implementation challenges, such as placement distance and routing independence of signal nets. We have drafted a white paper that describes the process of adding these SMs, then analyzing and verifying that they meet requirements from RTL to GDSII. Download the white paper “An Automated Method for Adding Resiliency to Mission-Critical SoC Designs” to learn more.
An Automated Method for Adding Resiliency to Mission-Critical SoC Designs
Adding safety measures to system-on-chip (SoC) designs in the form of radiation-hardened elements or redundancy is essential to making mission-critical applications in the Aerospace and Defense (A&D), cloud, automotive, robotics, medical, and Internet-of-Things (IoT) industries more resilient against random hardware failures. Designing for reliable and resilient functionality affects semiconductor development, where these safety measures have generally been inserted manually by SoC designers. Manual approaches can introduce errors that are difficult to account for. Synopsys has created a fully automated implementation flow to insert various types of safety mechanisms, which can result in more reliable and resilient mission-critical SoC designs.
This paper discusses the process of implementing safety mechanisms/measures (SMs) in designs to make them more resilient, and of analyzing the effectiveness of those SMs from design inception to the final product.