Analysis and Verification of Single Event Upset Mitigation

by Jacob Wiltgen on 12-07-2023 at 10:00 am

The evolution of space-based applications continues to drive innovation across government and private entities. The new demands for advanced capabilities and feature sets have a direct impact on the underlying hardware, driving companies to migrate to smaller geometries to deliver the required performance, area, and power benefits.

Simultaneously, the application space is evolving, and mission parameters for these new applications are causing companies to evaluate non-traditional approaches. Commercial high-reliability processes (for example, those developed for automotive designs) are being considered for aerospace because they both meet the survivability requirements of certain scenarios and reduce development timelines and cost.

Unfortunately, the advantages delivered by smaller geometries come at a cost. One drawback is that the underlying hardware is more susceptible to soft errors, commonly referred to as single event upsets (SEUs). Traditional approaches of redundancy or triplication on salient (if not all) functions within the chip are quickly becoming cost prohibitive.
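To make the cost of triplication concrete, here is a minimal Python sketch of triple modular redundancy applied to a single register: three copies are stored and a bitwise majority vote masks an upset in any one copy. The function names, the 8-bit value, and the struck bit are illustrative assumptions and are not taken from any Siemens flow.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote across three redundant copies."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model a single event upset by inverting one stored bit."""
    return value ^ (1 << bit)


# Three redundant copies of the same 8-bit register value (hypothetical).
stored = 0b1011_0010
copies = [stored, stored, stored]

# An SEU strikes bit 4 of the second copy only.
copies[1] = flip_bit(copies[1], 4)

voted = majority_vote(*copies)
print(f"corrupted copy: {copies[1]:08b}, voted output: {voted:08b}")
```

The vote recovers the original value, but only by storing every bit three times and adding voter logic, which is the overhead that makes hardening every function expensive.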

Fortunately, new flows and automation provide project teams insights into SEU mitigation and offer the ability to optimize the SEU mitigation architecture, also referred to as selective hardening.

Figure 1. Driving trends to selective radiation mitigation

First, let’s review the challenges.

Selective Hardening Challenges

Feedback from the aerospace industry suggests that the traditional approach to SEU mitigation has many pitfalls and leaves two important questions unanswered.

  1. For the design elements known to be mission critical, how effective is the implemented mitigation?
  2. How can I identify the potential for failure due to faults in unprotected design elements?

The traditional approach to SEU mitigation is best summarized in a three-step workflow.

  • Step 1: Identify failure points through expert-driven analysis
  • Step 2: Design engineers insert the mitigation (HW and/or SW)
  • Step 3: Verify the effectiveness of the mitigation
    • Simulation leveraging functional regressions and force commands to inject SEUs
    • Post-silicon functional testing under heavy ion exposure

Figure 2. The traditional approach to SEU mitigation

Unfortunately, the traditional approach has multiple drawbacks, including:

  • No common measurement (metric) which determines the effectiveness of SEU mitigation.
  • Expert-driven analysis is not repeatable or scalable as complexity rises.
  • Manually forcing faults in functional simulation requires substantial engineering effort.
  • An inability to analyze the complete fault state space using functional simulation and force statements.
  • Late-cycle identification of failures when testing in a beam environment, with limited debug visibility when they occur.

Automation and Workflows Supporting Selective Hardening

The overarching objective of selective hardening is to protect the design functions that are critical to the mission while saving cost (power and area) by leaving non-critical functions unprotected. Boiling that down a level, the methodology has three aims:

  1. Provide confidence early in the design cycle that the mitigation is optimal.
  2. Provide empirical evidence that what is left unprotected cannot result in abnormal behavior.
  3. Deliver a quantitative assessment detailing the effectiveness of the implemented mitigation.

Siemens has developed a methodology and integrated workflow to deliver a systematic approach in measuring the effectiveness of existing mitigation as well as determining the criticality of unprotected logic. The workflow is broken up into four phases.

Figure 3. The Siemens SEU mitigation workflow

Structural Partitioning: The first step in the flow leverages structural analysis engines to evaluate design functions in combination with the implemented hardware mitigation protecting the function. The output of structural partitioning is a report indicating the effectiveness of the existing hardware mitigation as well as insights into the gaps which exist.

Fault Injection Analysis: Mitigations that could not be verified structurally are candidates for fault injection. In this phase, SEUs are injected and propagated, and their impact is evaluated. The output of fault injection analysis is a fault classification report listing which faults were detected by hardware or software mitigation and which were not.
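The inject, observe, and classify loop can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Siemens tooling: `ToyDesign`, its two registers, the parity checker, and the `detected`/`undetected` labels are all hypothetical stand-ins for a real design and its mitigation.

```python
from dataclasses import dataclass


def parity(value: int) -> int:
    """Even-parity bit of an integer (1 if the count of set bits is odd)."""
    return bin(value).count("1") & 1


@dataclass
class ToyDesign:
    """Hypothetical state: one parity-protected register, one unprotected."""
    protected: int = 0xA5
    protected_parity: int = parity(0xA5)
    unprotected: int = 0x3C

    def alarm(self) -> bool:
        """Hardware mitigation: parity checker on the protected register."""
        return parity(self.protected) != self.protected_parity


def inject(design: ToyDesign, register: str, bit: int) -> None:
    """Inject an SEU by flipping one bit of the named register."""
    setattr(design, register, getattr(design, register) ^ (1 << bit))


def run_campaign(width: int = 8) -> dict:
    """Inject one fault per site on a fresh design copy and classify it."""
    report = {}
    for register in ("protected", "unprotected"):
        for bit in range(width):
            design = ToyDesign()          # fresh, fault-free state
            inject(design, register, bit)
            report[(register, bit)] = "detected" if design.alarm() else "undetected"
    return report


report = run_campaign()
detected = sum(1 for c in report.values() if c == "detected")
print(f"{detected} of {len(report)} injected faults detected")
```

Every single-bit flip in the parity-protected register raises the alarm, while flips in the unprotected register go unnoticed; a real campaign would also need to decide whether an undetected fault actually disturbs functional behavior.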

Propagation Analysis: The SEU sites left unprotected are evaluated structurally under expected workload stimulus to determine per-site criticality, that is, the probability that an upset at each site results in functional failure. The output of propagation analysis is a list of currently unprotected fault sites that were identified as impacting functional behavior.
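One simple way to picture per-site criticality is as the fraction of workload states in which a bit flip at that site changes an observable output. The Python sketch below estimates this by brute force on a toy one-bit function; it is a simplification of the structural analysis described above, and the `output` function, the site names, and the four-state workload are invented for illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, int]


def output(state: State) -> int:
    """Hypothetical 1-bit design output: data passes only when enable is set."""
    return (state["data"] & state["enable"]) | state["override"]


def site_criticality(
    fn: Callable[[State], int], workload: List[State], site: str
) -> float:
    """Fraction of workload states in which flipping `site` changes the output."""
    propagated = 0
    for state in workload:
        faulty = dict(state)
        faulty[site] ^= 1                 # single-bit upset at this site
        if fn(faulty) != fn(state):
            propagated += 1
    return propagated / len(workload)


# Illustrative workload: states the design is assumed to visit in operation.
workload = [
    {"data": 0, "enable": 1, "override": 0},
    {"data": 1, "enable": 1, "override": 0},
    {"data": 1, "enable": 0, "override": 0},
    {"data": 0, "enable": 0, "override": 1},
]

for site in ("data", "enable", "override"):
    print(site, site_criticality(output, workload, site))
```

Sites whose flips are often masked by surrounding logic score low and are natural candidates to leave unprotected, while high-scoring sites are candidates for hardening.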

Metrics Computation: Data from structural, injection, and propagation analysis feed the metrics computation engine and visualization cockpit. The cockpit provides visual insights into failure rate, the effectiveness of the mitigation, and any gaps that exist.
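The article does not define the exact metrics the cockpit reports, so the following Python sketch uses two common, illustrative choices: mitigation coverage as the share of injected faults that were detected, and a residual failure rate as a raw per-bit upset rate weighted by the criticality of each unprotected site. The function names, formulas, and all input numbers are assumptions made up for the example.

```python
def mitigation_coverage(detected: int, injected: int) -> float:
    """Share of injected faults caught by hardware or software mitigation."""
    if injected == 0:
        raise ValueError("no faults were injected")
    return detected / injected


def residual_failure_rate(raw_rate_per_bit: float, criticality: dict) -> float:
    """Raw per-bit upset rate weighted by each unprotected site's criticality."""
    return raw_rate_per_bit * sum(criticality.values())


# Placeholder inputs, invented purely for illustration.
coverage = mitigation_coverage(detected=90, injected=100)
rate = residual_failure_rate(
    raw_rate_per_bit=2.0,                       # arbitrary rate units per bit
    criticality={"site_a": 0.5, "site_b": 0.25, "site_c": 0.0},
)
print(f"coverage = {coverage:.0%}, residual failure rate = {rate} (arbitrary units)")
```

Keeping the two numbers separate mirrors the two questions the methodology asks: how well the existing mitigation works, and how much risk remains in what was left unprotected.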

Every semiconductor development program has unique characteristics. The methodology described above is flexible and highly configurable, allowing project teams to adjust as needed.

Conclusion

Mitigation of single event upsets continues to challenge even the most veteran project teams, and this challenge is exacerbated as design complexity rises and technology nodes shrink. New methodologies exist to provide quantitative results detailing the effectiveness of SEU mitigation.

For a more detailed view of the Siemens SEU methodology and the challenges it will help you overcome, please refer to the white paper, Selective radiation mitigation for integrated circuits, which can also be accessed at Verification Academy: Selective Radiation Mitigation.

Jacob Wiltgen is the Functional Safety Solutions Manager for Siemens EDA. Jacob is responsible for defining and aligning functional safety technologies across the portfolio of IC Verification Solutions. He holds a Bachelor of Science degree in Electrical and Computer Engineering from the University of Colorado Boulder. Prior to Mentor, Jacob held various design, verification, and leadership roles performing IC and SoC development at Xilinx, Micron, and Broadcom.
