SRAM design analysis and optimization

by Daniel Payne on 10-24-2023 at 6:00 am

Every year EDA vendor MunEDA hosts a user group meeting where engineers present how they used automation tools to improve their IC designs. One presentation, from Peter Huber of Infineon, caught my attention because it was all about SRAM design optimization. Peter has authored papers at IEEE conferences and has been issued patents related to SRAM design. The schematic for a six-transistor SRAM cell is shown below:

SRAM bit cell. Source: Wikipedia

During an SRAM read cycle the Word Line (WL) goes active, then the stored bit values are transferred to the Bit Lines (BL), and finally a sense amplifier goes active to read out the differential Bit Lines. The delay between WL and BL is part of the Read Programmable Self-Timing (RPST), and is tuned by the circuit designer.

SRAM memory designers have several challenges to meet while optimizing the circuit design and layout:

  • Minimum operating voltage, Vmin
  • Sensitivity to small transistor geometries
  • Process variation effects
  • Power consumption
  • Layout density
  • Soft error rate

As the supply voltage Vdd is lowered, the SRAM bit cell eventually fails to operate. That failure can occur during a Read cycle, a Write cycle, or simply from noise induced by nearby circuits switching. Memory failures are influenced by how the core and periphery layouts are done, process and local variations, temperature, memory array size, and the yield criteria.

Yield prediction by simulation is challenging because multiple blocks are involved: the bit cell and periphery such as sense amplifiers, multiplexers, self-timing circuitry, and so on. The SRAM designer hence faces multiple issues with parametric yield simulation:

  • The interactions between the blocks are relevant: A bit cell whose read current is exceptionally weak due to local variation of Vth may or may not be read correctly depending on the offset of the connected sense amplifier and other periphery, which in turn depends on the local Vth variation in those blocks.
  • The number of instances of each block must be considered, for example an array of 32 sense amplifiers, each of which is connected to 1024 bit cells.
  • Transient simulation of even a single bit cell read cycle is expensive, because it has to include layout parasitic effects of a large part of the circuit with high accuracy.
  • The statistical analysis must be repeated many times to analyze the effect of array size, macro settings for the self-timing, assist and boost circuitry.
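The effect of instance counts on the required sigma level can be sketched with a few lines of Python. This is a minimal illustration, not part of the Infineon flow: it assumes independent, Gaussian-distributed instances and a hypothetical 99% array yield target, and the function name `required_sigma` is made up for this example.

```python
import math
from statistics import NormalDist

def required_sigma(n_instances: int, target_yield: float) -> float:
    """One-sided sigma level every instance must pass so that all
    n_instances pass together with probability target_yield.
    Assumes independent, Gaussian-distributed instances."""
    # Allowed per-instance failure probability: 1 - target_yield**(1/n),
    # written with expm1 to stay accurate for very small probabilities.
    p_fail = -math.expm1(math.log(target_yield) / n_instances)
    return -NormalDist().inv_cdf(p_fail)

n_sense_amps = 32                  # from the example in the text
cells_per_sense_amp = 1024         # from the example in the text
n_cells = n_sense_amps * cells_per_sense_amp
target_yield = 0.99                # hypothetical yield target

sigma_cell = required_sigma(n_cells, target_yield)
sigma_sa = required_sigma(n_sense_amps, target_yield)
print(f"{n_cells} bit cells need about {sigma_cell:.2f} sigma each")
print(f"{n_sense_amps} sense amplifiers need about {sigma_sa:.2f} sigma each")
```

Because there are far more bit cells than sense amplifiers, each bit cell has to be verified to a noticeably higher sigma level than each sense amplifier for the same array yield.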

Brute-force Monte Carlo SPICE simulation of every cell’s read cycle in an extracted post-layout full-chip netlist gives a statistically correct yield estimate, but at a prohibitively large simulation effort. In the past, ML surrogate models could be used to guide the sampling, but the simulation effort was still too large for extensive analysis of the effects of SRAM macro settings.
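To get a feel for why brute-force Monte Carlo is prohibitive at high sigma, here is a rough back-of-the-envelope sketch. It assumes a Gaussian tail and plain (unguided) sampling, and uses the rule of thumb that about 100 observed failures give roughly 10% relative error on the estimated failure rate. The function name and the choice of 100 failures are assumptions for illustration.

```python
from statistics import NormalDist

def mc_samples_needed(sigma: float, expected_failures: int = 100) -> float:
    """Approximate number of plain Monte Carlo samples needed to observe
    `expected_failures` failures of an event in the one-sided Gaussian
    tail beyond `sigma`."""
    p_fail = NormalDist().cdf(-sigma)   # one-sided tail probability
    return expected_failures / p_fail

for sigma in (3.0, 4.0, 5.0, 6.0):
    print(f"{sigma:.0f} sigma: about {mc_samples_needed(sigma):.1e} samples")
```

When each sample is a post-layout transient simulation, sample counts of this size at 5 to 6 sigma are clearly out of reach, which motivates worst-case-based methods.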

Infineon has now introduced a new two-step approach to simulating their SRAM design, based on Worst Case Distance (WCD).

The WCD analysis consists of creating a simulation set with the WiCkeD tool for one Vdd and one probability plus sigma combination. During this process the worst-case value for Read current (Iread) is determined, followed by the worst-case sense-amp detuning. Finally, one transient simulation is run for each Programmable Self Timing (PST) setting, with back-annotated worst-case cells from the WCD analysis.

Separating the analysis into two steps has the advantage that the detailed statistical analysis of the sub-blocks is done independently in small netlists with short simulation time, whereas only a handful of slow transient runs of the full circuit are necessary to determine whether the combined worst-case blocks pass or fail the read cycle depending on different high-level macro settings (RPST).
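The structure of the two-step flow can be illustrated with a toy model. This sketch is not the WiCkeD API and not Infineon's setup: the real second step is a transient SPICE run, which is replaced here by the first-order bit line estimate ΔV = Iread · t / C. All device numbers, PST delays, and function names are hypothetical placeholders.

```python
# Toy two-step read check. Every number below is a hypothetical placeholder.
IREAD_MEAN_UA = 20.0        # nominal bit cell read current, in uA
IREAD_SIGMA_UA = 2.0        # local-variation sigma of read current, in uA
SA_OFFSET_SIGMA_MV = 8.0    # sigma of sense amplifier input offset, in mV
C_BITLINE_FF = 50.0         # bit line capacitance, in fF
PST_DELAYS_PS = [100.0, 160.0, 220.0, 280.0]  # WL-to-sense delay per PST setting

# Step 1: per-block worst cases (stand-in for the statistical WCD analysis).
def worst_case_iread_ua(k_sigma: float) -> float:
    return IREAD_MEAN_UA - k_sigma * IREAD_SIGMA_UA

def worst_case_sa_offset_mv(k_sigma: float) -> float:
    return k_sigma * SA_OFFSET_SIGMA_MV

# Step 2: one pass/fail check per PST setting (stand-in for one transient run).
def read_passes(iread_ua: float, offset_mv: float, delay_ps: float) -> bool:
    # uA * ps / fF gives mV, so the units work out without scaling factors.
    delta_v_mv = iread_ua * delay_ps / C_BITLINE_FF
    return delta_v_mv > offset_mv

def passing_pst_delays(k_cell: float, k_sa: float) -> list:
    iread = worst_case_iread_ua(k_cell)
    offset = worst_case_sa_offset_mv(k_sa)
    return [d for d in PST_DELAYS_PS if read_passes(iread, offset, d)]

# Example: 6-sigma worst-case bit cell with a 4-sigma worst-case sense amplifier.
print(passing_pst_delays(6.0, 4.0))
```

The expensive statistics live entirely in step 1, on small per-block problems, while step 2 needs only one evaluation per PST setting and worst-case combination.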

In the past, a single combination of worst-case blocks was used for full-chip transient simulation, for example a 6-sigma worst-case bit cell combined with a 4-sigma worst-case sense amplifier. That was fast and sufficient for verification but overly pessimistic. In the new approach, multiple combinations are tested, and each point on the equi-probability curve that passes guarantees a minimum total yield, so a single passing point is enough to guarantee the yield and accept the supply voltage as working. In this way the pessimism is eliminated, and simulated failure rates match silicon measurements very well.

SRAM equi-probability curves
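One simple way to see why a single passing combination can bound the yield is a union-bound argument. The presentation's exact curve definition is not given here, so the following is an assumption for illustration only: if a read path passes with a k_cell-sigma bit cell and a k_sa-sigma sense amplifier, and pass/fail is monotonic in both, then failures can only occur when at least one block is worse than its tested level, so the failure probability is at most tail(k_cell) + tail(k_sa). The sketch lists sigma pairs that share the same bound; the budget value and function names are hypothetical, and instance counts are ignored.

```python
from statistics import NormalDist

_STD = NormalDist()

def tail(k_sigma: float) -> float:
    """One-sided Gaussian tail probability beyond k_sigma."""
    return _STD.cdf(-k_sigma)

def equal_bound_points(fail_budget: float, cell_sigmas: list) -> list:
    """For each bit cell sigma level, find the sense amplifier sigma level
    such that tail(k_cell) + tail(k_sa) equals fail_budget."""
    points = []
    for k_cell in cell_sigmas:
        remaining = fail_budget - tail(k_cell)
        if remaining <= 0.0:
            continue  # the cell tail alone already exceeds the budget
        k_sa = -_STD.inv_cdf(remaining)
        points.append((k_cell, k_sa))
    return points

fail_budget = 1e-6  # hypothetical per-read-path failure budget
curve = equal_bound_points(fail_budget, [4.7, 4.8, 5.0, 5.5, 6.0])
for k_cell, k_sa in curve:
    print(f"cell {k_cell:.1f} sigma with sense amp {k_sa:.2f} sigma")
```

Under these assumptions every listed pair carries the same guaranteed failure bound, so a pass at any one of them is sufficient, and there is no need to insist on the single most pessimistic pairing.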

Simulation results produced the following plot where the Vmin value is on the Y-axis, and the read PST setting is on the X-axis.

Vmin simulation results

Silicon measurements correlated very well with simulations: the Vmin values for Read and Write cycles were less than two percent off, a difference that is expected from effects such as IR drop.

Summary

This group at Infineon was able to simulate and optimize Vmin values for SRAM designs by using a two-step methodology with the MunEDA WiCkeD tools: WCD analysis plus transient simulation. Python scripting, a feature called GangWay, was used to automate these analysis methods. With scripting they are able to set up and transfer the flow to new memory architectures, reproduce simulation results quickly, and hand the verification task over to other engineers.
