Analyzing the operation of a modern SoC, and especially its power distribution network (PDN), is getting more and more complex. Today’s SoCs no longer operate on a continuous basis. Instead, functional blocks on the IC are powered up only to execute the operation that is required, and then they go into a standby mode, perhaps unclocked and perhaps powered down completely. This on-demand powering goes a long way toward reducing standby power.
The development of CPF (Common Power Format) and UPF (Unified Power Format) over the last few years has had a major impact. Low-power techniques including power gating, clock gating, and voltage and frequency scaling are captured in CPF/UPF and verified for proper implementation. But the electrical verification tools that simulate these complex behaviors over time are still evolving.
The problem is that outdated verification techniques are inadequate. Static voltage drop simulations or simple dynamic voltage drop simulations on the PDN will not adequately represent the complex switching or power transitions as blocks come on- and off-line. The state transitions need to be checked rigorously to guarantee that power supply noise does not affect functionality.
The variations in power as cores, peripherals, I/Os and IP go into different states can be huge, with large current inrushes. Identifying these critical transitions and using them for electrical simulation of the PDN is essential. The transitions can be identified at the RTL level, where millions of cycles can be processed and the relatively few cycles of interest can be selected for full electrical simulation.
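As a rough illustration of that selection step, here is a minimal Python sketch. It assumes a per-cycle power estimate has already been exported from an RTL power tool as a plain list of numbers. The function name, the `window` parameter and the ranking by absolute power change are illustrative assumptions, not the method of any particular tool.

```python
from typing import List, Tuple


def select_critical_cycles(
    power_mw: List[float], window: int, top_n: int
) -> List[Tuple[int, float]]:
    """Pick the cycles with the largest change in power over `window` cycles.

    Hypothetical helper: `power_mw` is an assumed per-cycle power trace.
    Returns up to `top_n` (start_cycle, power_change) pairs, largest
    absolute change first, with selected start cycles at least `window`
    cycles apart so the same transition is not reported twice.
    """
    if window < 1 or top_n < 1:
        raise ValueError("window and top_n must be positive")

    # Power change between cycle i and cycle i + window, for every i.
    steps = [
        (i, power_mw[i + window] - power_mw[i])
        for i in range(len(power_mw) - window)
    ]
    # Rank by magnitude so both power-up and power-down events are caught.
    steps.sort(key=lambda s: abs(s[1]), reverse=True)

    chosen: List[Tuple[int, float]] = []
    for idx, delta in steps:
        if delta == 0:
            break  # nothing but flat regions left
        if all(abs(idx - c) >= window for c, _ in chosen):
            chosen.append((idx, delta))
        if len(chosen) == top_n:
            break
    return chosen


# Toy trace: a block powers up at cycle 2->3 and down at cycle 5->6.
trace = [10.0, 10.0, 10.0, 200.0, 200.0, 200.0, 20.0, 20.0, 150.0, 150.0]
critical = select_critical_cycles(trace, window=1, top_n=2)
```

The cycles returned this way would be the candidates to hand to a full electrical simulation of the PDN. A real flow would likely rank on current rather than power and would take the power state of each block into account.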
Another area requiring detailed simulation is voltage islands that either power down or operate at a lower voltage (and typically a lower frequency). This can have a major impact on leakage power as well as dynamic power. But reliability verification is complex. Failures can happen if signal transitions occur before the voltage levels of these islands reach the full-rail voltage. There is a major tradeoff: if the islands are powered up too fast, the inrush current can cause errors in neighboring blocks or in the voltage regulator. If they are powered up too slowly, signals can start to arrive before the block is ready. To further complicate things, the inrush currents interact with the package and board parasitics too.
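The tradeoff can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a deliberately simple model: the island is a single lumped capacitance charged by a linear voltage ramp, so the inrush current is roughly I = C * Vdd / t_ramp. Package and board parasitics are ignored, which is exactly why a real design needs full electrical simulation. All names and numbers are hypothetical.

```python
from typing import Optional, Tuple


def ramp_time_window(
    c_farads: float, vdd: float, i_max: float, t_first_signal: float
) -> Optional[Tuple[float, float]]:
    """Return the (shortest, longest) acceptable power-up ramp time in seconds.

    Assumed model: lumped island capacitance `c_farads` charged linearly to
    `vdd`, so inrush current is c_farads * vdd / t_ramp.
    - Too fast: inrush exceeds `i_max`, the most the regulator and the
      neighboring blocks are assumed to tolerate.
    - Too slow: the rail is not at full voltage when the first signal
      arrives at `t_first_signal`.
    Returns None when no ramp time satisfies both limits.
    """
    t_min = c_farads * vdd / i_max  # fastest ramp that keeps inrush <= i_max
    if t_min > t_first_signal:
        return None
    return (t_min, t_first_signal)


# Hypothetical island: 10 nF, 1.0 V rail, 0.5 A inrush limit,
# first signal arriving 1 us after power-up begins.
window = ramp_time_window(10e-9, 1.0, 0.5, 1e-6)
```

When the function returns `None`, the two constraints conflict under this model. The island would then need a staged power-up, a relaxed inrush limit, or a later first signal.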
As SoCs have become more and more complete, previously separate analog chips such as those for GPS or RF are now on the same silicon as high-speed digital cores. Sensitive analog circuits need to be analyzed in the context of noisy digital neighbors to check the impact of substrate coupling. Isolation techniques don’t always work well at all frequencies and, of course, this needs to be analyzed in detail.
3D techniques, such as wide-IO memory on top of logic, add a further level of complexity. These approaches have the potential to deliver major increases in performance and reductions in power, but they also create power-thermal interactions that need to be analyzed. It is not just that the heat on one die can affect its own performance: heat from a neighboring die can do so too.
In this climate, IC designers can no longer just assume things are correct by construction. Multi-physics simulations of thermal, mechanical, and electrical behavior are a must for reliability verification.
See Arvind’s blog on the subject here.