About six months ago, ANSYS was approached by a couple of leading mobile platform vendors/suppliers with a challenging problem. These companies were hitting their 2.5GHz performance targets on their (N10 or N7) application processors, but getting about 10% lower yield than expected, which they attributed to performance failures. Speed grades aren’t an option in this market (when were you last offered the “low-speed version” of a phone?), so the yield problem had to be fixed, but they didn’t know where to start. Remember, these are very experienced SoC design teams who nevertheless were unable to find any obvious root cause for these failures in their STA runs.
REGISTER HERE for an ANSYS webinar on variability-aware timing closure, April 18th at 8am PDT
According to Ankur Gupta (director of field apps), ANSYS were asked to run blind tests to figure out potential root causes, which those customers would then correlate back to observed failures. This is a timing problem, so think first about the vanilla approach to timing – run STA across however many hundreds of corners, using separately extracted margins for static and dynamic voltage drops (DvD) in the power grid, designed for a static timing target. It’s unlikely that there would be anything wrong with the STA and corners part of the problem given the relationship and influence these design houses have with the foundries and tool vendors. So DvDs have to be a prime suspect.
Now I’m going to switch to a presentation João Geada (chief technologist at ANSYS) gave at the Tau 2018 conference, then come back to the results from the analysis above. João opened with a seemingly obvious but under-appreciated point for advanced designs and technologies – the power supply is neither static nor uniform (within reasonable bounds) across the design, particularly in FinFET designs with large capacitive coupling between power and signals (as a result he describes power integrity now as much more like a signal integrity problem). Between ground-bounce and IR-drop, effective voltage across gates can be squeezed by 100s of mV against a 0.7-0.8V operating voltage, and this can have a huge impact on timing.
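To get a feel for why a squeeze of this size matters, here is a minimal back-of-the-envelope sketch. It is not ANSYS's model: it assumes the textbook alpha-power law for gate delay (delay proportional to V / (V - Vth)^alpha), and the threshold voltage, alpha and nominal supply below are illustrative values I picked, not figures from the presentation.

```python
# Illustrative only: alpha-power-law delay model with assumed parameters.
V_NOM = 0.75   # nominal supply in volts (assumed, within the 0.7-0.8V range above)
V_TH = 0.30    # threshold voltage in volts (assumed)
ALPHA = 1.3    # velocity-saturation exponent (assumed)


def gate_delay(v_eff, v_th=V_TH, alpha=ALPHA, k=1.0):
    """Relative gate delay: k * V / (V - Vth) ** alpha."""
    if v_eff <= v_th:
        raise ValueError("effective voltage must exceed the threshold voltage")
    return k * v_eff / (v_eff - v_th) ** alpha


def slowdown(squeeze_mv):
    """Delay at (V_NOM - squeeze) relative to delay at V_NOM."""
    return gate_delay(V_NOM - squeeze_mv / 1000.0) / gate_delay(V_NOM)


if __name__ == "__main__":
    for squeeze_mv in (0, 50, 100, 150, 200):
        print(f"{squeeze_mv:3d} mV squeeze -> {slowdown(squeeze_mv):.2f}x nominal delay")
```

The point of the sketch is only the shape of the curve: the closer the effective voltage gets to threshold, the faster delay grows, so the same drop hurts more at low operating voltages.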
Where and when this happens is clearly both instance- and use-case-dependent; attempting to capture this effect in global margins is going to be both very expensive in area (João noted that 30%+ of metal is devoted to power in small geometry designs) and potentially risky in missing critical cases (since detailed DvD analysis is practically limited to a small set of vectors). The first step in addressing these problems is through RedHawk-SC elastic-compute, on which I have written multiple times. Through big data methods, analysis of very large vector sets can be farmed out across a wide range of compute-servers, from big to small, to look for potentially interesting corner cases. This analysis runs fast enough that power distribution design can run iteratively in the implementation/timing flow. By finding the corners to analyze in detail, localizing where they have impact, and refining power distribution surgically and iteratively, you get a higher confidence signoff with lower area (~9% reduction in one case).
But wait, there’s more. Remember the voltage squeeze and how close you are already operating to threshold voltages? Timing this confidently for paths close to critical requires Spice-level accuracy; real delays at this level can no longer be captured simply through delay addition and lookups across corners (varying VSS-VDD ranges as you move along a path make this even uglier). ANSYS FX path analysis can run first in a fast mode, as fast as STA, to identify suspect paths requiring closer analysis and then can run at accuracy within a few percent of Spice (but much faster) to do detailed path delay analysis based on both the circuitry and the PG dynamic behavior. ANSYS find this is quite likely to reorder STA path criticality, suggesting further surgical refinements to the power distribution plan, upsizing where this more accurate analysis suggests paths are more critical than you thought, downsizing where you have more slack.
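The reordering effect can be illustrated with a toy example. This is a sketch under stated assumptions, not how ANSYS FX works: the two paths, their stage delays and per-instance drops are invented, and delay scaling again uses the alpha-power law with assumed parameters in place of Spice-accurate simulation. It compares timing with one uniform voltage margin against timing with the local drop each instance actually sees.

```python
from typing import NamedTuple, Optional

V_NOM, V_TH, ALPHA = 0.75, 0.30, 1.3  # assumed, illustrative values


class Stage(NamedTuple):
    nominal_ps: float  # stage delay at nominal supply (hypothetical)
    drop_mv: float     # local supply squeeze seen by this instance (hypothetical)


def scale(v_eff: float) -> float:
    """Delay multiplier at v_eff relative to V_NOM (alpha-power law)."""
    ref = V_NOM / (V_NOM - V_TH) ** ALPHA
    return (v_eff / (v_eff - V_TH) ** ALPHA) / ref


def path_delay(stages, uniform_drop_mv: Optional[float] = None) -> float:
    """Sum stage delays using one global drop, or each stage's own drop if None."""
    total = 0.0
    for s in stages:
        drop = s.drop_mv if uniform_drop_mv is None else uniform_drop_mv
        total += s.nominal_ps * scale(V_NOM - drop / 1000.0)
    return total


def rank(paths, uniform_drop_mv: Optional[float] = None):
    """Path names ordered from most to least critical (longest delay first)."""
    return sorted(paths, key=lambda n: path_delay(paths[n], uniform_drop_mv),
                  reverse=True)


# Path A is nominally longer but sits in a quiet region of the grid;
# path B is nominally shorter but sees large local drops.
PATHS = {
    "A": [Stage(100, 20), Stage(110, 10), Stage(105, 15)],
    "B": [Stage(100, 120), Stage(100, 90), Stage(100, 110)],
}

if __name__ == "__main__":
    print("uniform 50 mV margin:", rank(PATHS, uniform_drop_mv=50))
    print("per-instance drops:  ", rank(PATHS))
```

With a single margin the nominally longer path always looks worse; with per-instance drops the path in the noisy region can overtake it, which is the kind of reordering described above.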
Back to the big mobile vendors. ANSYS followed this strategy in those blind tests against 100-200K of the (supposedly) positive-slack paths, mining DvDs and ground bounces along paths and timing with these compressed voltages. You won’t be surprised to hear that their reordering of path criticality corresponded nicely with what customers had observed in lab testing, or that those customers are now very interested in this methodology. Ankur closed by noting ANSYS have subsequently been approached by all the rest of the contenders in the mobile benchmark race, which must say something about the credibility of their solution.
So if you thought all this elastic-compute and Spice-level path timing stuff from ANSYS was just marketing hype, think again. If all the significant mobile platform vendors in the world are forming a line at their door, this integrated DvD/STA approach looks like it might be the future of high-performance timing and power signoff.