Synopsys PrimeTime PX, popularly known as PT-PX, is widely recognized as the gold standard for power signoff. Calculation is based on a final gate-level netlist reflecting final gate selections and either approximate interconnect parasitics or final parasitics based on the post-layout netlist. The only way to get more accurate power values is to measure the real thing on silicon after fabrication.
By nature, this kind of analysis starts very late in the design flow, because you need a near-implementation or post-implementation netlist. It also takes a long time to perform, because you must run gate-level simulations to generate activity data, which can take days to weeks. When signoff is a final confirmation that power is indeed in spec, this is OK, but cycle times like this are definitely not OK if you find you missed the power budget. Short of planning for another spin, options until now were limited. You could go back to RTL to fix the microarchitecture, where you can use SpyGlass Power, a great tool for approximate estimation and optimization earlier in the design flow. But that implies an implementation restart, which would delay tapeout significantly.
What you really need here is an intermediate solution between early RTL estimation and final PT-PX signoff accuracy, something that is still very accurate and based on gate-level netlists, but which you can get to much more quickly. This would enable earlier checks at near-signoff accuracy, allowing time for less disruptive corrective actions where needed. This is what Synopsys PowerReplay (a separate product) can offer, together with PT-PX. Synopsys launched this solution in May of this year; a webinar presented by Vaishnav Gorur (PMM) and Chun Chan (R&D director) provides details.
PowerReplay works together with PT-PX, which still does the power estimation based on the same pre- or post-layout netlist, together with SDF if available. What PowerReplay provides in this flow is the ability to short-circuit all the gate-level simulation setup and a good deal of the simulation run-time, while still generating the activity data you need. It does this by starting from an available RTL-based FSDB, from which it auto-maps the stimulus onto the gate-level netlist. The mapping is improved if the SVF file from synthesis is supplied as an additional input. This results in more accurate power numbers downstream.
You can also do activity analysis in PowerReplay to narrow down the time windows you want to use in power estimation. While highest activity doesn’t necessarily imply highest power, high activity along with some knowledge of the design should help you localize the best windows for worst-case power. In addition, you can restrict analysis to certain blocks. And, as you might expect, you can run these analyses in parallel. PowerReplay runs simulation on the gate-level netlist using the stimulus from the RTL FSDB, restricting simulation to your selected time windows and design scope. Put this all together and you’ve gone from a long, grinding gate-level simulation and power estimation starting from time 0 to a much faster turnaround, requiring minimal setup and delivering almost the same accuracy.
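To make the window-selection idea concrete, here is a minimal Python sketch of how you might rank time windows by activity. This is not how PowerReplay does it internally, and none of the names below come from the tool: `top_activity_windows`, the per-interval toggle counts, and the greedy non-overlap rule are all assumptions chosen for illustration. As noted above, the busiest window is only a candidate for worst-case power, not a guarantee.

```python
from typing import List, Tuple


def top_activity_windows(
    toggles: List[int], width: int, count: int
) -> List[Tuple[int, int, int]]:
    """Return up to `count` non-overlapping windows with the most toggles.

    Each result is (start_index, end_index_exclusive, total_toggles).
    Hypothetical helper for illustration; not part of any Synopsys tool.
    """
    if count <= 0 or width <= 0 or width > len(toggles):
        return []

    # Sliding-window sums: totals[i] covers toggles[i : i + width].
    running = sum(toggles[:width])
    totals = [running]
    for i in range(width, len(toggles)):
        running += toggles[i] - toggles[i - width]
        totals.append(running)

    # Greedily take the busiest window, then skip any window overlapping it.
    ranked = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    chosen: List[Tuple[int, int, int]] = []
    for start in ranked:
        end = start + width
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, totals[start]))
            if len(chosen) == count:
                break
    return chosen


# Made-up toggle counts per fixed time interval, for illustration only.
sample = [5, 7, 40, 55, 60, 12, 8, 9, 70, 65, 10, 4]
windows = top_activity_windows(sample, width=3, count=2)
print(windows)
```

In a real flow the counts would come from the activity data rather than a hand-written list, and you would still apply design knowledge before committing to a window.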
Chun talked about a couple of customer case studies. In one case, the customer compared the PowerReplay flow with their existing signoff flow. They found that within the windows they selected for analysis, the PowerReplay flow results were within 2% of those for the reference flow. Also, where the reference flow took 7 days to complete, the PowerReplay-based analysis completed in 8 hours. In a second customer study, there was again a big reduction in run time thanks to the parallel analysis flow, and accuracy was within 2.5% of the reference flow. Vaishnav said that across multiple customers they have seen accuracy within 5% of PT-PX signoff numbers.
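For a rough sense of scale on the first case study, the quick calculation below turns the quoted run times into a speedup factor. It assumes that "7 days" means seven full 24-hour days of wall-clock time, which the webinar summary does not state explicitly, and the 100 mW signoff value is a made-up number used only to show what a 2% band looks like.

```python
# Run times quoted for the first case study.
reference_hours = 7 * 24  # assumption: 7 days of continuous wall-clock time
replay_hours = 8

speedup = reference_hours / replay_hours
print(f"Speedup: {speedup:.0f}x")

# Hypothetical signoff power, purely to illustrate a 2% accuracy band.
signoff_mw = 100.0
tolerance = 0.02
low_mw = signoff_mw * (1 - tolerance)
high_mw = signoff_mw * (1 + tolerance)
print(f"2% band around {signoff_mw} mW: {low_mw} to {high_mw} mW")
```

Under that assumption the turnaround improves by roughly 21x.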
A couple of interesting questions came up in the Q&A. One was whether PowerReplay sims take gate delays into account. The answer is yes, as long as you supply SDF. Taking delays into account is important for accurate peak power analysis, which would otherwise be skewed. Another good question was how much earlier in the flow customers had been able to run these analyses. Vaishnav said that this flow can be run on blocks, so you don’t have to wait for the full chip, which means that you can start getting accurate block-level estimates typically weeks to months ahead of full-chip analysis.
You can replay the webinar HERE.