I had a debate with Steve Carlson of Cadence earlier in the year at the EDPS conference on whether there were any truly effective solutions for power estimation in emulation. I thought there weren’t and he said I was wrong. After attending the Cadence front-end summit last week, I have to admit he has a point.
First, who cares? Why is power estimation in emulation important? Simple – power varies widely based on activity, and many would agree that software load is the most important factor in determining power for a given architecture. The problem is that all standard (non-emulation-based) approaches to determining power are limited to effectively tiny samples of activity, delivering little islands of well-understood power in an ocean of otherwise unknown power behavior. Of course designers and architects work hard to find “representative” cases, but this is more margining than science, with all the evils that margining brings. And even then, finding peak power problems has been effectively impossible (finding needles in a haystack) until you get to silicon running real applications. Peak power is very important because it drives temperature spikes, and those can lead to system failure or even silicon failure. In fact, analyzing temperature has become so important that P-T-P (performance-temperature-power) is becoming more important than PPA (performance-power-area) in many contexts.
The obvious way to get more realistic windows of activity is through emulation, but I thought I saw a problem. Power estimation needs activity data on every node, but an emulator becomes very slow if it has to dump all that data; the promised speedup would disappear in data dumping and you still wouldn’t be able to run realistic loads. I was wrong. Palladium™ is able to dump just a subset of the nodes (registers) and uses probabilistic modeling through combinational logic to get a reasonable estimate of activity in between. Also, Palladium connects natively through the PHY (physical access interface) to Joules, the Cadence power estimation solution, so all that work in going through an FSDB (or similar) step is avoided. This speeds turnaround time from days to hours on big jobs.
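To make the "registers only, estimate the rest" idea concrete, here is a minimal sketch of one textbook way to do it: measure signal probability and toggle density at register outputs, then propagate both through gates assuming the inputs are statistically independent. This is not a description of what Palladium or Joules actually do internally; the `Activity` type, the gate helpers, and the sample data are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    p_one: float    # probability the signal is 1 in a given cycle
    density: float  # expected toggles per cycle


def from_samples(samples):
    """Measure activity from a dumped register trace (one 0/1 value per cycle)."""
    n = len(samples)
    toggles = sum(a != b for a, b in zip(samples, samples[1:]))
    return Activity(sum(samples) / n, toggles / (n - 1) if n > 1 else 0.0)


# Gate rules below assume independent inputs (an assumption that real
# designs violate wherever logic reconverges, so results are estimates).
# An output toggles when an input toggles while the other input is in
# its non-controlling state.

def not_gate(a):
    return Activity(1 - a.p_one, a.density)


def and_gate(a, b):
    return Activity(a.p_one * b.p_one,
                    b.p_one * a.density + a.p_one * b.density)


def or_gate(a, b):
    return Activity(1 - (1 - a.p_one) * (1 - b.p_one),
                    (1 - b.p_one) * a.density + (1 - a.p_one) * b.density)


# Hypothetical register traces: only these would be dumped by the emulator.
reg_a = from_samples([0, 1, 0, 1, 0, 1, 0, 1])
reg_b = from_samples([0, 0, 1, 1, 0, 0, 1, 1])

# Activity on an internal combinational node is estimated, not dumped.
node = and_gate(reg_a, reg_b)
```

The point of the sketch is the trade-off: the only per-cycle data needed is at the registers, and everything between them is inferred, at the cost of some accuracy wherever the independence assumption breaks down.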
The proof is in real tests. Cadence has demonstrated running the AnTuTu test suite, a widely used benchmark to grade Android-based phones on many features. Since this is one of the more comprehensive system tests available today for a smartphone, their ability to run it on an emulation model of the device and produce power and temperature profiles is a testament to the practical value of the Palladium + Joules solution.
Of course emulation isn’t all you need to design for and debug power. It provides good (approximate) guidance on power across realistic software loads and can identify peak power windows which need special attention in design. You can then take that into Incisive™ simulation with Joules™ for detailed analysis with increased accuracy in narrow windows (those peak power cases, for example) and then into detailed power and thermal analysis at the implementation level using Voltus™, Sigrity™ and PowerDC™. The whole flow together provides successive refinement from realistic software loads all the way down to final implementation, spanning the full range of factors that influence power and temperature.
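As a rough illustration of what "identifying peak power windows" can mean in practice, the sketch below scans a power-versus-time trace for the window with the highest average power, which is the kind of interval you would then hand off to detailed simulation. The function name, the trace values, and the window length are all made up for the example; no tool's actual output format is assumed.

```python
def peak_power_window(trace_mw, window):
    """Return (start_index, average_power) of the highest-power window.

    trace_mw: power estimates per time step, in milliwatts
    window:   window length in time steps
    """
    if window <= 0 or window > len(trace_mw):
        raise ValueError("window must be between 1 and len(trace_mw)")

    # Slide the window one step at a time, updating the running sum
    # instead of re-summing, so the scan is linear in the trace length.
    running = sum(trace_mw[:window])
    best_sum, best_start = running, 0
    for start in range(1, len(trace_mw) - window + 1):
        running += trace_mw[start + window - 1] - trace_mw[start - 1]
        if running > best_sum:
            best_sum, best_start = running, start
    return best_start, best_sum / window


# Hypothetical coarse power trace from an emulation run.
trace = [10, 12, 11, 40, 45, 42, 13, 11]
start, avg = peak_power_window(trace, window=3)
```

Here the burst in the middle of the trace is picked out as the window starting at index 3, so only those few steps, rather than the whole run, would need the more accurate and much slower analysis.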
So my apologies to Steve – this really is about the best you can do in design, short of trial-and-error on multiple silicon respins. One or two respins may be unavoidable these days, but you need this solution to make sure you keep it to no more than one or two.
You can read more about Palladium and Joules power estimation HERE.