Apple’s recent bout with ‘Batterygate’ highlighted just how important dynamic power management can be. Our last Sonics update looked at using their NoC to manage power islands; this time, we look at their research progress on architectural measures for power management.
Before we start, two points. Blogger Robert Maire and our astute readers commented extensively that the final explanation for Batterygate would turn out to be differences in leakage characteristics between the Samsung and TSMC processes used on the A9 variants. I disagree somewhat. I’d really like to see the test scenarios re-run with the just-released iOS 9.3 update applied, to see whether my hypothesis was correct and someone mucked up power management in software.
Leakage and its tradeoff versus speed is a competitive weapon between foundries, but it will not solve the power management challenge all by itself. That’s a point our own Paul McLellan made a few months ago when he introduced this idea, which was in Sonics’ R&D labs at the time. If you read Paul’s piece, note that I’ll take a different angle here, though I may cross some of the ground he covered.
Consensus holds that once a low-leakage process is selected, architectural techniques deliver a much bigger bang for the investment in system-level power reduction. My position in the Batterygate threads was that the two implementations of the A9 are not the same chip. They started from the same RTL and are functionally equivalent, but after differences in cell libraries and mapping, plus what we’re about to discuss, they may behave very differently in terms of power management.
“Dynamic power management” (dynamically managing power, not managing dynamic power) is not a single technique, but rather a wide range of choices. Each has implications for real estate and power-saving benefits. For example, in a fine-grained clock gating approach, not building a recirculating multiplexer can save about 10% of the area used for the gating. Sequential clock gating looks at the output of a block and decides if it has any impact on the state of the blocks it feeds. At the other end of the spectrum is adaptive voltage and frequency scaling. Where dynamic voltage and frequency scaling (DVFS) is dead reckoning, as Drew Wingard described it, adaptive VFS looks at variables like junction temperature to decide if a bit more or a bit less scaling is needed.
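To make the dead-reckoning versus adaptive distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the operating-point table, the temperature thresholds, and the policy of nudging one step down when hot and one step up when cool are all hypothetical, and they are not taken from Sonics or from any real silicon.

```python
# Hypothetical operating points: (frequency in MHz, voltage in mV).
OPERATING_POINTS = [(400, 800), (800, 900), (1200, 1000), (1600, 1100)]


def dvfs_index(required_mhz):
    """Open-loop DVFS ("dead reckoning"): pick the lowest table entry
    that meets the requested frequency, with no feedback from the chip."""
    for i, (freq, _) in enumerate(OPERATING_POINTS):
        if freq >= required_mhz:
            return i
    return len(OPERATING_POINTS) - 1


def adaptive_index(required_mhz, junction_temp_c, hot_c=85.0, cool_c=45.0):
    """Adaptive VFS sketch: start from the table choice, then nudge it
    using a measured variable. Thresholds and direction are assumptions."""
    i = dvfs_index(required_mhz)
    if junction_temp_c >= hot_c:
        i -= 1  # running hot: scale a bit more
    elif junction_temp_c <= cool_c:
        i += 1  # running cool: scale a bit less
    # Clamp so the result always names a valid table entry.
    return max(0, min(i, len(OPERATING_POINTS) - 1))


if __name__ == "__main__":
    for temp in (30.0, 60.0, 90.0):
        freq, mv = OPERATING_POINTS[adaptive_index(700, temp)]
        print(f"{temp:.0f} C -> {freq} MHz at {mv} mV")
```

The open-loop function always returns the same answer for the same request; the adaptive one can land on a different operating point for the same request depending on what the sensor reports.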
However, the biggest single difference may be what happens when you want to turn something that is off back on. Wingard’s punchline here should send chills up the spines of designers: “If you relied on the wrong technique, by the time you know you want something back on, there’s not enough time left to do it.” Adding transition latency to the table illustrates narrow and fast versus wide and slow.
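Wingard’s warning can be read as a selection rule: only use a technique if its wake-up latency fits inside the time you expect to be idle. The sketch below is one way to express that, not Sonics’ method. The technique names, latencies, relative savings, and the safety margin are hypothetical placeholders, not the figures from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    name: str
    wake_latency_us: float  # time to get back to "on" (hypothetical)
    savings: float          # relative power saving, 0..1 (hypothetical)


# Narrow and fast at the top, wide and slow at the bottom.
TECHNIQUES = [
    Technique("clock gating", 0.01, 0.2),
    Technique("voltage/frequency scaling", 100.0, 0.5),
    Technique("power shutoff", 10000.0, 0.9),
]


def pick_technique(idle_us, margin=2.0):
    """Return the biggest-saving technique whose wake latency, padded by
    a safety margin, still fits in the expected idle window.
    Returns None when nothing can be turned back on in time."""
    candidates = [t for t in TECHNIQUES if t.wake_latency_us * margin <= idle_us]
    return max(candidates, key=lambda t: t.savings, default=None)


if __name__ == "__main__":
    for idle in (0.001, 1.0, 500.0, 50000.0):
        choice = pick_technique(idle)
        print(idle, "us ->", choice.name if choice else "stay on")
```

Relying on the wrong technique, in these terms, means picking a row whose latency is longer than the window actually turns out to be.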
Notice this table spans six orders of magnitude in response time. There are two other aspects to consider. For most of these techniques, the processor (or at least a processor; there may be a smaller core doing power management) is driving the transition, which bumps the power up briefly. Managing power consumes power and latencies add up fast, so the idea is to get power management tasks done as quickly as possible and get back to the ‘on’ state. The other overlooked but inescapable factor is analog settling time. If too many things are turned back on all at once, the power supply can actually droop, causing a bunch of problems.
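One common way to avoid the droop problem is to stagger wake-ups so that the combined inrush current in any one step stays under what the supply can tolerate. The following is a rough sketch of that idea, assuming a simple greedy grouping; the domain names, current figures, and budget are hypothetical, and a real design would work from measured supply characteristics.

```python
def stagger_wakeups(domains, budget_ma):
    """Group power domains into sequential wake slots.

    domains:   list of (name, inrush_ma) in the desired wake order
    budget_ma: maximum combined inrush allowed per slot (assumed limit)

    Greedy: keep adding domains to the current slot until the next one
    would exceed the budget, then start a new slot.
    """
    slots = []
    current, total = [], 0
    for name, inrush_ma in domains:
        if inrush_ma > budget_ma:
            raise ValueError(f"{name} alone exceeds the inrush budget")
        if total + inrush_ma > budget_ma:
            slots.append(current)
            current, total = [], 0
        current.append(name)
        total += inrush_ma
    if current:
        slots.append(current)
    return slots


if __name__ == "__main__":
    # Hypothetical inrush currents in mA.
    domains = [("cpu", 300), ("gpu", 400), ("modem", 200), ("dsp", 100)]
    print(stagger_wakeups(domains, budget_ma=500))
```

The cost is visible right away: each extra slot adds its own settling time to the total wake-up latency, which is the tradeoff the paragraph above describes.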
What’s the fix? Moving some of the power management tasks into hardware can help. This is an updated chart with current figures comparing three approaches: an OS like Android handling it at the application processor level, putting it on a dedicated microcontroller core in software, and handling it in simpler hardware state machines. The advantage of the state machine approach is that it can be replicated, distributed, and tuned, applying different techniques where needed.
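For readers who live on the software side, a small state machine of the kind described here might be modeled as follows. This is a software model of the concept only, not Sonics’ hardware; the states, events, and transition table are hypothetical.

```python
from enum import Enum


class State(Enum):
    ON = "on"
    CLOCK_GATED = "clock_gated"
    OFF = "off"


# (current state, event) -> next state. Hypothetical transition table;
# tuning a controller amounts to editing its table.
TRANSITIONS = {
    (State.ON, "idle"): State.CLOCK_GATED,
    (State.CLOCK_GATED, "long_idle"): State.OFF,
    (State.CLOCK_GATED, "wake"): State.ON,
    (State.OFF, "wake"): State.ON,
}


class DomainController:
    """One small controller per power domain, so it can be replicated."""

    def __init__(self, name):
        self.name = name
        self.state = State.ON

    def handle(self, event):
        # Events with no table entry leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


if __name__ == "__main__":
    # Replicated and distributed: each domain gets its own instance.
    controllers = {name: DomainController(name) for name in ("gpu", "modem")}
    controllers["gpu"].handle("idle")
    controllers["gpu"].handle("long_idle")
    for name, ctrl in controllers.items():
        print(name, ctrl.state.value)
```

Because each controller is just a table and a state register, a local event can be handled without waking a processor, which is where the latency and power advantage would come from.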
Next is an eye chart for sure, but one most hardware teams have never seen – since it is the software guys who usually have to deal with it. There are a lot of spots where settling time figures in prominently. The message here is that if you have all day in a real-time context, a technique like adaptive VFS can save a lot of power, but it takes a lot of careful sequencing in software. When you get it wrong and miss a step somewhere, you end up with Batterygate.
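The sequencing problem can be sketched as an ordered list of steps, each with its own settling time, plus a check that nothing was skipped. The step names and settling times below are hypothetical and much simpler than the real chart; they are only meant to show why a missed step is easy to make and worth detecting.

```python
# Hypothetical power-up sequence: (step name, settling time in microseconds).
POWER_UP_SEQUENCE = [
    ("raise_voltage", 50.0),
    ("relock_pll", 20.0),
    ("release_isolation", 1.0),
    ("ungate_clock", 0.1),
]


def total_latency_us(sequence=POWER_UP_SEQUENCE):
    """Settling times are paid one after another, so they simply add up."""
    return sum(settle_us for _, settle_us in sequence)


def first_violation(executed):
    """Return the first required step that was skipped or run out of order.

    executed: list of step names in the order they were actually run.
    Returns None when the required sequence was followed; extra trailing
    steps are ignored in this sketch.
    """
    required = [name for name, _ in POWER_UP_SEQUENCE]
    for i, name in enumerate(required):
        if i >= len(executed) or executed[i] != name:
            return name
    return None


if __name__ == "__main__":
    print("total settling time (us):", total_latency_us())
    # A run that forgot to wait for the PLL to relock.
    print("missed:", first_violation(
        ["raise_voltage", "release_isolation", "ungate_clock"]))
```

A hardware sequencer bakes the required order in, so this kind of check becomes unnecessary rather than something software has to remember to do.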
By using hardware state machines to optimize entire sequences and provide significant savings, Sonics is looking to help designers capitalize on the opportunity to save power and not leave it to software that might miss something. What I’m anticipating when they release the ICE-Grain Power Architecture is a comprehensive tool to aid in inserting state machines and visualizing the sequences, but we have not seen the product just yet. They signed their lead customer last month and are prepping for a full release soon; we will know more by DAC 2016, I suspect.