Accuracy of In-Chip Monitoring for Thermal Guard-banding

by Daniel Payne on 01-28-2019 at 12:00 pm

I remember working at Intel and viewing my first SPICE netlist for a DRAM chip. It contained a temperature statement with a number after it, so being a new college graduate I asked lots of questions, like, “What is that temperature value?”

My co-worker answered, “Oh, that’s the estimated junction temperature of the chip.”

The next question was, “What do you mean estimated, don’t we know what the junction temperature actually is?”

With a slight grimace, the co-worker replied, “Well, we don’t know what the actual junction temperature is until we fabricate it, package it up, place it in the tester, then measure it. So we just estimate the junction temperature based on past experience, then put that number in SPICE.”

My naive engineering bubble had just burst, because I assumed that my professional colleagues in industry knew a lot more about the DRAM chips that they were designing than to simply guess at a temperature for use in SPICE, then hope for the best when silicon came back. Today, however, the IC design landscape has changed quite a bit, even to the point that engineers can place an actual IP block on their chip that will dynamically measure the local temperature in real time, aka In-Chip Monitoring.

Going back to the DRAM example at Intel, we first packaged the memory device in a rather expensive ceramic package with excellent thermal properties, then migrated to a cheaper plastic package with poor thermal properties to save cost. Knowing the junction temperature therefore made a huge difference to the operation of the DRAM and to the profit margin of our product.

Chips are being designed today across a wide range of process nodes, from the mature 40nm down to research nodes like 3nm, and at each node you have to keep your chip operating within a safe thermal limit in order to meet power and reliability requirements. Many design segments are limited by thermal considerations, including data center, IoT, consumer and automotive. If you can sense the die temperature and then manage the operation of the chip to keep it within thermal limits, you can save power and improve reliability.

Let’s consider using an in-chip thermal monitor where we have a target junction temperature of 85 degrees C. If the temperature sensor accuracy is plus or minus 5C, then our expected temperature range is 80C to 90C. When the worst-case lower point of 80C is reached, the chip could slow down a clock frequency or even reduce the Vdd level to one or more IP blocks. So, within software you may decide to trigger such actions at 80C, to be on the safe side. However, the 80C software limit is itself subject to the worst-case thermal sensor accuracy. Therefore, within the thermal guard-banding scheme, software thinks 80C is reached but the actual junction temperature could be as low as 75C.
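The guard-banding policy described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not any vendor's firmware: the `ThermalPolicy` class and its method names are hypothetical, and the only inputs are the 85C target and the plus or minus 5C accuracy from the example.

```python
from dataclasses import dataclass


@dataclass
class ThermalPolicy:
    """Hypothetical guard-banding policy for one on-die temperature sensor."""

    target_c: float    # junction temperature we must not exceed, in degrees C
    accuracy_c: float  # sensor accuracy, as plus or minus degrees C

    @property
    def trigger_c(self) -> float:
        # Act one accuracy band below the target, so a sensor that reads
        # low can never hide a die that is already at the target.
        return self.target_c - self.accuracy_c

    def should_throttle(self, reading_c: float) -> bool:
        # Software only sees the sensor reading, never the true temperature.
        return reading_c >= self.trigger_c

    def lowest_actual_at_trigger(self) -> float:
        # When the sensor reports trigger_c it may be reading high by a
        # full accuracy band, so the die could really be this cool.
        return self.trigger_c - self.accuracy_c


policy = ThermalPolicy(target_c=85.0, accuracy_c=5.0)
print(policy.trigger_c)                   # 80.0
print(policy.should_throttle(80.0))       # True
print(policy.lowest_actual_at_trigger())  # 75.0
```

In a real system the throttling action (lowering a clock frequency or a Vdd level) would hang off `should_throttle`; that part depends entirely on the chip and is left out here.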

[Figure: thermal guard-band]

In comparison, what happens if we instead use a temperature sensor with a tighter accuracy of plus or minus 2C?

The good news is that this more accurate temperature sensor has a tighter range of 83C to 87C, so the software limit moves up to 83C and, with guard-banding, the actual junction temperature at that limit could be as low as 81C. The difference between the first sensor’s lower limit and the second’s is then 81C – 75C = 6C. That 6C difference means a lot, and could translate into between 5W and 10W of power savings, depending on the architecture.
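As a quick check of those numbers, the comparison between the two sensors can be written out as a short Python sketch. The function names are made up for illustration; the model is simply the one used in the text, where the worst-case actual temperature at the software limit sits two accuracy bands below the target.

```python
def lowest_actual_at_trigger(target_c: float, accuracy_c: float) -> float:
    """Coolest the die could really be when software decides to act.

    One accuracy band sets the software limit below the target, and a
    second band covers a sensor that reads high at that limit.
    """
    return target_c - 2 * accuracy_c


def recovered_headroom(target_c: float, coarse_c: float, fine_c: float) -> float:
    """Extra degrees C of usable range gained by the more accurate sensor."""
    return (lowest_actual_at_trigger(target_c, fine_c)
            - lowest_actual_at_trigger(target_c, coarse_c))


print(lowest_actual_at_trigger(85.0, 5.0))  # 75.0
print(lowest_actual_at_trigger(85.0, 2.0))  # 81.0
print(recovered_headroom(85.0, 5.0, 2.0))   # 6.0
```

Under this model the recovered headroom is always twice the improvement in accuracy, which is why going from plus or minus 5C to plus or minus 2C yields 6C. How many watts that is worth depends on the architecture and is not something the sketch tries to estimate.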

When talking about a consumer hand-held device running on battery power, that 5W-10W savings means longer battery life, a real benefit. At the other end of the electronics power spectrum, in a data center or telecom system, the savings would show up in system energy consumption, speed and data throughput. In automotive, the benefit is tighter reliability management of the semiconductor device.

Moortec is a UK-based IP supplier that focuses on in-chip monitoring of temperature, and its CEO, Stephen Crosher, recently made a video on this thermal topic. Stay tuned for a video series from Moortec, because they also have IP sensors for Process and Voltage, the other parts of the PVT troika.
