
Intel’s 18A rumors meet a thermal brick wall says SemiWiki

IMEC semiconductor roadmap to 2039 assumes BSPD from 2025 on.

Did IMEC not consider the self-heating issue?
You should know that imec itself is not a company that commercializes chip manufacturing; I've worked there, in the lithography department. It's a research institute that develops the building blocks for scaling microelectronics manufacturing. It would surprise me if they didn't have research on new ways of cooling chips.
In the end it is up to the IDMs and foundries to put those building blocks together into a commercial offering or product. Over the last 10 years TSMC seems to have been better at this than anyone else.
 
And TSMC has long been a partner in the big imec research programs, while Intel for the longest time thought it could do everything on its own...
 
And it's all getting gradually worse as processes evolve, because power/current density per mm2 keeps increasing as more functionality, clocked faster, is squeezed into a smaller and smaller area. FinFET had worse self-heating (SHE) than planar, GAA is worse again -- and BSPD throws more fuel onto the fire, since it increases both density and clock speeds while adding extra thermal resistance to the heatsink and poorer heat spreading, which makes hotspots worse.
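To put the density trend in rough numbers -- a toy sketch, where every capacitance, voltage, frequency and area value below is invented, not real node data: dynamic power is roughly P = alpha * C * V^2 * f, so if a shrink cuts area faster than it cuts switching energy, W/mm2 climbs even when total power stays flat.

```python
# Toy power-density sketch (all numbers are made-up placeholders, not node data):
# dynamic power P = alpha * C * V^2 * f, power density = P / area.

def power_density(alpha, c_total_farads, v_volts, f_hz, area_mm2):
    """Dynamic power density in W/mm^2 for a block of logic."""
    p_watts = alpha * c_total_farads * v_volts**2 * f_hz
    return p_watts / area_mm2

# "Old node": 1 nF of switched capacitance, 0.9 V, 3 GHz, 10 mm^2
old = power_density(0.2, 1e-9, 0.9, 3e9, 10.0)
# "New node": area halves, capacitance only drops ~30%, V drops ~10%, f up ~10%
new = power_density(0.2, 0.7e-9, 0.81, 3.3e9, 5.0)

print(f"old: {old:.3f} W/mm^2, new: {new:.3f} W/mm^2, ratio: {new/old:.2f}x")
```

Even with total chip power barely moving, the hot spots get hotter because the same watts come out of half the area.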

Is there any future possibility of running certain circuits without BSPD and the rest of the chip with BSPD to try to get better cooling in critical spots, or is that not possible due to yields, design rules, etc.?

For example, on 18A -- SRAM cells do not have BSPD because they would actually lose about 10% density. (I've also read claims 14A makes some changes to how BSPD is connected to the transistors that should allow more scaling with BSPD). In the 18A SRAM case, would there be some 'filler' material in place of the PowerVIAs that allows better heat conduction?
 
I'd have thought it's pretty much impossible, because the BSPD and FSPD regions have very different metal/substrate topography and the process flow is also different -- also FSPD has the power and I/O bumps on the topside and BSPD has them on the backside. This also makes improved cooling impossible since the chips are the other way up -- FSPD has the substrate at the top (good cooling) with all metal underneath the transistors and then bumps, BSPD has the thin metal/dielectric layers at the top (bad cooling), then the transistors and thin (<1um) substrate, then thick metal and bumps at the bottom. The two are fundamentally incompatible.

Having filler instead of powerVIAs in the SRAM region doesn't help with cooling, it's the wrong side of the chip (bump side not heatsink side). Having said that, SRAM is probably not such a problem for cooling anyway, at least in the memory array area, since power/current density here is very low.
 
Pretty much word-for-word what I've been saying -- their detailed simulations predicted maximum on-chip temperature 23C hotter for BSPDN (1um to 3um square areas, Tj=80C instead of 57C -- but no details about heatsinking assumptions...) which is *very* close to the handwaving estimate of 20C for these sizes I gave earlier in the thread... :-)

All of which justifies TSMC's recommendation of BSPD for "HPC-class devices with dense power grids and active cooling", and shows the problem with using Intel 18A/14A for ASICs which don't fall into this category -- which is to say, most of them (by number of designs)... ;-)
 
How does Panther Lake, a highly efficient mobile device with only modest cooling, handle this properly? It seems to be a knock-out hit.

Is it just that the compute chiplet is so small and low power that this issue doesn't come up?
 
That's one possible reason, though not a likely one, because it's local power density that matters, not chip size.

Another is that it only gets hot for short periods at full load, so any reduction in lifetime doesn't matter -- or even that nobody will care if it dies after a few years. And let's face it, it's not like Intel have any choice; it's BSPD or nothing for them. Plus it bumps up the maximum clock rate a bit, which is what they care about, because -- well, data sheets and benchmarks, hence the "knock-out hit" comment.

Like I keep saying, for chips where the advantages of BSPD outweigh the disadvantages it's the right choice. For all the others -- which today is the majority of foundry chips -- FSPD is the better option.
 
Well, on laptops I can't say it only boosts for short periods, because depending on what the user is doing it has to maintain the clock or it loses performance. If the user is playing video games it needs to hold a decent clock speed for quite a long time, otherwise you'll notice stutter or low FPS if the clock speed isn't stable. Also, some benchmarks are more demanding than others.
 
To add: in the case of Panther Lake, 18A+BSPD didn't actually raise the max frequency over Lunar Lake or even Meteor Lake -- all three (Intel 4, TSMC N3B, Intel 18A) max out at 5.1 GHz. However, it looks like 18A+BSPD did pay off in lower idle power and reduced energy use under load (comparing CPU tile vs CPU tile). Real-world usage testing confirms fantastic efficiency for PTL, also ahead of AMD's N4 offerings.

(Also context - Raptor Lake refresh U/P parts topped out at 5.4 GHz as a comparison point).
 
You're misunderstanding what "short periods" means here (for hot-spots) -- it's not how long the "fast/hot/high-voltage" phase lasts (seconds or minutes), it's what percentage of the device lifetime is spent in this condition (because high temperatures and high voltages degrade reliability as well as performance).

Laptop/mobile CPUs spend most of their lifetime doing very little, with short bursts of intense activity -- in this sense, an hour once a day (or week) is "short". Server CPUs/NPUs/switches spend most of their time hammering away at data 24/7/365, because anything else wastes money -- and the DC owner has thousands of these, so will see any burnout failures. That's one reason why server CPUs run much slower and at lower voltages than consumer ones...
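The "percentage of lifetime" point can be sketched with a toy Arrhenius-style aging model -- the activation energy, temperatures and duty cycles below are all illustrative, not measured values for any real process.

```python
import math

# Toy Arrhenius aging sketch (Ea, temperatures, duty cycles are illustrative).
# Degradation accelerates exponentially with junction temperature, so what
# matters for lifetime is the *fraction of life* spent hot, not how long
# each individual hot burst lasts.

K_B = 8.617e-5  # Boltzmann constant, eV/K

def accel(t_hot_c, t_ref_c, ea_ev=0.7):
    """Arrhenius acceleration factor of running at t_hot vs t_ref (Celsius)."""
    t_hot, t_ref = t_hot_c + 273.15, t_ref_c + 273.15
    return math.exp(ea_ev / K_B * (1 / t_ref - 1 / t_hot))

af = accel(100, 50)                 # hot bursts vs cool baseline

laptop_duty = 1 / 24                # ~1 hour a day flat out
server_duty = 0.9                   # hammering away almost all the time

laptop_aging = laptop_duty * af + (1 - laptop_duty)
server_aging = server_duty * af + (1 - server_duty)

print(f"acceleration at 100C vs 50C: {af:.0f}x")
print(f"effective aging: laptop {laptop_aging:.1f}x, server {server_aging:.1f}x")
```

Same burst temperature, wildly different effective wear -- which is the whole reason the duty cycle, not the burst length, is what counts.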
 
That may be partly due to the process: in a CPU like this, BSPD will give more speed at the same power, or lower power at the same speed, than FSPD. And if you have a faster process you can always decide whether to increase the clock rate (keep a similar gate structure) or reduce power while keeping the same clock rate (smaller gates) -- don't forget that all this also changes with voltage and transistor type, it's a *really* complex optimization problem...

But it can also be down to the CPU design/architecture or power/voltage control strategy, which changes for each new CPU. Without having the same CPU in both BSPD and FSPD processes (never gonna happen...) there's no way to tell how much is due to which -- but as an example BSPD will have the biggest effect on power in the high-power state (when processing), other CPU design aspects like sleep modes or performance/power core optimization will have the biggest power saving impact in the low-power states where typically most of the time is spent.
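As a toy illustration of that speed-vs-power choice -- using an alpha-power delay law with invented constants, nothing here is real 18A or N3B data -- recovering some supply droop can be cashed in either as clock speed or as a roughly quadratic power saving:

```python
# Toy alpha-power-law sketch (Vt, alpha, Vdd and droop values are invented).
def fmax(v_eff, vt=0.35, alpha=1.3):
    """Relative max clock rate for effective transistor voltage v_eff."""
    return (v_eff - vt) ** alpha / v_eff

vdd = 0.90
droop_fspd, droop_bspd = 0.050, 0.020   # assumed IR droop in volts

# Option 1: keep Vdd, pocket the recovered droop as clock speed.
f_gain = fmax(vdd - droop_bspd) / fmax(vdd - droop_fspd)

# Option 2: lower Vdd by the recovered 30 mV, keep the same clock,
# and take the roughly quadratic dynamic-power saving instead.
p_save = 1 - ((vdd - 0.030) / vdd) ** 2

print(f"clock gain at same Vdd: {(f_gain - 1) * 100:.1f}%")
print(f"power saving at same clock: {p_save * 100:.1f}%")
```

Either knob is available to the designer, which is exactly why comparing finished CPUs tells you little about the process on its own.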
 

Agreed -- I primarily wanted to point out that Intel did not actually use BSPD to raise clocks further. Though given how many cores are idling these days even when doing "multiple tasks", I would think the voltage-stability opportunities provided by BSPD at idle/sleep should help increasingly too?

The efficiency piece is interesting too as I've seen some conjecture that TSMC N3 is a more efficient process than 18A, but the 18A Panther Lake is definitely exceeding the efficiency of previous-gen Intel products on N3B. (Definitely not apples to apples here, but worth noting).
 
Not sure what you mean by this -- estimates are that just the lower power mesh impedance / access resistance from BSPD gives a speedup of a few percent, but mostly at high voltages/currents i.e. maximum clock rate. Once you lower the voltage/current/clock rate this gain decreases because the voltage drops become smaller.

At the low voltages typically used for idle/sleep the benefit is negligible -- the limit here is that if you drop the voltage too close to transistor threshold voltages the timing mismatches due to random transistor variation go through the roof, so timing closure becomes more and more difficult. The tools then force you to add extra ultra-low-voltage timing margins which increase power consumption, negating some of the saving and making things worse (lower speed, higher power) at high voltages/clock rates.
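The droop-scales-with-current point in rough numbers -- the grid resistances and currents below are invented purely for illustration:

```python
# Toy IR-droop sketch: droop = I * R_grid, so the voltage a lower-impedance
# BSPD grid recovers scales with current draw (all numbers illustrative).

R_FSPD, R_BSPD = 0.0020, 0.0008  # assumed effective grid resistance, ohms

recovered = {}
for label, amps in [("max clock", 60.0), ("mid load", 10.0), ("idle", 0.5)]:
    recovered[label] = amps * (R_FSPD - R_BSPD)  # volts of droop removed
    print(f"{label:>9}: {recovered[label] * 1000:.1f} mV recovered")
```

Tens of millivolts at full tilt, effectively nothing at idle -- which is why the BSPD advantage lives almost entirely in the high-power corner.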

So there's a complex tradeoff to make, depending on the use case of the chip: things you do to save idle/sleep power tend to increase active power and vice versa -- so you also need to know how much time the chip is expected to spend in which mode.

For a CPU with lots of different modes and dynamic clock rate/voltage control this is a horrendously complex optimization problem, and it's a bit like whack-a-mole -- something you do to push down power in one mode is likely to push it up in a different mode. And if you compare different CPUs with different design choices, which one wins will depend on what the chip is doing...
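A toy mode-weighted comparison (all numbers invented) shows why the "winner" flips with the duty cycle:

```python
# Toy mode-weighted power comparison (all numbers invented): design choices
# that shave idle power often cost a little active power, so which design
# "wins" depends entirely on how the chip spends its life.

def avg_power(p_active, p_idle, duty_active):
    """Lifetime-average power given the fraction of time spent active."""
    return duty_active * p_active + (1 - duty_active) * p_idle

# Design A: tuned for active power.  Design B: extra low-V margin trims
# idle power but adds a little active power.
for duty in (0.05, 0.50):
    a = avg_power(5.0, 0.10, duty)
    b = avg_power(5.5, 0.05, duty)
    winner = "A" if a < b else "B"
    print(f"{duty:.0%} active: A={a:.3f} W, B={b:.3f} W -> {winner} wins")
```

Mostly-idle favours B, mostly-active favours A -- whack-a-mole in two lines of arithmetic.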
 
Speaking as someone who has actually done thermal analysis comparisons of FSPD vs. BSPD in real high-performance chips, as opposed to high-school posting about copper thermal conductivity...

In a conventional FSPD chip the die is face down, the transistors are embedded in (or thermally directly connected to) a thick (typically 700um) silicon substrate, which has about half the thermal conductivity of copper. This conducts heat to the back of the die where the heatsink contacts it, as a consequence something like 90% or more of the heat leaves the chip this way. Even with this you get significant hotspots in regions with high power density, either at block level or all the way down to gate level, typically the hottest devices run maybe 20C hotter than the die backside -- the substrate also wicks heat away laterally (heat spreading) which helps keep temperatures down.
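For scale, that ~20C hotspot figure can be ballparked with the classic spreading-resistance formula R = 1/(4*k*a) for a small circular source on a thick substrate -- the 10 mW hotspot power below is an assumed illustrative value, not from any real chip:

```python
import math

# Back-of-envelope hotspot sketch: spreading resistance of a small circular
# heat source on a thick substrate is roughly R = 1/(4*k*a), a = radius.
# The hotspot power and sizes below are invented, just to show the scale.

K_SI = 150.0  # W/(m*K), bulk silicon (roughly half that of copper)

def hotspot_rise(power_w, side_um, k=K_SI):
    """Temperature rise of a square hotspot, treated as a circle of equal
    area sitting on a thick spreading substrate."""
    a = math.sqrt((side_um * 1e-6) ** 2 / math.pi)  # equivalent radius, m
    return power_w / (4 * k * a)

for side in (1.0, 2.0, 3.0):
    print(f"{side:.0f} um hotspot at 10 mW: dT = {hotspot_rise(0.010, side):.1f} C")
```

Micron-scale sources at milliwatt powers land in the tens-of-degrees range, consistent with the hand-waving earlier in the thread -- and note this spreading relief is exactly what a thinned BSPD substrate mostly loses.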

It's worth noting that the hotspot problem with BSPD is likely to get worse in future, because with every new process node power density per mm2 tends to creep up -- and stacked CMOS will make it worse still, especially for the "bottom" transistor. It should still be fine for HPC with nice cold liquid cooling (which will itself get more challenging...), but even worse for everyone else without such a luxury... :-(

P.S. All the numbers above are examples for high-speed high-power-density circuits -- the actual temperature rises will be smaller for a lot of chips, but the difference will still be there... ;-)
P.P.S. TSMC will do a BSPD version of A14 about a year later, as a next-node follow-on to A16 for the same markets (e.g. HPC)
P.P.P.S. This is also the sensible order, FSPD first to MP to drive down the D0 curve, then BSPD for the reticle-sized HPC chips which really *need* super-low D0...
Thanks, really helpful.
Miners are most likely using liquid cooling too.
 