
GAA, Backside Power Delivery and the Secret Plan of Intel by Anastasia

Daniel Nenni

The world runs on silicon chips. Almost all of the world’s chip supply today—about 90% of it—comes from TSMC fabs. They started with 3-micron technology in 1987 and have now progressed to mass-producing chips at 3nm. Recently, they announced new technology that enables chips at 1.6nm.


Source: TSMC
These new transistors involve two interesting innovations: a novel transistor architecture and backside power delivery. The latter has never happened before: a separation of the power interconnect from the signal interconnect. As a chip designer, I can tell you that this is a big deal for the entire industry. To understand the complete picture, let’s start with transistors first.

Transistor Evolution

All modern computer chips are made up of transistors—tiny electrical switches that can be turned on and off. This is what a classical planar transistor looks like. It contains a gate, a source, and a drain arranged in a two-dimensional plane.


Planar transistor

The device is controlled by the gate: when we apply a certain voltage, or more specifically a certain electric field, to the gate, a conductive channel forms and current flows from the source to the drain.

As planar transistors have been scaled down, we’ve shrunk the size of the transistor, specifically the channel. Here we faced many problems, with excessive leakage being just one of them. Eventually, the solution was to completely change the transistor—from planar 2D transistors to three-dimensional FinFET transistors. Basically, they took a planar transistor and stretched the channel up as a vertical fin.


Planar vs FinFET vs GAA transistors. Source: Samsung

While in a planar transistor the conductive channel is only on the surface, with FinFET we have a conductive channel on three sides, while the gate is wrapped around it. Compared to the original planar transistor, FinFETs are more compact, so with FinFET we are now able to pack more transistors onto the same silicon die.

The first commercial FinFET devices were introduced by Intel in 2011, when I was still in university. A few years after Intel’s first FinFET device, Samsung and TSMC started fabricating 16nm and 14nm FinFET chips. Since then, TSMC has led the evolution of FinFET. Nowadays, all the cutting-edge chips are built with FinFET. For example, the latest AMD and Apple chips use 5nm or 3nm FinFET technology.

Gate-All-Around

However, FinFET technology has already reached its limits in terms of how much further it can be shrunk, how tall the fins can go, and how many fins can be placed side by side. Once again, high leakage has become a huge problem. Hence, to further shrink transistors and drive down costs, the whole industry is now moving to the new Gate-All-Around (GAA) transistor architecture.

I’ve talked about it for years now, but it’s finally going into mass production. TSMC will shift to GAA technology for their N2 process node. They call it a “nanosheet transistor”, but at its core it’s the same concept under a different name. TSMC plans to begin production of chips based on GAA technology in early 2025, with the first ones expected to appear in iPhones.



Basically, they took the FinFET structure and turned it horizontally, placing several of these sheets on top of each other so that we can multiply the number of fins vertically. The best part is that the gate is completely wrapped around the channel, allowing us to better control it. With this innovation, we can slightly reduce operating voltage and significantly reduce leakage current. This will give us about 15% improvement in speed and transistor density, but the biggest gain with this technology is in power efficiency. GAA transistors consume up to 35% less power than FinFET technology, and this is huge. This is crucial for applications like mobile chips, where it could significantly prolong battery life, or for AI or HPC applications, which are usually very densely packed and power-hungry.
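The power claim follows from the first-order CMOS dynamic power relation, P ≈ C·V²·f: because voltage enters squared, even a small supply-voltage reduction pays off disproportionately. Here is a quick Python sketch of that arithmetic; the voltages below are illustrative assumptions, not TSMC or Samsung figures.

```python
# First-order CMOS dynamic power: P ~ C_eff * Vdd^2 * f.
# The voltages below are made-up illustrative values, not vendor data.

def dynamic_power(c_eff, v_dd, freq_hz):
    """Switching power of a logic block, to first order."""
    return c_eff * v_dd**2 * freq_hz

p_finfet = dynamic_power(c_eff=1.0, v_dd=0.75, freq_hz=3.0e9)
p_gaa    = dynamic_power(c_eff=1.0, v_dd=0.65, freq_hz=3.0e9)

saving = 1 - p_gaa / p_finfet
print(f"Dynamic power saved by a 0.10 V Vdd reduction: {saving:.0%}")
```

In this toy setup a roughly 13% voltage cut already buys about 25% of the dynamic power; add the leakage reduction from the fully wrapped gate, and overall figures in the 30%+ range become plausible.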

Backside Power Delivery

Earlier this month, TSMC debuted A16 technology on their roadmap, where 'A' stands for angstrom. TSMC’s A16 technology will be based on nanosheet transistors with one very interesting twist—backside power delivery. This innovation will be a game-changer in terms of power efficiency—let me explain.



Ever since Robert Noyce made the first integrated circuit, everything has been located on the top, on the front side of the wafer, with all the signal interconnect and power delivery coming from the front side.

Backside power delivery is a huge change because we will move power lines underneath the substrate to free additional space for routing on the top. When it comes to modern chips, there are billions of transistors interconnected with each other, so there are many levels of signal interconnect going over the chip. At the same time, there is a power mesh on top, which is a network of power and ground lines that distributes power across a semiconductor chip and provides the power supply to the transistors. Currently, all of the interconnect and power delivery come from the top, in different metal layers. Now imagine moving all the power to the backside: this will massively reduce the complexity of the wiring, letting us place and route transistors more densely and reducing congestion.
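There is also an electrical payoff: thicker, dedicated backside rails have lower resistance, which reduces IR drop in the power mesh. The resistances below are invented round numbers purely to illustrate Ohm's law at work, not measured values for any process.

```python
# Toy IR-drop comparison. Backside power rails can be thick and short,
# while frontside power must thread down through many thin metal layers,
# so the frontside path resistance is higher. All numbers are assumptions.

def ir_drop(current_a, resistance_ohm):
    """Ohm's law: voltage lost across the power delivery path."""
    return current_a * resistance_ohm

r_frontside = 0.020   # ohms, hypothetical frontside stack
r_backside  = 0.005   # ohms, hypothetical thick backside rail
i_load      = 10.0    # amps drawn by a logic block

print(f"Frontside drop: {ir_drop(i_load, r_frontside) * 1000:.0f} mV")
print(f"Backside drop:  {ir_drop(i_load, r_backside) * 1000:.0f} mV")
```

With supply voltages now well under a volt, recovering a couple of hundred millivolts of margin like this is significant headroom for the designer.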



Frontside vs Backside Power Delivery. Source: Intel
This concept of separating power from signals will give more freedom to the routing Electronic Design Automation (EDA) tools. This change will not only affect the manufacturing flow but also the chip design itself. It will require a lot of learning throughout the flow, especially when it comes to the power mesh and heat dissipation.

TSMC will start producing chips based on A16 technology in 2026. I’m really looking forward to seeing how it goes. Of course, TSMC is not the only one working on this innovation. Intel is also trying to regain its position in the chip manufacturing race by working on backside power delivery as well as other upgrades.


Intel’s Moonshot

I want to spend some time discussing Intel’s ambitions because there are several interesting aspects to this story. For the past five years, Intel has lagged behind TSMC and Samsung in advanced chip manufacturing. But now, they plan to be the first, even ahead of TSMC, to bring new transistor and power-delivery technology into production.

For Intel, GAA technology and backside power delivery are coming together in the 20A process node. They are now putting the final touches on it. This 20A node is crucial for Intel. It's a risky move for Intel because typically, you want to introduce innovations one by one to understand where the problems are coming from. Introducing two new technologies at once means Intel is going “all in.” This is clearly a “moon shot” for Intel, with a lot of risk, because the probabilities multiply.
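The "probabilities multiply" point is simple arithmetic. The 0.8 figures below are invented for illustration, not estimates of Intel's actual odds:

```python
# If each new technology independently has some chance of hitting its
# yield target on schedule, the chance that BOTH do is the product.
# Both probabilities here are placeholder assumptions.

p_gaa_ok      = 0.8   # assumed odds RibbonFET ramps on schedule
p_powervia_ok = 0.8   # assumed odds PowerVia ramps on schedule

p_both = p_gaa_ok * p_powervia_ok
print(f"Chance both land on schedule: {p_both:.0%}")
```

Two individually decent bets combine into a noticeably worse one, which is exactly why process teams usually stage innovations one node apart.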

Interestingly, in the past, Intel used to be conservative while TSMC took more risks. This time, it’s the other way around. Intel needs to secure large buyers to reach high volume and make the economics work, because chip manufacturing relies on economies of scale.


Intel’s Five Nodes in Four Years Promise. Source: Intel

In 2021, Intel CEO Pat Gelsinger promised investors and customers five nodes in four years. They have to deliver this time. They currently have Intel 4 and Intel 3 FinFET technologies in production and plan to mass-produce Intel 20A by the end of 2024. Arrow Lake will be the first Intel CPU to feature GAA (they call it RibbonFET) transistors and backside power delivery, which Intel calls PowerVia.

Intel 14A and New High-NA EUV

The most interesting milestone on Intel’s roadmap is the 14A process node, planned for 2027. This involves a significant update: using the new High-NA EUV lithography machines from ASML, each costing $380M. This comes with a lot of risk. Apart from the risks associated with new tooling, the economics of High-NA haven't worked so far.


High-NA EUV machine. Source: ASML

In the competition between TSMC and Intel for sub-3nm nodes, it comes down to who can produce it first with good yield and at the minimum cost. High-NA EUV machines are not yet economically viable, with a high price per wafer. This is why TSMC is passing on this machine for now.
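A back-of-envelope amortization model shows why tool price and throughput dominate the High-NA debate. Everything here except the $380M list price mentioned above is an assumed round number (tool lifetime, wafers per hour, uptime, and the price of a 0.33-NA tool), not ASML or TSMC data.

```python
# Rough litho cost per wafer exposure: amortize the tool price over the
# wafers it can expose in its useful life. Lower throughput on High-NA
# raises the cost of every pass. All inputs are illustrative assumptions.

def litho_cost_per_wafer(tool_price, years, wph, uptime=0.8):
    """Amortized tool cost per wafer exposure."""
    wafers = wph * 24 * 365 * years * uptime
    return tool_price / wafers

euv     = litho_cost_per_wafer(tool_price=180e6, years=7, wph=160)
high_na = litho_cost_per_wafer(tool_price=380e6, years=7, wph=120)

print(f"0.33-NA EUV:  ${euv:.0f} per wafer pass")
print(f"High-NA EUV: ${high_na:.0f} per wafer pass")
```

Under these assumed inputs, a High-NA pass costs nearly three times a standard EUV pass, so it only pays off if it replaces multiple multi-patterning steps.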

Directed Self-Assembly

At the moment, with High-NA EUV machines, the lithography process takes more time per wafer. This limits fab throughput and drives costs up. To make the economics work, Intel plans to use directed self-assembly (DSA). To put it simply, the wafers are covered with PMMA (poly(methyl methacrylate)) and baked. In this process, the polymer materials self-organize into tiny lines. Research suggests that EUV machines can help guide this process on the wafer. However, this approach has been in the research phase for at least a decade due to high defect rates.



"Cellular automata method for directed self-assembly modeling", Matyushkin, Litavrin, 2019
I’m rooting for Intel here, but given the innovations Intel is trying to pull together in the next few years, the risks are high. Let me know what you think in the comments. I hope Intel can make it happen because I love what they’ve done for the industry. If they manage to achieve a decent yield, especially with the 14A node, it will be a pivotal moment in Intel’s history, attracting some of the biggest customers and boosting their stock.


Source: ASM

The advanced FinFET and GAA transistor architectures wouldn't be possible without ASM's equipment and process technology. Let me explain.

As we scale transistors down, we need to deposit ultra-thin layers, making precise techniques like Atomic Layer Deposition (ALD) essential. ALD allows the deposition of materials on the wafer atom by atom, creating layers just one atom thick. Leading fabs like TSMC and Intel use ALD machines from ASM for this purpose. ASM, a Dutch semiconductor equipment company, is a pioneer in ALD technology with a 55% global market share. Learn more about ASM and their products here. Thank you, ASM, for sponsoring this edition of Deep In Tech Newsletter.
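Because each ALD cycle is self-limiting, film thickness grows essentially linearly with cycle count, which makes process budgeting simple. The growth-per-cycle value below is a typical order-of-magnitude assumption (about one angstrom per cycle), not a spec for any particular ASM tool or chemistry.

```python
import math

# Simple ALD budgeting sketch: thickness is linear in the number of
# self-limiting deposition cycles. Growth-per-cycle is an assumed value.

def ald_cycles(target_nm, growth_per_cycle_nm=0.1):
    """Cycles needed to reach a target film thickness."""
    return math.ceil(target_nm / growth_per_cycle_nm)

# Budget for a hypothetical ~2 nm high-k gate dielectric.
print(ald_cycles(target_nm=2.0))
```

The flip side of this atom-by-atom precision is time: hundreds of cycles per film is one reason ALD steps are reserved for the layers that truly need them.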

 
The point missing from all of Intel's presentations about PowerVia (and most of TSMC's) is that the performance/area improvement depends heavily on power grid density: there's only a big increase for high-speed, high-power-density chips with a heavy power grid that takes up a lot of space. That's the case for Intel CPUs, but not for a lot of ASICs.

There's also a wafer cost increase (according to TSMC) which is considerably bigger than the area saving even for high-power-density devices, which means cost per gate will be higher than with frontside power.

And finally TSMC only recommend using this for "actively cooled" devices, meaning ones where the heatsink temperature can be kept down, for example by liquid cooling. This is because the heat path is now up through the oxide layers (high thermal resistance) not down through the substrate (low thermal resistance), which means the transistors and fine-pitch metal layers (where EM problems are) run much hotter for a given heatsink temperature.

So BSP is great for some applications but not all, and may be impossible to use on ones with less effective cooling.
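The thermal argument above can be made concrete with a 1-D conduction model, R = t / (k·A). The geometry below is invented and the conductivities are textbook-order values, but the asymmetry is the point: a few microns of oxide can out-resist a hundred microns of silicon.

```python
# Rough 1-D thermal sketch of the point above (geometry is invented;
# conductivities are approximate bulk values): heat leaving through a
# thin SiO2/metal stack sees far more thermal resistance per micron
# than heat leaving through bulk silicon.

def r_thermal(thickness_m, conductivity_w_mk, area_m2):
    """1-D conduction resistance: R = t / (k * A)."""
    return thickness_m / (conductivity_w_mk * area_m2)

area = 1e-4                  # 1 cm^2 die
k_si, k_sio2 = 150.0, 1.4    # W/(m*K), approximate bulk values

r_substrate = r_thermal(100e-6, k_si, area)   # 100 um thinned substrate
r_beol      = r_thermal(5e-6, k_sio2, area)   # ~5 um oxide/metal stack

print(f"Through substrate:  {r_substrate:.4f} K/W")
print(f"Through BEOL oxide: {r_beol:.4f} K/W")
```

Even though the oxide stack in this toy setup is 20x thinner than the substrate, its thermal resistance comes out roughly 5x higher, which is why the transistors run hotter for a given heatsink temperature when the heat must exit through the frontside.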
 
The point missing from all of Intel's presentations about PowerVia (and most of TSMC's) is that the performance/area improvement depends heavily on power grid density: there's only a big increase for high-speed, high-power-density chips with a heavy power grid that takes up a lot of space. That's the case for Intel CPUs, but not for a lot of ASICs.

There's also a wafer cost increase (according to TSMC) which is considerably bigger than the area saving even for high-power-density devices, which means cost per gate will be higher than with frontside power.

And finally TSMC only recommend using this for "actively cooled" devices, meaning ones where the heatsink temperature can be kept down, for example by liquid cooling. This is because the heat path is now up through the oxide layers (high thermal resistance) not down through the substrate (low thermal resistance), which means the transistors and metal layers run much hotter for a given heatsink temperature.

So BSP is great for some applications but not all, and may be impossible to use on ones with less effective cooling.

Nice explanation. TSMC did say that their version of BSPD (SPR) is for HPC applications with complex signal routes and a dense power delivery network, which suggests temperature control. It is a pretty big market; it will be interesting to see which fabless companies give it a try. Clearly customers are asking, otherwise TSMC would not do it.
 
Nice explanation. TSMC did say that their version of BSPD (SPR) is for HPC applications with complex signal routes and a dense power delivery network, which suggests temperature control. It is a pretty big market; it will be interesting to see which fabless companies give it a try. Clearly customers are asking, otherwise TSMC would not do it.
There's no doubt that TSMC will get customers for A16, including some big ones, for whom it has significant advantages. I'm just trying to calm down the over-enthusiastic reception in some quarters, where people think it's a sexy new technology that is going to take over the silicon world and make things much better for *everyone*, because they only hear what Intel is saying about BSP and see the headline "best-case" figures without any qualification -- including cost and thermals... ;-)
 
The point missing from all of Intel's presentations about PowerVia (and most of TSMC's) is that the performance/area improvement depends heavily on power grid density: there's only a big increase for high-speed, high-power-density chips with a heavy power grid that takes up a lot of space. That's the case for Intel CPUs, but not for a lot of ASICs.
Losing the power rails (and not having a keep-out zone for the TSV/BPR) should offer a big density improvement. Of course that only matters if the FEOL is small enough for this to matter, and with GAA having a lower footprint at iso-drive, a well designed standard cell should see a significant area reduction. If not, perhaps we see A14 adopt a narrower 4-sheet device for better short channel effects/to catch the FEOL scaling up to the BEOL scaling? Of course the benefits of BSPD can be blunted, using N3 as an example, due to their gate endcaps (especially on SRAM bitcells, where there are also just fewer power rails to remove). N3's 2-1 mixed height library would presumably also see less benefit than the 2-2 or 2-3, as the 1-fin transistor appears to share power delivery with the 2-fin device. Since TSMC said N2 would have mixed row, I would assume the latter issue will blunt some of the area benefits for A16, and of course it remains to be seen if they will keep their current endcap architecture or move to something else with N2 or N3E.
There's also a wafer cost increase (according to TSMC) which is considerably bigger than the area saving even for high-power-density devices, which means cost per gate will be higher than with frontside power.
I have no clue how it is even possible for their backside contact to be more than a 10% wafer cost increase, but if that is what they say, I guess that is what it is... Regardless of the magnitude of the cost increase, that would indicate that TSMC isn't using the old M2/3 pitch or something as the new M0, but this also seems at odds with how low a density they are projecting. Assuming TSMC still does 6-track short libs, then they can lower that to a minimum of 4 track (assuming the FE can accommodate). What strikes me as the most likely explanation might just be that TSMC is allowing wide nanosheets for their HD lib. If this is the case though, I have to wonder why they didn't just take a page from intel's playbook and increase metal pitches to get a wafer cost reduction? The whole point of more density is better cost per FET. So if you weren't getting it, why not just move to a same or slightly smaller cell height with coarser metal lines?
And finally TSMC only recommend using this for "actively cooled" devices, meaning ones where the heatsink temperature can be kept down, for example by liquid cooling. This is because the heat path is now up through the oxide layers (high thermal resistance) not down through the substrate (low thermal resistance), which means the transistors and metal layers run much hotter for a given heatsink temperature.

So BSP is great for some applications but not all, and may be impossible to use on ones with less effective cooling.
It would seem intel's design enablement team is unsurprisingly ahead of TSMC's when it comes to supporting BSPD chip designs, because intel said they were able to find some tricks to have matched thermals between intel 4 and intel 4 + powerVIA. Although this might just be an artifact of N2 having a higher thermal density than intel 4 making the thermals harder for TSMC to manage than it is for intel's 5"nm" class node.
 
It would seem intel's design enablement team is unsurprisingly ahead of TSMC's when it comes to supporting BSPD chip designs, because intel said they were able to find some tricks to have matched thermals between intel 4 and intel 4 + powerVIA. Although this might just be an artifact of N2 having a higher thermal density than intel 4 making the thermals harder for TSMC to manage than it is for intel's 5"nm" class node.

The real test for Intel BSPD is foundry customer adoption. Using it internally is one thing, mass producing for outside customers is quite another. Same with HNA-EUV. Either way, Intel is pushing TSMC and Samsung which is for the greater good, absolutely.
 
Losing the power rails (and not having a keep-out zone for the TSV/BPR) should offer a big density improvement. Of course that only matters if the FEOL is small enough for this to matter, and with GAA having a lower footprint at iso-drive, a well designed standard cell should see a significant area reduction. If not, perhaps we see A14 adopt a narrower 4-sheet device for better short channel effects/to catch the FEOL scaling up to the BEOL scaling? Of course the benefits of BSPD can be blunted, using N3 as an example, due to their gate endcaps (especially on SRAM bitcells, where there are also just fewer power rails to remove). N3's 2-1 mixed height library would presumably also see less benefit than the 2-2 or 2-3, as the 1-fin transistor appears to share power delivery with the 2-fin device. Since TSMC said N2 would have mixed row, I would assume the latter issue will blunt some of the area benefits for A16, and of course it remains to be seen if they will keep their current endcap architecture or move to something else with N2 or N3E.

I have no clue how it is even possible for their backside contact to be more than a 10% wafer cost increase, but if that is what they say, I guess that is what it is... Regardless of the magnitude of the cost increase, that would indicate that TSMC isn't using the old M2/3 pitch or something as the new M0, but this also seems at odds with how low a density they are projecting. Assuming TSMC still does 6-track short libs, then they can lower that to a minimum of 4 track (assuming the FE can accommodate). What strikes me as the most likely explanation might just be that TSMC is allowing wide nanosheets for their HD lib. If this is the case though, I have to wonder why they didn't just take a page from intel's playbook and increase metal pitches to get a wafer cost reduction? The whole point of more density is better cost per FET. So if you weren't getting it, why not just move to a same or slightly smaller cell height with coarser metal lines?

It would seem intel's design enablement team is unsurprisingly ahead of TSMC's when it comes to supporting BSPD chip designs, because intel said they were able to find some tricks to have matched thermals between intel 4 and intel 4 + powerVIA. Although this might just be an artifact of N2 having a higher thermal density than intel 4 making the thermals harder for TSMC to manage than it is for intel's 5"nm" class node.
I believe the figures I saw for area/speed/power changes vs. power grid density (which I can't disclose) came from actual TSMC like-for-like layouts (same basic libraries) not theoretical calculations, so I'm inclined to believe them.

The quoted wafer cost increase we were given for A16 over N2P was *way* above 10% -- extra steps, new process, limited volumes, non-standard processing or whatever, TSMC didn't give reasons, just a price :-(

I'm sure Intel's figures for thermals are correct *for the particular case they chose to analyse*; whether this is realistic in real layouts is another matter entirely, they've been "economical with the truth" often enough in the past. Physics says the thermal resistance with BSP will be higher than with FSP, assuming the heat path is out through the non-bump side of the die which is the usual case -- maybe they chose a case where significant heat goes out through the bumps? (which is not normally the case).
 
Using it internally is one thing, mass producing for outside customers is quite another.
I fail to understand how you have come to that conclusion. Please enlighten me Dan, because I assume I am misunderstanding you. Getting external customers used to designing around BSPD is a must if intel wants their foundry business to succeed. But from a purely technical point of view, if someone like an Apple can't figure out how to design a good BSPD chip, that isn't really a red mark against the technology but against some combination of the PDK quality/process ease of use/3rd party IP quality/the chip designer. From the perspective of technological capability, if intel can make 20/18A in the many millions of dies per year and with high yield, I don't think it matters which companies and how many use it. As an example of what I mean, it wasn't "a problem" that intel 45nm had only one customer (being intel) vs TSMC's swarm of 90nm customers. Nobody would say intel's high-K didn't work because Nvidia didn't have a chip on i45, because by that logic intel has been behind TSMC technologically since TSMC's founding.
I believe the figures I saw for area/speed/power changes vs. power grid density (which I can't disclose) came from actual TSMC like-for-like layouts (same basic libraries) not theoretical calculations, so I'm inclined to believe them.
I don't doubt the numbers (whatever they are), I just don't know how the results are what they are, since I am running on an incomplete set of data and what seem to me reasonable deductions based on publicly available data. Now maybe cell-level density doesn't really correspond to chip-level area reductions anymore, but I don't have the knowledge to say if this is true or not. If this is the case, I assume that would be due to routing conflicts? If routing conflicts are masking cell-level density improvements, then how is it possible for only certain kinds of HP chip designs to be restricted by routing? Either way, something is going on that is reducing the chip-level density of A16 relative to what you would expect from depopulating the power rails from all std cells, and it isn't theory that you can depopulate the 2 power rails with backside contact. That is a fact that is as true as finFETs having superior channel control to planar devices.
The quoted wafer cost increase we were given for A16 over N2P was *way* above 10% -- extra steps, new process, limited volumes, non-standard processing or whatever, TSMC didn't give reasons, just a price :-(
Then I would expect TSMC's margins to go up once N2/A16 get past their first year or two of HVM. A 10% wafer cost adder is insane for "just" adding BS-contacts and two or three metal layers that can be patterned with 193nm or 248nm steppers. For reference, multiple manufacturers said going from a planar FEOL to a finFET FEOL was a 5-10% wafer cost adder. On top of that, the baseline price of an N2 wafer is much higher than a 32/28nm wafer. The only other explanation than TSMC taking a bigger cut to reflect the value add for HPC customers is that their BSPDN process is very inefficiently architected (I suspect the former is more likely than the latter, but maybe it is a bit of column A and B).
I'm sure Intel's figures for thermals are correct *for the particular case they chose to analyse*; whether this is realistic in real layouts is another matter entirely, they've been "economical with the truth" often enough in the past.

They did an E-core cluster chiplet with on-die thermal test structures.
Physics says the thermal resistance with BSP will be higher than with FSP, assuming the heat path is out through the non-bump side of the die which is the usual case -- maybe they chose a case where significant heat goes out through the bumps? (which is not normally the case).
That reminds me! I think I saw Tanj talking about the idea of using the bumps/thick metals to carry heat out of the die, and maybe even doing some cool in-die liquid cooling or heat pipes with them. I don't know very much about flipchip packaging, would there be big issues trying to run heat out of the bumps? As for things that are "normally the case", normally the metals are only above the transistor ;).
 
I fail to understand how you have come to that conclusion. Please enlighten me Dan, because I assume I am misunderstanding you. Getting external customers used to designing around BSPD is a must if intel wants their foundry business to succeed. But from a purely technical point of view, if someone like an Apple can't figure out how to design a good BSPD chip, that isn't really a red mark against the technology but against some combination of the PDK quality/process ease of use/3rd party IP quality/the chip designer. From the perspective of technological capability, if intel can make 20/18A in the many millions of dies per year and with high yield, I don't think it matters which companies and how many use it. As an example of what I mean, it wasn't "a problem" that intel 45nm had only one customer (being intel) vs TSMC's swarm of 90nm customers. Nobody would say intel's high-K didn't work because Nvidia didn't have a chip on i45, because by that logic intel has been behind TSMC technologically since TSMC's founding.

The challenge with external customers is that they are used to working closely with TSMC who shares secret sauce with key customers. In fact, most of the technology TSMC offers is developed in partnership with their ecosystem. That is not the case with Intel. Hopefully BSPD was developed with outside customers in mind but that is not necessarily the case. I have asked around and have found no customers who were privy to BSPD before it was announced.

Time will tell but I think it will be a while before a full chip with BSPD will be in HVM from Intel Foundry or TSMC. My bet would be TSMC will win that race since they have the customer support. It is like sailing with the tide versus against it.
 
I skimmed this, but I can only assume Anastasia’s content was AI generated.

She clearly missed TSMC not producing 90% of silicon: https://semiwiki.com/semiconductor-...no-tsmc-does-not-make-90-of-advanced-silicon/
(Yes this is in reference to advanced silicon, but it’s not like there aren’t tons of other fab companies including Samsung out there doing older stuff too)

And “Robert Noyce making the first IC” is a bit misleading; he made the first monolithic IC, but Kilby invented the IC before this.
 
I skimmed this, but I can only assume Anastasia’s content was AI generated.
She clearly missed TSMC not producing 90% of silicon: https://semiwiki.com/semiconductor-...no-tsmc-does-not-make-90-of-advanced-silicon/
(Yes this is in reference to advanced silicon, but it’s not like there aren’t tons of other fab companies including Samsung out there doing older stuff too)
And “Robert Noyce making the first IC” is a bit misleading; he made the first monolithic IC, but Kilby invented the IC before this.

As it turns out she is real and is a chip designer. Give her some time and she will be an internationally recognized semiconductor expert just like me :ROFLMAO:
 