“Intel said they were implementing it on the periphery CMOS and the bitcells along the array edges.”
Yes, I saw that afterwards. Makes me wonder where else they would exclude PowerVia.
View attachment 2825
“Makes me wonder where else they would exclude PowerVia.”
Nothing else comes to mind, but SRAM didn't come to mind to me either until I saw it... Logic only sees benefits, and unlike the SRAM bitcells inside the array, logic cells have power rails that can be deleted to save space, so I have to assume all logic cells will have it. I guess passives and some analog devices might not have it, but capacitors, inductors, and resistors aren't transistors, and presumably the transistors and logic inside the "analog" parts of the chip really do want PowerVia for the lower voltage droop. If we are talking long-channel devices, the density boost from removing power rails would definitely be less impressive, because those long-channel devices have very large cell widths/poly pitches.
So the Vss and Vdd connections for the SRAM array are frontside, and can't be backside?
Speaking of which, I wonder if PLLs or parts of the clock tree will just completely live on the backside of the wafer, or if that is something that comes later once you start seeing fully functional backsides?
“So the Vss and Vdd connections for the SRAM array are frontside, and can't be backside?”
I don't think so? But my understanding is a bit shaky. My understanding was that for a 6T SRAM the Vdd and Vss just connect transistors within the bitcell, right? I thought that all electricity into and out of the bitcell flowed through the BL and WL. If that is the case, then only the bitcells at the array edge would need direct power delivery, since the power flows along the bit/word lines to any particular bitcell inside the array. If my understanding is correct, it isn't an issue of "can't" but rather that the nature of an array is that power is only delivered at the array edges and then flows to all the bitcells along the string. I know you are a DRAM guy, but my understanding was that if you abstract an SRAM bitcell to the bitcell level rather than the transistor level, an SRAM bitcell and a DRAM bitcell are wired up to the array and the periphery in a similar manner. With that said, if you feel my understanding of how a bitcell works is wrong, please feel free to correct me, because I don't want to be spreading incorrect information.
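For reference, a textbook 6T bitcell at the transistor level looks like the sketch below: the cross-coupled inverter pair sits between VDD and VSS, while the wordline and bitlines only touch the cell through the two pass-gates. This is a generic connectivity sketch; the transistor and node names are just illustrative labels, not anything from the papers discussed here.

```python
# Textbook 6T SRAM bitcell connectivity (generic sketch, not a foundry layout).
# Each entry: transistor -> (gate, terminal, terminal).
# Q and QB are the internal storage nodes of the cross-coupled inverters.
bitcell = {
    "PU1 (pMOS pull-up)":   ("QB", "VDD", "Q"),
    "PD1 (nMOS pull-down)": ("QB", "VSS", "Q"),
    "PU2 (pMOS pull-up)":   ("Q",  "VDD", "QB"),
    "PD2 (nMOS pull-down)": ("Q",  "VSS", "QB"),
    "PG1 (nMOS pass-gate)": ("WL", "BL",  "Q"),
    "PG2 (nMOS pass-gate)": ("WL", "BLB", "QB"),
}

# The storage inverters hang off VDD/VSS; WL and BL/BLB only reach the
# internal nodes through the two pass-gate transistors.
for name, (gate, a, b) in bitcell.items():
    print(f"{name:22s} gate={gate:3s} terminals={a}/{b}")
```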
“They are not counting their base die, otherwise it would be >50% ARL silicon on Intel, because that tile is bigger than all the TSMC pieces combined in terms of area.”
Is the base die active, or a passive interposer?
“I forgot about the nanowire count... excellent point.”
Intel answered on their home page.
So, between @nghanayem, @Scotten Jones, @IanCutress, and @IanD:
What is the performance and density impact of BSPD modelled to be? I think I heard a density number like 7% discussed at IEDM.
“Is the base die active, or a passive interposer?”
Passive interposer.
So we are going to count passive interposers as an area win... it has come to that?
“I have a bunch of comments on this thread I am going to roll into one big comment. With respect to the Fmax Shmoo plots, what isn't obvious until you read the papers is that TSMC's array is HD cells (plus double pumped, although I don't think that matters for clock speed). Intel's array is HP cells (what Intel calls HCC). 5.6 GHz for Intel versus 4.2 GHz for TSMC isn't an apples-to-apples comparison. I don't know how different the clock speeds typically are for HD versus HP SRAM on the same process; I would love to hear from anyone here who does.”
TSMC said that 4.2 GHz speed was for the HC array, not the HD array. The major difference between the Intel and TSMC methodologies seems to be the different operating temperatures used for testing. Also, from what I have seen, SRAM performance and efficiency are not 1:1 with logic, so I am not taking this one metric to the bank as proof that 18A is WAY faster than N2 in all aspects. Based on how Intel's Vmin reduction was smaller even at a lower temperature, and N3E presumably having a lower Vmin than Intel 3, I suspect N2's logic will be faster HP logic to HP logic. It would be funny if we got another Intel 4/3 situation where Intel has better HP logic density and better HD logic performance, but TSMC seemingly leads in HD density and HP performance.
“Another trade-off is inner spacers: they reduce capacitance but also reduce drive current.”
I thought not having an inner spacer was a performance inhibitor, due to how it forces you to make the device with less isolation?
“In the Samsung SF3E and SF3 processes they don't use inner spacers, and they don't have any dielectric isolation under the nanosheet stack. Samsung is currently using a 3-sheet stack, but they have announced they will go to 4 sheets at 1.4nm.”
I was kind of shocked by the lack of inner spacers on SF3(E). I didn't even know it was possible to do a proper nanowire release without them, TBH.
“It will be interesting to see if Intel or TSMC adopt inner spacers, dielectric isolation, or one of the techniques to increase hole mobility. Intel has published some interesting work with SiGe pFETs and Si nFETs on a strain-relaxed buffer. There is also SiGe cladding of the pFET sheet, but that has a bunch of issues.”
Doesn't BSPD throw a wrench into that, since you are removing the bulk Si? Something I also wondered, but never dug deep enough to find out, is whether removing the bulk Si and subfin completely removes leakage through the bulk. I would assume it does, but I don't see a ton of academic attention on it, and you would think it would be a bigger deal if it works the way it does in my mind.
“Intel catching TSMC for SRAM cell size is impressive.”
Yeah... that was certainly not on my bingo card.
“Amazingly TSMC N5, N3E, and N2 all have the same SRAM cell size, no shrink! N3 was a little smaller but they had yield issues.”
Is it that crazy? Those infernal self-aligned gate endcaps blew out their cell heights with all the extra spacer between the fins/poly cuts. And N2 is seemingly following the 20nm-to-16FF playbook of no litho shrink/minimal density improvement paired with a new, much higher-PPA device.
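For a sense of what "same cell size" means in macro terms, bitcell area converts to bit density roughly as below. This is a back-of-envelope sketch: 0.021 µm² is the widely reported N5-generation HD bitcell size used purely as an illustrative input, and the array efficiency is an assumed value, not a number from the ISSCC papers.

```python
# Back-of-envelope: HD SRAM bitcell area -> approximate macro bit density.
# cell_area_um2 uses the widely reported ~0.021 um^2 HD bitcell as an example;
# array_efficiency is an assumed illustrative value (periphery, redundancy, etc.).
cell_area_um2 = 0.021
array_efficiency = 0.75

um2_per_mm2 = 1e6
raw_density_mb = um2_per_mm2 / cell_area_um2 / 1e6      # raw bits per mm^2, in Mb
effective_density_mb = raw_density_mb * array_efficiency

print(f"raw bitcell density:      {raw_density_mb:.1f} Mb/mm^2")
print(f"effective macro density:  {effective_density_mb:.1f} Mb/mm^2 "
      f"(at {array_efficiency:.0%} array efficiency)")
```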
“The real solution to SRAM cell size scaling will be CFETs, which could cut the cell size nearly in half.”
Do you not think that backside signal routing will come before CFETs? That seems like a sizable opportunity for improvement.
“In terms of Backside Power Delivery (BPD): with HNS it is hard to get below a 6-track logic cell height due to the power rails. Intel's PowerVia and other backside power delivery solutions enable 5-track cells, and TSMC's backside solution in 2027 (maybe 2026 now) offers direct connections for a possible 4-track logic cell height.”
This is why I am bummed that A16 is only offering such a small density improvement, and by TSMC's comments on easy porting from N2. Their standard cell seems to still be 6 or 7 M0 tracks tall when it could be 4! Hopefully the mobile customers don't demand an FSPDN version of A14, because that would be lame.
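To put rough numbers on the track-height point: standard-cell height is essentially the track count times the metal pitch, so at a fixed pitch and CPP a 5-track cell is about 17% shorter than a 6-track cell and a 4-track cell about 33% shorter. A minimal sketch, with the metal pitch as an assumed illustrative value:

```python
# Standard-cell height ~= track count x metal pitch; at fixed CPP the cell
# area scales with the height. The pitch is an assumed illustrative value,
# not a number from the papers in this thread.
metal_pitch_nm = 23.0

for tracks in (6, 5, 4):
    height_nm = tracks * metal_pitch_nm
    rel_area = tracks / 6            # area relative to a 6-track cell at fixed CPP
    print(f"{tracks}-track cell: height = {height_nm:5.1f} nm, "
          f"area vs 6T = {rel_area:.0%}")
```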
“Another advantage of BPD is that bringing power through the front of the die means the power must go through the entire via stack to get to the devices. Imec has shown BPD through nano-vias can reduce static power drop by 95% and dynamic power drop by 75%. I estimated that a 15-via chain in TSMC 3nm has 560 ohms of resistance (estimated from a plot in TSMC's N3 paper), while a nano-via has ~50 ohms (per Imec). You can also put MIM caps on the backside, and as Intel notes in their paper you don't use PowerVia in the SRAM cells (although you do in the periphery), so you can use that entire backside area under the cell array for a giant negative-bit-line capacitor without taking up any otherwise useful area.”
You could already put MIM caps in the BEOL before BSPDN. Intel mentions a 3D MIM cap for 18A, but teardowns have already shown 3D MIM caps in Intel 4, which certainly explains how they doubled capacitance over Intel 7 and its many capacitor plates (which itself somehow has similar capacitance to what TSMC claims for their new and improved N2 MIM cap). Unless what you meant is that you can more easily put bigger and bigger MIM caps on now that you are working with a shorter BEOL stack on the backside of the wafer?
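To make those resistance numbers concrete, here is a toy IR-drop comparison using the 560 Ω and ~50 Ω figures quoted above. The load current is an assumed illustrative value, and a real power grid has many parallel paths, so only the ratio is really meaningful:

```python
# Toy IR-drop comparison: frontside 15-via stack (~560 ohm, per the TSMC N3
# plot estimate quoted above) vs. a backside nano-via (~50 ohm, per Imec).
# The load current is an assumed illustrative value; real grids have many
# parallel paths, so treat the absolute mV numbers as a sketch.
r_frontside_ohm = 560.0
r_backside_ohm = 50.0
i_load_a = 100e-6                 # assumed 100 uA through a single path

drop_front_mv = r_frontside_ohm * i_load_a * 1e3
drop_back_mv = r_backside_ohm * i_load_a * 1e3

print(f"frontside via-stack drop: {drop_front_mv:.1f} mV")
print(f"backside nano-via drop:   {drop_back_mv:.1f} mV")
print(f"resistive drop reduction: {1 - r_backside_ohm / r_frontside_ohm:.0%}")
```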
“TSMC said that 4.2 GHz speed was for the HC array not the HD array.”
View attachment 2841
View attachment 2842
Great post BTW!
“For an HP process, making the cell smaller just by trimming things a bit and accepting more leakage was possible, I believe. The lithography and multi-patterning techniques are not at their fundamental limits, not with N2 and not with Intel's process. The question is how good their cell is while being smaller; for desktop CPUs, a bit more leaky SRAM is easily acceptable. The latest nodes offer different SRAM types, from the smallest and fastest to more efficient 'storage' 6T SRAM. What I was looking at many years ago is whether it's possible to make a specialty node just for SRAM, and it turned out it's easily possible to have a device that is both much smaller and more performant if you don't need to care about the rest of the logic.”
It can also be dramatically cheaper, because you don't need 17+ interconnect layers. I haven't seen any of the big logic companies talking about dedicated SRAM processes; maybe as chiplets catch on it will generate interest.
“Crap. I looked at the wrong chart. Thanks. There is still something holding back scaling, looking at 2P vs 1P for GNR vs Turin. That alone makes it impossible to judge the difference in node performance. Hopefully you guys read the rest of the post after that mistake.”
GNR was released in Sep 2024. Considering the catastrophic performance numbers Intel put up in 2P, I would have thought they would have released a fix by now if it were something easily fixed.
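For anyone following along, the 2P-vs-1P comparison boils down to scaling efficiency, i.e. how close the 2-socket score gets to twice the 1-socket score. A trivial sketch with made-up placeholder scores (not actual GNR or Turin results):

```python
# 2P scaling efficiency = 2P score / (2 x 1P score).
# The scores below are placeholders purely to show the calculation,
# not actual Granite Rapids or Turin benchmark results.
def scaling_efficiency(score_1p: float, score_2p: float) -> float:
    return score_2p / (2 * score_1p)

print(f"{scaling_efficiency(1000, 1900):.0%}")  # ~95%: scales well
print(f"{scaling_efficiency(1000, 1500):.0%}")  # ~75%: something is holding back scaling
```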
“This is why I am bummed that A16 is only offering such a small density improvement...”
I kinda get where they are coming from, though. It is my understanding that the libraries between A16 and N2 will be compatible. Customers can then easily make test chips on N2 and also on A16 and determine whether their designs are better served by one or the other.
“I believe that BSPDN isn't a free lunch. In other words, you don't get something for nothing. It likely has disadvantages compared to N2 without BSPDN in some situations... and certainly in price, I would think. Does that make sense, or do you think that there are no reasonable use cases where FSPDN with GAA would be better than BSPDN?”
See my post above: BPD is more expensive, and the mobile guys don't need it and don't want to pay for it. I am hearing the foundries will have to offer versions with and without BPD, plus different metallization schemes.
So then the question I would have is: does it make sense to use the same libraries for both, or would there be an advantage to making an FSPD library and a BSPD library so you could optimize each process better?
I am not a design guy, but my guess is the libraries will have to be different.
It just seems that the more generic you try to make a circuit, the more inefficient it becomes versus close-to-the-metal thinking (so to speak), where the transistor design and layout are customized for the process as much as possible.
Of course, as with all things in engineering, you don't get something for nothing. Such process-specific optimization results in very poor portability to another process, or even sensitivity in the design to mild changes in the process.