Updating our current logic density benchmarking methodologies

How do we measure the maximum library density for nodes that support mixed-row libraries?

  • Based on the density of the highest density library that is usable in a standalone fashion

    Votes: 3 60.0%
  • The geo mean of the libraries that make up the highest density configuration

    Votes: 2 40.0%

  • Total voters: 5
Well, they do have a track record of talking down their competition. Rather like how Steve Jobs swore that humans only needed 72dpi screens when Windows supported high res for years, right up until Apple shipped Retina. FWIW, Intel claims it reduces cost.

You have pretty much unobstructed access to power with the Intel approach (which is not the one IMEC has published), with very low resistance vias. Any source/collector can connect without fuss, from the look of it. The supply connections look very regular, leaving freedom for the signal lines. Hard to see why that makes things more difficult for analog, or for digital.

Emphasis on working with the tools vendors. Intel have likely learned that lesson.

I agree, it has risks. I see those mostly in the integrity and reliability of the very thin remaining silicon layer. Heat removal may be a concern too, though there will be a lot of copper on the backside and only a short distance to the heat-removal solution.

Overall I think it is a smart bet for Intel. One of the few things that could conceivably put them out front again, after years. Shows their engineers - and their managers - still have gumption.

Don't see where "talking down the competition" comes from -- we have no intention of using Intel until their level of support for foundry customers and IP availability is comparable with TSMC's, which I suspect will never happen, so there's no incentive for TSMC to talk down BSP. In fact the opposite is true: they've been clear about its advantages -- but also the disadvantages in the short term, especially cost and IP availability, which they see as hugely important; the TSMC IP ecosystem is one reason they are so successful.

Nobody is arguing about the technical advantages of BSP, they're crystal clear -- what I'm talking about here is the commercial and risk issues of switching to it, especially IP support. If you've never done any tricky N3 layout (e.g. high-speed analogue) you won't have any idea just how much time and effort it actually takes not just to find the best solution -- which is not always obvious because of the many restrictive and interacting rules -- but to get a DRC-clean layout (IIRC the manual is over 1600 pages long), and I expect N2 will be even worse. IP suppliers are not going to invest in doing this -- and especially the BSP learning curve -- unless they're convinced there will be enough customers for their IP, which I think is by no means clear for N2, though it will almost certainly be the case for the following node.

There's also the question of what benefits it delivers for the customer, and again it's not clear that N2 with BSP is worth it for many ASIC customers -- we are a TSMC customer, and knowing what our products are and what BSP would do for them, TSMC's advice was that BSP was unlikely to be appropriate for us in N2 -- though it is for things like CPUs and HPC, which have very different priorities, so you're right to say that it *is* the right decision for Intel because this is where most of their products are.

The same can be seen for the basic process, including density -- Intel has always made choices which prioritise speed over density and cost (including yield!), because it's the right thing to do for high-margin high-power high-speed CPUs. TSMC have put more priority on density, cost and yield because this is what the majority of their customer base wants -- though recent processes have also had a lot of options (both raw process and DTCO) targeted at HPC and CPU applications, so this isn't as true as it once was.

So BSP is undoubtedly the right choice now for Intel, and -- assuming no killer problems emerge! -- will be for mainstream TSMC processes in future. But I still think that N2 with BSP will be a bit of a "dip-your-toes-in-the-water" process for TSMC, to see how much traction it actually gets from their customers, and that most will wait until the following node to go down the BSP route.

It's not all about the raw technology and its advantages, in real life other things are as or more important... ;-)
 
As many on this forum are aware, maximum theoretical logic density is often calculated from (M2 pitch) × (M2 tracks for a four-transistor NAND gate) × (CPP), i.e. the NAND2 cell footprint. From there we try to use correction factors to account for any boundary scaling (for example, Scotten using a 10% area reduction for going from DDB to SDB, since xDB doesn't show up at the level of individual cells) and uHD SRAM bit density.
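
As a rough illustration of that arithmetic (the pitches, track count, and NAND2 width below are made-up placeholders, not values for any real node):

```python
# Illustrative sketch of the area-based density estimate described above.
# The pitches and track count are placeholders, not values for any real node.
m2_pitch_nm = 30.0   # M2 pitch
m2_tracks = 6.0      # M2 tracks for the (four-transistor) NAND2 cell
cpp_nm = 50.0        # contacted poly pitch
nand2_width_cpp = 4  # NAND2 width in CPP with double diffusion breaks (DDB)

cell_height_nm = m2_pitch_nm * m2_tracks
nand2_area_nm2 = cell_height_nm * nand2_width_cpp * cpp_nm

# 4 transistors per NAND2; convert transistors/nm^2 to millions of transistors/mm^2
# (1 mm^2 = 1e12 nm^2, then divide by 1e6 for the "M", i.e. multiply by 1e6 overall).
density_mtr_mm2 = 4 / nand2_area_nm2 * 1e6

# Boundary-scaling correction, e.g. ~10% area reduction when the library uses
# single diffusion breaks (SDB) instead of DDB.
density_sdb_mtr_mm2 = density_mtr_mm2 / 0.90

print(f"raw: {density_mtr_mm2:.0f} MTr/mm^2, with SDB correction: {density_sdb_mtr_mm2:.0f} MTr/mm^2")
```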

With the advent of mixed-row logic libraries, I think it is best if we decide what methodology we want to use for measuring their cell height.

Using N3B as our example case (since, to my knowledge, that is the first node to ever try mixed rows), we could either take the geo mean of the CHs for the 2-1 lib or use the CH of the 2-fin lib. I do not think using the CH of the 1-fin device makes any sense, since it only has about 3 M0 tracks and is incapable of making logic circuits without siphoning off wiring resources from nearby 2-fin devices. Conversely, we could use only the 2-fin device, with the caveat that you can go denser if you use "finFLEX" (in a similar manner to the correction factors applied for SDB). However, I feel it would be easier to go with the geo-mean method, as we won't need to muck around trying to compute a good correction factor for different nodes and/or products.
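
To make the two options concrete, here is a small sketch with made-up cell heights for a hypothetical 2-fin and 1-fin row (these are not real N3B numbers, and the transistor factor is just the one from the figure quoted below):

```python
import math

# Purely illustrative cell heights for a hypothetical mixed-row library;
# these are NOT real N3B numbers.
ch_2fin_nm = 180.0   # standalone-capable 2-fin row
ch_1fin_nm = 130.0   # 1-fin row (not usable standalone)
cpp_nm = 50.0        # placeholder contacted poly pitch
tr_per_cpp_per_ch = 1.56  # weighted transistor factor from the figure quoted below

# Option 1: rate the node by the densest standalone-capable library.
ch_standalone = ch_2fin_nm

# Option 2: geometric mean of the rows making up the densest (2-1) configuration.
ch_geomean = math.sqrt(ch_2fin_nm * ch_1fin_nm)

for label, ch in [("2-fin standalone", ch_standalone), ("2-1 geo mean", ch_geomean)]:
    density = tr_per_cpp_per_ch / (cpp_nm * ch) * 1e6  # MTr/mm^2
    print(f"{label}: CH = {ch:.1f} nm -> {density:.1f} MTr/mm^2")
```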

Any and all additional thoughts and discussions are welcome!
It seems this formula has been around for years, but not necessarily followed (from https://www.angstronomics.com/p/the-truth-of-tsmc-5nm):

[Image: Logic Transistor Density Metric]

Going directly by the picture, it works out to 1.56 transistors/CPP/Cell height.

It hasn't been followed by everyone obviously. One example is Wikichip for N3E: https://fuse.wikichip.org/news/7375/tsmc-n3-and-challenges-ahead/

Here the formula used was 1.48 transistors/CPP/cell height to get 215.6 MTr/mm2 for 48 nm CPP and 143 nm cell height.
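
As a quick arithmetic check of that figure:

```python
# Reproducing the N3E number quoted above: 1.48 transistors per (CPP x cell height).
tr_factor = 1.48
cpp_nm, cell_height_nm = 48.0, 143.0

density_mtr_mm2 = tr_factor / (cpp_nm * cell_height_nm) * 1e6  # nm^-2 -> MTr/mm^2
print(f"{density_mtr_mm2:.1f} MTr/mm^2")  # prints ~215.6
```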

Besides the effective number of transistors, the CPP and cell height can possibly have multiple values. Perhaps 2 N fins + 2 P fins is used as standard, but with nanosheets, how will this work?
 
I guess my methodology would be to provide PPA, density, etc in terms of use cases:

SRAM
CPU Logic
GPU Logic
AI?
ETC.

This would be better for sure (than what is done today); still, it really matters what you are designing for. A CPU designed for low-power operation is likely to have a pretty different transistor makeup than one made for maximum performance. A CPU designed for radiation hardening (an extreme example) will likely have a very different transistor makeup than others.

More than this metric, I would like to see a better standard for yields developed. That too is quite tricky, but as we have seen lately, since it is so poorly defined, lots of people (not here of course) don't understand what they hear and the misunderstanding can impact a business (your average investor is hardly at the knowledge level of those posting here).

It is interesting that transistor scaling has serious limits in physics due to leakage, minimum channel length, and doping, while wiring is nowhere near its limits except for lithography. Sure, wiring has issues of increasing resistance, but it does not simply fail to work if it gets, say, half as wide, whereas even with 2D materials we are not seeing channel length cut in half. And even when we do advance transistors, some things do not improve - last I saw, ribbons do not have lower capacitance than fins. Yet we do not see a lot written about advanced processes highlighting how much the wiring might improve independently of the transistors. Seems like there is potential for advances there.
True that. Wiring does have issues with increasing resistance; however, there are still lots of tricks that can be used here. As an example, several smaller conductors can carry more current than a single larger conductor of the same total cross-section, since at high frequency the charge travels near the surface (skin effect). So while more vias would be needed to do it, the overall area of the vias could be smaller to move the same amount of current.

Silver is a better conductor than copper (though likely way too expensive), and high-temperature superconductors are ceramics! It would be a great breakthrough if we could just get the definition of "high temperature" to mean something north of -320 deg F ;).

Ah well. I will likely be worm food long before that happens!
 
You are misunderstanding the methodology: there isn't a "standard number of fins"; you have to do the whole analysis for each process/node.

First, you have to take into account the diffusion break; the 2-input NAND and scan flip-flop widths depend on it. Everyone is using the same technique now, but a few nodes ago they weren't, and any trend analysis has to account for that by node. For example, a 2-input NAND is 4 CPP wide for DDB and 3 CPP wide for SDB.

Second, cell height is M2P (except for Samsung, who number metal layers differently) multiplied by tracks. M2P often changes node to node, and track height also changes. Track height is really just the ratio of cell height to M2P. To really understand cell height you have to look at the device structure: each cell has a cell boundary, two device heights (nFET and pFET), and an n-p spacing. The device height is the fin or nanosheet pitch multiplied by the number of fins or nanosheets per device. In general, track heights were 9.00 for 4 fins, 7.50 for 3 fins, and 6.00 for 2 fins, but there have been lots of different and fractional cell heights, particularly from Intel.
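
A rough sketch of that decomposition, with placeholder dimensions rather than measurements of any specific process:

```python
# Illustrative decomposition of cell height into the structural pieces described
# above. All dimensions are placeholders, not measurements of any real process.
fin_pitch_nm = 26.0       # fin (or nanosheet) pitch
fins_per_device = 2       # fins per nFET / pFET
np_spacing_nm = 40.0      # n-to-p spacing
cell_boundary_nm = 30.0   # total cell-boundary allowance
m2p_nm = 28.0             # M2 pitch

device_height_nm = fin_pitch_nm * fins_per_device            # per device
cell_height_nm = cell_boundary_nm + 2 * device_height_nm + np_spacing_nm

track_height = cell_height_nm / m2p_nm  # "tracks" is just cell height / M2P
print(f"cell height = {cell_height_nm:.0f} nm ({track_height:.2f} tracks)")
```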

At 3nm, TSMC has double-row cells with 2-fin rows over 1-fin rows, and you have to average the cell heights to account for that.

Samsung 3nm HNS follows the same methodology outlined above; there is nothing special about HNS for this analysis if you measure the correct things.

For HNS with BPD, track heights will shrink further.

Of course, a single process can have multiple cell types, such as high density and high performance. You have to pick one and be consistent.

There is also an argument that the scan flip-flop is no longer the primary type of flip-flop, but that is another discussion.
 
There are two fundamental problems with characterizing processes using the blocks you outline above.
1) Each block density will depend on the process and the design, not just the process. Two different designs with different design goals done on the same process can have very different densities, let alone different designs on different processes.
2) You won't be able to get the data.

You also leave out analog, IO, and other blocks, but they are the smallest area, and while I look at them I don't spend much time on them.

The way I compare processes for density in general is:
1) I use measured pitches and other structural elements to calculate the transistor density for high-density logic cells. Yes, I know no one will hit this density in an actual design, but it compares the density of processes on an apples-to-apples basis without design effects. For processes not yet in production I use the companies' announced density improvements to project transistor density for the next node. I can also use low-level trend analysis and physical limits to make reasonable projections if a company hasn't announced the density change. I have done this many times, and when I have gotten actual parts months or years later to compare to, I do pretty well.
2) I look at HD SRAM cell size.

In general for current SOC products: logic is ~1/2 the area, SRAM ~1/3, and analog, IO, and other make up the rest. We have a group at TechInsights that publishes incredible breakdowns of die area by function.

I have methodologies for comparing power and performance between processes independent of designs, I outline it in my IEDM TSMC 2nm article: https://semiwiki.com/semiconductor-...nm-process-disclosure-how-does-it-measure-up/ although there is more to it than I am willing to disclose.

For yield, all you need is the defect density and the yield model being used; it's just that you generally need to be under NDA to have access to the data. Certainly the fabless design houses all have that data for the processes they are designing to. I have a lot of data as well.
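
For readers unfamiliar with how a defect density turns into a yield number, here is a generic sketch of the simple Poisson die-yield model (Murphy and negative-binomial variants also exist); it is not the specific model any particular foundry uses:

```python
import math

def poisson_die_yield(die_area_cm2: float, d0_defects_per_cm2: float) -> float:
    """Simple Poisson die-yield model: Y = exp(-A * D0)."""
    return math.exp(-die_area_cm2 * d0_defects_per_cm2)

# Example: a 100 mm^2 (= 1 cm^2) die at D0 = 0.1 defects/cm^2.
print(f"{poisson_die_yield(1.0, 0.1):.1%}")  # ~90.5%
```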

With respect to metal resistance.

You are comparing bulk resistivity; at small dimensions resistivity increases due to electron scattering, and Ag is worse than Cu at small dimensions. There are, however, several metals that are better. You also need good electromigration resistance, and Ag's low melting point would mean poor resistance to electromigration. It would also ideally need to be usable without a barrier (my guess is no) and dry etched. The best elemental metal resistivity known at small dimensions is Rh, but it doesn't stick to anything, you can't etch it, and you can't CMP it. Ru is next best and a lot of people have worked on it, but it is really expensive (~20x more expensive than Ag) and toxic. Mo is getting a lot of attention due to low resistance at small dimensions; you can dry etch it, it can be used without a barrier, and it is relatively inexpensive. Lam has said they expect to sell a lot of Mo deposition tools this year; my belief is it will be seen in 3D NAND first for word lines, but eventually in logic vias and then critical metal lines.

Beyond that, graphene with FeCl3 intercalation is better than Ru at small enough dimensions.
 
Thanks.

Question on the yield calculation: how are things like heat density, sustainable frequency, and a host of other limitations within a transistor taken into account? Example: is it a defect if the transistor can only clock to x when you would like something greater? Is it a defect if it only works until it gets hot and then has to be clocked down?

Seems like the difficulty would be in defining exactly what a "defect" is.

On the question of metals, while silver may have other undesirable characteristics, perhaps alloys would be a better answer? I know that in nuclear reactor design, alloys are chosen for toughness, strength, and the absence of long-term radioisotopes that can be produced under a neutron flux.

Perhaps interconnects could be served similarly?
 
With respect to yield, the defect density is based on electrical test of test chips designed to the PDK. I think the basic idea is that the PDK is supposed to define all the design rules you need to meet in order to yield to the extracted defect density, but PDKs aren't perfect and get updated periodically. This is starting to move outside my comfort zone for expertise. My sense is the current yield models are adequate; I think the issue is the data isn't public, so it's hard to figure out what it means when someone says a certain company's process is only yielding 10% (plus the reported yield is typically an unsupported rumor).

With respect to metals, there is research being done on alloys, but at small dimensions they are all worse than pure metals, at least so far. I know Al3Sc, Al2Cu, and AlCu have been tested; ironically, Al is better than Cu for resistivity at small dimensions but has other issues. Mo and eventually graphene appear to be acceptable solutions for at least 10 more years.

I have detailed calculations and roadmaps out into the mid-to-late 2030s. I don't see any unsolvable problems in continuing to scale, although it will happen more slowly, and there are cost issues creeping in.
 
Actually, I think the current yield formula is too strict. It would be better to have a yield based on probability, because chips should be built from small cores working in parallel, which makes them more competitive. Unfortunately, the fab doesn't let me change the rules myself.
A bitcoin mining chip is an example of parallel computing with very small cores (cells).
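
To illustrate that point with a toy model: if a die is built from many small interchangeable cores and can be sold when most (not all) of them work, the effective yield is much higher than the all-cores-good number. The sketch below assumes independent defects and entirely made-up numbers:

```python
import math

def core_yield(core_area_cm2: float, d0_per_cm2: float) -> float:
    """Poisson yield of one small core."""
    return math.exp(-core_area_cm2 * d0_per_cm2)

def yield_with_redundancy(n_cores: int, k_required: int, y_core: float) -> float:
    """Probability that at least k of n independent cores are defect-free."""
    return sum(
        math.comb(n_cores, i) * y_core**i * (1.0 - y_core) ** (n_cores - i)
        for i in range(k_required, n_cores + 1)
    )

# Toy numbers: 1 cm^2 of logic split into 100 small cores, D0 = 0.5 defects/cm^2,
# and the chip is shippable if at least 98 of the 100 cores work.
y = core_yield(0.01, 0.5)
print(f"per-core yield:        {y:.3f}")
print(f"all 100 cores good:    {y**100:.1%}")      # classic all-or-nothing yield
print(f"at least 98 of 100 OK: {yield_with_redundancy(100, 98, y):.1%}")
```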
 
Thanks. Great information.

Graphene is interesting for a number of reasons. How would it be applied? I am pretty sure you can't vapor deposit it ;). It would certainly be inexpensive from a materials standpoint (if the process wasn't so expensive that it made the entire thing unreasonable). Lots of Carbon in the world.

You are probably right on the yields. It is mostly that the information isn't released. This is true for many industries. There are marketing figures, but no measurement metrics to show how they were derived... to the point that the marketing figures are not really useful at all for determining performance.
 