the measure of a new process

Tanj

Well-known member
Every so often we have a discussion about the compactness of a new process. You know, what does it mean to be a 2nm process when nothing in it is actually 2nm? Suggestions often include things like transistors per sq mm, or the size of some key standard logic cells like NAND2, D-FlipFlop, and SRAM in some sort of weighted average.

But there is a whole other set of dimensions rarely discussed and almost never disclosed:
- Ceff, the effective capacitance of a transistor for the purpose of switching states
- Cw, the average capacitance per micron of wires in the metal layer
- the average microns of wire per CMOS pair in some standard circuit like an int8 multiply-accumulate (a measure of routing efficiency)
- Vdd, the minimum useful voltage for doing something like running an int8 MAC at 2GHz.

Why are these interesting? Because switching energy, C V^2 / 2, is the limit to efficiency and density in many applications of concern, and if you know these values you can calculate the average switching energy of a CMOS pair in a given process.
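
To make that concrete, here is a minimal sketch of the arithmetic in Python. Every numeric value below is an illustrative assumption, not a disclosed figure for any real process:

```python
# Switching energy of a CMOS pair: E = C_total * Vdd^2 / 2.
# All parameter values are illustrative assumptions, not vendor data.

CEFF = 50e-18           # F: assumed effective switching capacitance per transistor
CW_PER_UM = 0.20e-15    # F/um: assumed average wire capacitance per micron of metal
WIRE_UM_PER_PAIR = 1.5  # um: assumed average routed wire per CMOS pair
VDD = 0.65              # V: assumed minimum useful supply voltage

def switching_energy(ceff, cw_per_um, wire_um, vdd):
    """Energy dissipated per output transition of one CMOS pair, in Joules."""
    c_total = 2 * ceff + cw_per_um * wire_um  # both devices of the pair plus their share of wire
    return 0.5 * c_total * vdd ** 2

E_SW = switching_energy(CEFF, CW_PER_UM, WIRE_UM_PER_PAIR, VDD)
print(f"~{E_SW * 1e18:.1f} aJ per switch")  # ~84.5 aJ with these assumptions
```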

Which applications are concerned about this? Oh, little things like AI GPUs, where the arithmetic is crowded to the density limit and operating at a 50% per-clock switching rate on arithmetic gates and 200% on clock gates/drivers, leading to heat-limited density rather than lithography-limited density. You can easily exceed 100 W/cm2 at a 1.5 GHz clock rate with a process as dense as Intel 4 - that exceeds the power density of an H100.
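
A rough back-of-the-envelope for where a figure like 100 W/cm2 comes from, continuing the sketch above. The gate density and the split between logic and clock gates here are assumptions chosen only to show the shape of the calculation:

```python
# Dynamic power density of a densely packed arithmetic block.
# Density, activity factors, and per-switch energy are illustrative assumptions.

E_SW = 85e-18                  # J per transition of one CMOS pair (rounded from the earlier sketch)
ACTIVE_PAIRS_PER_MM2 = 2.5e7   # assumed density of simultaneously clocked logic pairs
F_CLOCK = 1.5e9                # Hz
ALPHA_LOGIC = 0.5              # transitions per clock on arithmetic gates (per the post)
ALPHA_CLOCK = 2.0              # transitions per clock on clock gates/drivers (per the post)
CLOCK_GATE_FRACTION = 0.1      # assumed share of pairs that are clock gates/drivers

def power_density_w_per_cm2(e_sw, pairs_per_mm2, f, a_logic, a_clock, clk_frac):
    """Average dynamic power per unit area, in W/cm^2."""
    avg_alpha = (1 - clk_frac) * a_logic + clk_frac * a_clock
    watts_per_mm2 = e_sw * pairs_per_mm2 * f * avg_alpha
    return watts_per_mm2 * 100  # 100 mm^2 in a cm^2

p = power_density_w_per_cm2(E_SW, ACTIVE_PAIRS_PER_MM2, F_CLOCK,
                            ALPHA_LOGIC, ALPHA_CLOCK, CLOCK_GATE_FRACTION)
print(f"~{p:.0f} W/cm^2")  # ~200 W/cm^2 with these assumptions - well past the 100 W/cm^2 mark
```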

If a CFET can be 4x denser but does not meaningfully improve capacitance and voltage, then reaching the same clock rate will require as-yet-unproven technology to remove the intense heat from the chips. Besides which, it does nothing to allow AI to be more efficient in terms of delivering functionality per Joule, so it will not help clouds become more environmentally friendly.

Perhaps the relentless march of process density is soon to reach its "Pentium 5" moment of roadmap irrelevance, where increased density is no longer the right goal for major markets. We should be looking at the energy intensity of switching for some of the most economically valuable markets.
 
Everyone who knows chips and Dennard scaling should know this, but here is a review.

Each new node needs to offer real value to the user. The foundries and IDMs needed something catchy, so they hung on to reducing the node number by roughly 0.7x from the previous generation even after Dennard scaling ended.

Historically the dimensions scaled by 0.7x, chip density doubled, and iso-power performance improved by maybe 30%. As Dennard scaling ended and material tricks became harder, scaling slowed, but there is still value in the next node, and the marketing number remains. You can't call it the "2024 process" - what would that mean in 2028?

These days, through DTCO and other tricks, they get maybe 0.85x scaling and 15% performance. Chiplets really help by enabling options to put only certain functions on the leading edge and others, like IO and memory, on older and more cost-effective nodes.
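
The density arithmetic behind those two scaling factors is just geometry; a quick sketch (the 0.7x and 0.85x figures are from the discussion above):

```python
# Linear shrink factor vs. the density gain it implies (area scales as the square).

def density_gain(linear_scale: float) -> float:
    """Shrinking both x and y by linear_scale divides area by linear_scale^2."""
    return 1.0 / linear_scale ** 2

print(f"0.70x linear -> {density_gain(0.70):.2f}x density")  # classic node: ~2x
print(f"0.85x linear -> {density_gain(0.85):.2f}x density")  # DTCO-era node: ~1.4x
```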

Customers of the new nodes go into deep discussions with the technology-definition folks to define what they want for their targets, and there are always tradeoffs and risk/reward decisions. Intel clearly saw that at 10nm! Nobody is going to share actual numbers for Ceff or any other direct figure. At VLSI or IEDM they will show some percentage improvement over the last node and a relative marketing number.


Don't worry, the economics of profit, product, and business require us all to push forward to stay employed and keep business going. And you suckers to buy new gadgets. The trillion-dollar drive will produce innovations that result in smaller numbers and new products, so stop worrying about what the number means!

Those in design or process know the exact numbers; others are just Monday-morning QBs, LOL
 
These days, through DTCO and other tricks, they get maybe 0.85x scaling and 15% performance.
Ah, but what is "performance"? Look at SRAM, which stalled around the 7nm level. The ideal cell layout was already well known, and none of the tricks used since then to make logic smaller have been much help for SRAM. They can help shrink the array periphery and the rest of the chip, some benefit, but not the cells within the array. CFET is the first technology expected to significantly shrink SRAM.

My point is that for some of the most dominant computing structures in front of us now, the intensive, repetitive, simple arithmetic of AI, we are also already looking at the ideal circuit. And they are heat limited. Packing fins closer together does not reduce capacitance. Changing from fin to ribbon does not reduce capacitance. It will not change voltage if they take the channel-control advantage in the form of reduced Cpp rather than a steeper STS at the same Cpp. Changing from ribbon to fork, or from fork to CFET, does not change voltage, and Ceff is projected to be maybe 7% better going all the way from ribbon to CFET - several nodes for 7%. And if we are not careful, the wiring and wiring efficiency could go in the opposite direction and add more capacitance.

Heat-limited structures will actually get slower if packed more densely. Sure, there are markets like mobile with a lot of dark silicon where the traditional shrink can help - although even they need to figure out how to do a lot more AI compute.

It is not just SRAM that is stalling; a lot of compute is going to stall, too. If we really want to look ahead, we need a roadmap for the energy parameters, and we need process designers to put more emphasis on energy, not density.
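
To put a number on "heat-limited structures get slower if packed more densely": if a block is capped by how much heat the package can remove per cm2, its sustainable clock scales inversely with density times energy per switch. A small sketch, with the cooling limit, density, activity, and Ceff projection all as assumed illustrative values:

```python
# Sustainable clock of a heat-limited block: f_max = P_max / (density * E_sw * alpha).
# All numbers are illustrative assumptions, not measurements of any real process.

def max_clock_ghz(p_max_w_per_cm2, pairs_per_mm2, e_sw_j, alpha):
    """Clock rate at which dynamic power just reaches the cooling limit, in GHz."""
    p_max_w_per_mm2 = p_max_w_per_cm2 / 100.0
    return p_max_w_per_mm2 / (pairs_per_mm2 * e_sw_j * alpha) / 1e9

P_MAX = 150.0    # W/cm^2: assumed removable heat flux
E_SW = 85e-18    # J per switch: assumed (see the earlier sketch)
ALPHA = 0.65     # average transitions per pair per clock: assumed
DENSITY = 2.5e7  # simultaneously clocked CMOS pairs per mm^2: assumed baseline

baseline = max_clock_ghz(P_MAX, DENSITY, E_SW, ALPHA)
cfet = max_clock_ghz(P_MAX, 4 * DENSITY, 0.93 * E_SW, ALPHA)  # 4x denser, ~7% lower Ceff

print(f"baseline: {baseline:.2f} GHz, 4x-denser CFET with 7% lower Ceff: {cfet:.2f} GHz")
# With these assumptions the denser block must clock ~3.7x lower to stay under the same heat flux.
```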
 
If you have a solution that can be assembled by the hundreds of billions, and can yield and ramp from zero to hundreds of millions, we have a job for you.

You are stating the obvious!
Love the armchair QBs
 
If you have a solution that can be assembled by the hundreds of billions, and can yield and ramp from zero to hundreds of millions, we have a job for you.
I use the processes. The people with the jobs of making the processes might want to think about the feedback.
You are stating the obvious!
Love the armchair QBs
Great that you think energy per operation is obvious - so you would agree that numbers showing actual progress would be useful, right?

No one publishes them today; no press or analyst is paying attention. But right now, hot chips like the H100, which everyone agrees are pivotal, do not actually run at their spec speeds for more than a brief burst because they run too hot. The users are aware of this. Shrinking the process does not help if the energy parameters are not improved. So it is obvious to users, but it seems not so much to anyone else.
 