
When did engineers realize scaling below 90nm wasn't like prior scaling? (i.e. Dennard scaling ending)

Xebec

Active member
I'm just curious: when did the engineers designing higher-performance / leading-edge chips like the Pentium 4 actually realize that "Dennard scaling was breaking down"?

I know it was common knowledge by the early 2000s that increasing transistor counts were going to drive up power consumption, but I assume a lot of the design decisions were based on the 'free lunch' aspect of new nodes bringing higher frequencies at lower voltages than before. Was it predictable that this would slow down substantially at the 90nm node?
 

usersegment

New member
From what I've seen some sharp individuals saw it coming in the year 2000, so likely with the 0.18u generation.

Specifically, I will point to a presentation given by Bob (Robert) Colwell at Stanford in February 2004, when Intel was still pushing hard on the NetBurst architecture it had originally hoped/planned would take it to something like 10 or 20 GHz. At that point Prescott had just launched, and Tejas was supposedly still going to push further down that line (though it would be cancelled definitively a little over two months later, along with Jayhawk).


Bob Colwell mentioned during this talk that one of the reasons he left Intel in 2000, after being chief architect on the P4, was that he had been told to execute on 'higher clock speeds' for the follow-up designs, which depended entirely on Dennard scaling not being dead. Between that and thermal density, he claims he saw the writing on the wall, declined to do it, and that was part of why he left Intel.
 

Xebec

Active member
Very interesting - I remember Intel presentations around the launch of the Pentium 4 talking about thermal density as a real future problem. (Though it's interesting that 20 years later, chips push even more power, 200-300W, into even smaller dies without apparent problem.)

How would Bob or others have anticipated the breakdown of Dennard scaling in 2000? Was it already delivering less benefit around the 180nm node, or was there some known, understood physics principle that was going to cause it to stop?
 

benb

Active member
There is no one thing that drives scaling, which means the door is open for more of it, pretty much always. The story of semiconductor process is finding small improvements to keep the ball rolling forward.
Copper metallization replacing aluminum had an impact. Low-K dielectrics as well. High-K dielectrics in the last nodes of planar transistors. Metal gates. Lithography resolution enhancements, off-axis illumination, immersion, and multi-exposure all contributed. Strained silicon.
We’re in a slow-down period of semiconductor process improvement; there have been few big innovations since EUV. So the real question isn’t why Dennard scaling ended, but why it was accepted with nothing done about it. Why is there no EUV replacement?
 

usersegment

New member
One of the themes of Bob's talk from 2004 was that 'you can't make up for the lack of an underlying exponential'. He argued that the industry had been chasing clock speed (time to solution) at the expense of complexity on every other dimension so vigorously that something was fundamentally going to hit a wall and break. In particular, he argued that thermal density (watts per unit area) would hit a wall soon. Was he right? Had we stayed on that power curve, we'd have moved ~two orders of magnitude between 2004 and 2020, as we did between 1990 and 2004, and we'd have 8kW CPUs in our mid-towers now. Collectively we had no choice but to stop riding that dragon, even though it was responsible for most of the performance gains we enjoyed during the dotcom era. To my knowledge, the exponential that used to help us with cost per transistor has since reversed trend and now works against us, and the only beneficial one still on our side is the decline in energy per bit operation. Someone more knowledgeable please educate me, but my impression is that energy per bit has gone linear on recent processes, and that some of the 'gains' come from adding 'everything accelerators' specialized to particular tasks?
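The '8kW' remark above is easy to sanity-check. A minimal sketch, assuming round illustrative numbers (~1 W desktop CPUs around 1990, ~100 W Prescott-era parts in 2004) rather than exact product specs:

```python
# Back-of-envelope extrapolation of the pre-2004 desktop CPU power trend.
# The watt figures are assumed round numbers for illustration only.
p_1990 = 1.0                 # watts (rough)
p_2004 = 100.0               # watts (rough), two orders of magnitude later
growth = (p_2004 / p_1990) ** (1 / (2004 - 1990))  # implied per-year factor
p_2020 = p_2004 * 100.0      # two MORE orders of magnitude by 2020
print(f"{growth:.2f}x/year -> {p_2020 / 1000:.0f} kW by 2020")
```

That trend works out to roughly 1.4x per year and a ~10 kW part by 2020, the same ballpark as the 8 kW figure in the post.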

 

Paul2

Active member
One of the themes of Bob's talk from 2004 was that 'you can't make up for the lack of an underlying exponential'. He argued that the industry had been chasing clock speed (time to solution) at the expense of complexity on every other dimension so vigorously that something was fundamentally going to hit a wall and break. In particular, he argued that thermal density (watts per unit area) would hit a wall soon. Was he right? Had we stayed on that power curve, we'd have moved ~two orders of magnitude between 2004 and 2020, as we did between 1990 and 2004, and we'd have 8kW CPUs in our mid-towers now. Collectively we had no choice but to stop riding that dragon, even though it was responsible for most of the performance gains we enjoyed during the dotcom era. To my knowledge, the exponential that used to help us with cost per transistor has since reversed trend and now works against us, and the only beneficial one still on our side is the decline in energy per bit operation. Someone more knowledgeable please educate me, but my impression is that energy per bit has gone linear on recent processes, and that some of the 'gains' come from adding 'everything accelerators' specialized to particular tasks?


The "wall" at 100W/cm² is still there, without a doubt. Individual dies push to 110-150W/cm², but only with very thorough thermal engineering.
 

Xebec

Active member
The "wall" at 100W/cm² is still there, without a doubt. Individual dies push to 110-150W/cm², but only with very thorough thermal engineering.
Is that wall in reference to logic, or to the overall chip die in general? At least on the desktop we have chips like the 12900K, whose "big" cores are on the order of ~13mm² each while able to consume up to 30W in certain workloads, at stock speeds and voltages. Thanks!
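The figures in the question already answer part of it. A quick sanity check, using the ~13 mm² / ~30 W numbers from the post (not from any datasheet):

```python
# Local power density of one big core, using rough figures from the post.
core_area_mm2 = 13.0    # assumed ~13 mm^2 per P-core
core_power_w = 30.0     # assumed peak per-core draw
area_cm2 = core_area_mm2 / 100.0        # 100 mm^2 = 1 cm^2
density = core_power_w / area_cm2       # W/cm^2
print(f"{density:.0f} W/cm^2")          # well above the classic ~100 W/cm^2
```

That local density (~230 W/cm²) is more than double the classic wall, which is why the logic-vs-whole-die distinction in the question matters: the whole-die average stays much lower than any single hot core.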
 

Xebec

Active member
There is no one thing that drives scaling, which means the door is open for more of it, pretty much always. The story of semiconductor process is finding small improvements to keep the ball rolling forward.
Copper metallization replacing aluminum had an impact. Low-K dielectrics as well. High-K dielectrics in the last nodes of planar transistors. Metal gates. Lithography resolution enhancements, off-axis illumination, immersion, and multi-exposure all contributed. Strained silicon.
We’re in a slow-down period of semiconductor process improvement; there have been few big innovations since EUV. So the real question isn’t why Dennard scaling ended, but why it was accepted with nothing done about it. Why is there no EUV replacement?
Very interesting. Hope this makes sense, but was Dennard scaling then a sort of emergent phenomenon based on all of these factors, rather than something singular like the transistor size itself? Thanks!
 

nghanayem

Member
Very interesting - I remember Intel presentations around the launch of the Pentium 4 talking about thermal density as a real future problem. (Though it's interesting that 20 years later, chips push even more power, 200-300W, into even smaller dies without apparent problem.)

How would Bob or others have anticipated the breakdown of Dennard scaling in 2000? Was it already delivering less benefit around the 180nm node, or was there some known, understood physics principle that was going to cause it to stop?
Some observations I've noticed:
1. Modern CPUs have bigger die sizes; Prescott is almost half the size of a modern 12th-gen CPU. This comparison falls apart somewhat when you consider that much of a modern chip is likely powered off at any given time (reducing the "effective" die size).
2. Cooling solutions are far more elaborate than they once were.
3. In laptops, power consumption is way down from what it was back then (and, speaking subjectively, they perform closer to their desktop kin than they did in the early 2000s).
4. On the desktop, people have simply come to accept higher power consumption. A good example is Bulldozer with its 125W TDP, which was deemed too hot at the time, while today that is acceptable/normal for midrange processors from both AMD and Intel.
 

nghanayem

Member
There is no one thing that drives scaling, which means the door is open for more of it, pretty much always. The story of semiconductor process is finding small improvements to keep the ball rolling forward.
Copper metallization replacing aluminum had an impact. Low-K dielectrics as well. High-K dielectrics in the last nodes of planar transistors. Metal gates. Lithography resolution enhancements, off-axis illumination, immersion, and multi-exposure all contributed. Strained silicon.
We’re in a slow-down period of semiconductor process improvement; there have been few big innovations since EUV. So the real question isn’t why Dennard scaling ended, but why it was accepted with nothing done about it. Why is there no EUV replacement?
To be frank, there is no DUV replacement right now. EUV is being used because there is no other reasonable choice. I don't know the exact numbers, since litho is not my field of expertise, but my understanding is that EUV throughput is still far behind DUV. I could be wrong, but if DUV double patterning were still usable for things like metal 0, then I doubt EUV would be used. (That is unlike technologies such as finFET, which were adopted as soon as they were possible, rather than only after every alternative was exhausted.)
 

robbi165

New member
Device engineers likely understood the impending issues with Dennard scaling from the beginning; they just hoped a solution would be developed at the right time.

The biggest issue is that a MOSFET has a fundamental minimum subthreshold slope at room temperature, which sets a limit on how much voltage is required to switch a transistor on. In the early days, supply voltages were so large that the subthreshold slope limitation was insignificant. However, as channel lengths became smaller and voltages dropped, it eventually became a limiting factor. Because of this, at smaller nodes only small voltage decreases are possible, and thus only small power-density improvements, limiting the ability to increase frequency at the same power.
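The room-temperature floor mentioned above can be computed directly from physical constants. This is the standard ln(10)·kT/q expression for the ideal subthreshold swing, not specific to any process:

```python
import math

# Ideal (best-case) MOSFET subthreshold swing at room temperature:
# the gate must move at least ln(10)*kT/q volts to change the drain
# current by one decade, no matter how well the channel is controlled.
k = 1.380649e-23        # Boltzmann constant, J/K
q = 1.602176634e-19     # elementary charge, C
T = 300.0               # room temperature, K
swing_mv = math.log(10) * k * T / q * 1e3
print(f"{swing_mv:.1f} mV/decade")   # ~60 mV/decade
```

Needing several decades of on/off ratio at ~60 mV each puts a floor of a few hundred millivolts under Vdd for a conventional MOSFET, which is why steep-slope devices (tunnel FETs and the like) keep being proposed as the replacements the post alludes to.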

There have been a number of new devices proposed to overcome this limitation but none have panned out so far and it remains to be seen if they ever will.
 

SPQR54

New member
Guys, back to the basics please.
Device engineers knew all along that Dennard scaling was ending; it was the designers who wouldn't listen.
The Dennard formula, based on the physics of the basic CMOS structure, indicated clearly what was being gained by reducing device size. Its beauty is that the physical factors limiting the scaling were always there to be seen: 1) below a certain thickness the gate insulator becomes conductive (partially solved by moving from silicon dioxide to Hf-based oxides, but a wall is still there); 2) there is a limit on the doping increase required as transistor size shrinks, leading to issues in channel control; 3) the power dissipated per unit area is constant.
To compensate for other phenomena in the device, strain was introduced. FinFET also relaxed the constraint that widening a transistor to increase its current also increased its area. These are all related to the physics of the device and its materials. At that point the factors indicated by robbi165 started to come into play.
The overall solution was to move to metal-gate/high-k structures, confined structures like FinFET and SOI, and now nanowires/nanosheets. That was the point, somewhere between 65nm and 45nm, where true Dennard scaling ended. The structures since then are no longer basic CMOS, so the performance gain per node stopped being constant and fell from 100% at each node to the current 20-25%.

Then you have the processing challenges, but they have nothing to do with Dennard scaling. For example, for a while there seemed to be a barrier at 1 micron, well before the limits of Dennard scaling: the topography of the structures conflicted with the depth of focus of the lithography, but the introduction of chemical-mechanical polishing (CMP) saved the day.

Some of the oldies around this forum should get together and write the in-depth device and processing history complementing Daniel's book ;)
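The constant-field scaling rules being discussed can be written out explicitly. This follows the textbook form of Dennard's 1974 result (dimensions, voltage, and currents all scale by the same factor); the variable names are illustrative:

```python
# Constant-field (Dennard) scaling by a factor kappa < 1 per node.
kappa = 0.7                   # classic ~0.7x linear shrink per node

voltage = kappa               # Vdd scales down with dimensions
area = kappa ** 2             # device footprint
capacitance = kappa           # C ~ area / oxide thickness, both shrink
delay = kappa                 # gate delay ~ C*V/I
frequency = 1.0 / delay       # ~1.4x faster per node
power = capacitance * voltage ** 2 * frequency   # per device: ~kappa^2
power_density = power / area  # the 'free lunch': stays ~1.0
print(round(power_density, 6), round(frequency, 2))
```

Once voltage could no longer track kappa (the gate-leakage and channel-control limits listed above), the `voltage = kappa` line broke, `power_density` stopped cancelling to a constant, and the "free lunch" the thread describes was over.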
 

benb

Active member
Guys, back to the basics please.
Device engineers knew all along that Dennard scaling was ending; it was the designers who wouldn't listen.
The Dennard formula, based on the physics of the basic CMOS structure, indicated clearly what was being gained by reducing device size. Its beauty is that the physical factors limiting the scaling were always there to be seen: 1) below a certain thickness the gate insulator becomes conductive (partially solved by moving from silicon dioxide to Hf-based oxides, but a wall is still there); 2) there is a limit on the doping increase required as transistor size shrinks, leading to issues in channel control; 3) the power dissipated per unit area is constant.
To compensate for other phenomena in the device, strain was introduced. FinFET also relaxed the constraint that widening a transistor to increase its current also increased its area. These are all related to the physics of the device and its materials. At that point the factors indicated by robbi165 started to come into play.
The overall solution was to move to metal-gate/high-k structures, confined structures like FinFET and SOI, and now nanowires/nanosheets. That was the point, somewhere between 65nm and 45nm, where true Dennard scaling ended. The structures since then are no longer basic CMOS, so the performance gain per node stopped being constant and fell from 100% at each node to the current 20-25%.

Then you have the processing challenges, but they have nothing to do with Dennard scaling. For example, for a while there seemed to be a barrier at 1 micron, well before the limits of Dennard scaling: the topography of the structures conflicted with the depth of focus of the lithography, but the introduction of chemical-mechanical polishing (CMP) saved the day.

Some of the oldies around this forum should get together and write the in-depth device and processing history complementing Daniel's book ;)
Well, you have to draw the line somewhere, so why not HKMG (45nm)?
Or channel strain (SiGe) at 90nm?
But is polysilicon really a true CMOS gate material? You could draw the line in the 1980s when poly replaced W.

LOL
 

SPQR54

New member
Exactly. I am drawing the line at 45nm, as I said: that is when the 0.7x reduction in gate-oxide thickness stopped, and the introduction of high-k/metal gate was required to get a factor-2 reduction in area while keeping the factor-2 scaling in the other parameters at the same time. Strain at 90nm did not involve any dimensional change, so no breakdown of the Dennard formula. Same take for poly vs tungsten: better for a number of reasons, but not required to keep the Dennard formula valid. ;)
By the way, W was already an improvement on Al, and let's not start on the benefits of STI vs LOCOS...

And a word of homage for Robert H. Dennard, whom I met only once in the '90s but who made a big impression on a younger me. On top of the scaling law, he was at the origin of the 1T DRAM, for those who are unfamiliar.
 

Fred Chen

Moderator
To be frank, there is no DUV replacement right now. EUV is being used because there is no other reasonable choice. I don't know the exact numbers, since litho is not my field of expertise, but my understanding is that EUV throughput is still far behind DUV. I could be wrong, but if DUV double patterning were still usable for things like metal 0, then I doubt EUV would be used. (That is unlike technologies such as finFET, which were adopted as soon as they were possible, rather than only after every alternative was exhausted.)
If you are one of the few companies that have already invested in EUV equipment, you will find uses for it. Otherwise, you would re-use the existing DUV equipment, e.g., https://www.coventor.com/paper/self...n5-metal-2-self-aligned-quadruple-patterning/
 

Xebec

Active member
Mark Bohr of Intel had a nice paper on this from a little over 15 years ago: https://www.eng.auburn.edu/~agrawvd/COURSE/READING/LOWP/Boh07.pdf

I get the impression from the comments here and the linked articles that the chips that 'hit the wall' (or were cancelled) for lack of Dennard scaling did so for several reasons: designers and engineers expecting continued "classic" voltage/frequency scaling even when it was clearly unlikely to continue; marketing and business models expecting ever-bigger numbers; and a general culture that lived by doubling transistor counts even faster than node scaling provided density, to keep performance improving.

(The point that in the 1980s a new node every 3 years, combined with doubling the die size every 3 years, yielded 2x every 18 months was also interesting.)

Thanks everybody!

I'm curious: is there any 'new formula' since ~65nm/45nm, saying that every ~3-4 equivalent nodes today yield the same voltage/frequency scaling as one old node jump? :)
 

Tanj

Active member
Guys, back to the basics please.
Device engineers knew all along that Dennard scaling was ending; it was the designers who wouldn't listen.
The Dennard formula, based on the physics of the basic CMOS structure, indicated clearly what was being gained by reducing device size. Its beauty is that the physical factors limiting the scaling were always there to be seen: 1) below a certain thickness the gate insulator becomes conductive (partially solved by moving from silicon dioxide to Hf-based oxides, but a wall is still there); 2) there is a limit on the doping increase required as transistor size shrinks, leading to issues in channel control; 3) the power dissipated per unit area is constant.
To compensate for other phenomena in the device, strain was introduced. FinFET also relaxed the constraint that widening a transistor to increase its current also increased its area. These are all related to the physics of the device and its materials. At that point the factors indicated by robbi165 started to come into play.
The overall solution was to move to metal-gate/high-k structures, confined structures like FinFET and SOI, and now nanowires/nanosheets. That was the point, somewhere between 65nm and 45nm, where true Dennard scaling ended. The structures since then are no longer basic CMOS, so the performance gain per node stopped being constant and fell from 100% at each node to the current 20-25%.

Then you have the processing challenges, but they have nothing to do with Dennard scaling. For example, for a while there seemed to be a barrier at 1 micron, well before the limits of Dennard scaling: the topography of the structures conflicted with the depth of focus of the lithography, but the introduction of chemical-mechanical polishing (CMP) saved the day.

Some of the oldies around this forum should get together and write the in-depth device and processing history complementing Daniel's book ;)
Actually, Dennard scaling predates Dennard's writing about it. Carver Mead published an analysis of scaling in the mid-1960s that pretty much nailed the scaling possibilities out to the FET limit, which he expected in the 90nm range. Dennard and he were acquainted at the time; Dennard's predictions were simply the version most people noticed.

Yes, a history of Moore's Law (which was economics, not physics) and all the things along the way which kept it going would be a great project for us all. There have been numerous changes along the way: integration with SiO2 surfaces, change from bipolar to NMOS to CMOS, Dennard scaling of FETs, multilayer metal, chemical-mechanical planarization and the explosion of both device structures and metal levels that allowed, damascene metal, strained channels, new gate dielectrics, tuning the gate bandgap, SOI, full depletion, fins (which you can see both as improved channel control and multiplying the available surface area), stacked ribbons (same improvements again), buried rails, backside power and wiring, etc. And that is just the list for standard logic. There are other technologies to apply to memory (another Dennard contribution!), to EDA, to packaging, to storage, to acceleration, to coherency, ... economics drives innovation and a trillion-dollar industry has a lot of moving parts with synergies.

Dennard scaling was simply the most well-known and long-lived of the boosters on the Moore rocket. But Moore's Law is well beyond economical escape velocity now, which means that improvements can and will keep being applied without drag.
 