
Nodelets/halfnodes take to their zenith: A new process development model

nghanayem

Well-known member
The other day I had a stimulating conversation with a colleague, and I wanted to let the fine folks on SemiWiki in on our little thought experiment. Before getting into the new process development scheme, first some background:

Across the 2000s and most of the 2010s intel pursued what they termed "hyperscaling". This scheme allowed intel to catapult to unquestioned density and performance leadership, and to adopt every major architectural change that appeared during these years 3-5 years ahead of the competition. But as with many things, it worked great until it didn't. The foundries adopted a more conservative (and, most importantly, consistent) roadmap that allowed them to narrow the gap and eventually surpass intel. TSMC are the masters of this development model, with new nodes having minimal high-risk items. TSMC is also famous for their yearly nodelets that plug the gap between new nodes. These nodelets bring greater P&P, and when lithography or DTCO opportunities present themselves TSMC merges these changes in to give substantial area benefits to their customers.

The TSMC model has proven so effective that intel has abandoned its hyperscaling scheme for parallel development of more conservatively targeted nodes. They are also introducing nodelets on a 2-4 quarter cadence to provide extra libraries and performance improvements. In what seems like recent intel fashion, these performance improvements are full-node-level upticks in P&P with minimal density improvements (due to delays, 14nm and 10nm were forced to have similarly massive performance kickers). Samsung does a similar thing with their nodes, but to an even more extreme degree. Since the release of 7LPP, Samsung has added new denser and new higher-performance libraries, SDB, as well as sizable optical shrinks. 5LPP (and hopefully soon 4LPP) is greatly upgraded in capability over its 7LPP stablemate.

With this information and the recent stumbling of TSMC trying to implement a 1.7x density improvement for N3, a new potential development model wormed its way into my head. In this scheme semiconductor firms would try to capture the best parts of both consistent execution and pushing forward daring transistor architectures. This scheme would see a return to 3 years (or more likely 4 or 5 years with the current slowdown of scaling) between new "main" nodes. These nodes would swing for the fences to compensate for the long time between them. As for what design folks would use between these new nodes: they would use even more beefed-up nodelets that are designed in parallel with the main node. The idea is to blur the line between what is a "nodelet" and what is a "full node". To illustrate how this scheme would work, let us take a hypothetical cycle starting at P1276 (marketing name: i4/3).

In our hypothetical P1276, pitches would be larger than the current i4, and there would be no SDB. The goal for this node would likely be somewhere between N5 and 5LPP. This initial version might then be followed by a version that introduces more EUV layers and pitches similar to i4. These changes would be enabled by the prior generation increasing the firm's EUV proficiency/maturity. The next version could then implement SDB. The iteration after that might add a BSPDN with no density improvement. Your final iteration could then offer a further shrink/increase in EUV usage. By the time P1278 came out it would no longer be this huge jump; rather it might be only a slightly bigger jump than what your yearly P1276 iterations were giving. The P1276 team would then move on to either P1280 or P1282 (depending on whether there are two or three teams working in parallel). The innovations for P1276 would partially be powered by the co-development of the new, more aggressive P1278. Technological synergies between the two programs like EUV, SDB, and BSPDN would derisk these process modules, and they would only be merged into the older node as it became safe to do so. Over the course of this new P1278 node, high-NA might then become production ready and be merged into the upcoming year's P1278 iteration. While you were at it you could even give P1276 a swansong final iteration. This iteration would use high-NA to give one last shrink, or move the BEOL to single patterning in an effort to simplify design rules.

My colleague and I saw three main issues with this development model. Even with the co-development/derisking of certain process modules and the additional fudge factor that would be allocated to the "big jump" node, the "big jump" node will still likely be hard to predict. This model would also likely need far more engineers and scientists than the current "TSMC model" requires. Finally, designers might not bother with many of these half-step nodes (N7+ being basically unused by all but Apple comes to mind here). Regardless of the idea's practicality, it was a fun discussion, and I thought the folks here might enjoy the thought experiment.
 
An interesting analogy would be the automotive industry where vehicle models typically refresh on much longer timescales of 6 to 8 years. Usually with a substantive mid-life "facelift" along the way as well as small changes for each new model year.

But the underlying chassis is almost never touched until the next refresh, with the exception of performance tuned sporty versions. The luxury carmakers sometimes also introduce limited run or 'fully custom' '1 of 1' models based on an existing chassis for those willing to pay.
 
Indeed, jet engines follow the same cadence. GE and RR both quietly offer "refresh" engines for those willing to pay. I guess the difference is that in the aircraft industry you don't have as much freedom to postpone the move, as an old engine model can be pulled off the market, unlike semi nodes.
 
Mr. Ng: Nice topic.

What is the goal of your fab at this point?

Help the CPU/FPGA divisions beat AMD?
IFS low die cost market?
IFS high volume customers only?
IFS high performance market?

Is Intel willing to go back to improving older nodes? Does IFS service just want to leave the DUV processes as-is?

Would your strategy change depending on these answers?
 
The idea behind this strategy was to create a consistent execution engine that can bring newer device and BEOL improvements to market at a rapid pace.

Large firms (Apple/intel) would see the largest benefit since they could easily target their products for each year’s node.

Cost for this development model would likely be prohibitive for the foundry itself and any customers who want to use the newest node. This is because changes implemented on a year-to-year basis are large enough that yields would definitely drop significantly from the prior iteration. By the time yield recovers, the next node would be ready. Another issue would be buying and injecting new tools into the process to power the new iterations (extending the depreciation phase of a node, because it would in a way always be ramping). Maybe cost-sensitive chips could use the first iteration after the second comes out, waiting for the second iteration's replacement before moving to the second iteration.

As for improving older processes, process engineers always work to improve cost and yield for nodes running at their factories. As for adding new features, I feel that this model would be too busy with N, N-1, and maybe a final version of N+1 to worry about even older nodes. Which I suppose is fine, since most of TSMC's, Samsung's, and intel's nodes are left alone after they are replaced (with only select nodes getting specialty technology).
 

You've hit the nail on the head there, and it's exactly the same reason foundries won't retrofit BSPDN into older nodes -- it makes no business sense (actually, negative...) so they won't do it... ;-)

TSMC's current process strategy works extremely well for them; once a process is rolled out and stable there are just minor tweaks during its lifetime, which can be taken advantage of without reworking all the IP and libraries. These can be used as-is or with a small optical shrink (like 7nm==>6nm or 5nm==>4nm). New IP can be laid out using new rules (e.g. denser layouts, SDB instead of DDB), but crucially the old layouts can be left untouched.

You can see what happens when this compatibility is lost, which is what happened with N7+ when TSMC introduced some EUV layers to pipeclean the tools -- N7 layouts were not compatible with N7+ rules, and because EUV restrictions were very different to DUV (things like line end spacing) there wasn't even an easy way to port IP. The consequence was that N7+ had almost no customers.
 
Large firms (Apple/intel) would see the largest benefit since they could easily target their products for each year’s node.

And here is the problem which has often been whispered about lately: after Apple and the other big clients are gone, where do you get clients to pick up the freed capacity on an extremely expensive legacy node?

It was already visible with the woefully large capacity at the first immersion ("wet") nodes, and for <20nm you can see just how few companies managed to move there from planar at TSMC. The few who did are all F500 brands.

This is why TSMC is uncharacteristically pushy in trying to move people down from 40nm. They are trying to fill huge capacities at 7nm-20nm after their huge top clients moved away.

Big market change from the previous decade: Chinese cookie-cutter SoC makers are no longer there to fill the capacity any more. The market for cheap synthesized chips from outsourcing sweatshops doesn't exist at <20nm.
 
I don't think they're all F500 brands; we certainly weren't. The point is that you need a big enough business built on a chip to pay back the development costs, and the best way to do this with advanced processes is to sell an end product, not a chip, and definitely not IP. As a rule of thumb the unit value of IP:chip:end product is something like 1:10:100, so you need to sell an awful lot of chips -- 10x as many as products -- to recoup the NRE, and this is very difficult in advanced nodes except for a small number of products, because you need at least several hundred million dollars of revenue. It is well-nigh impossible with IP unless it's something that lots of customers will pay millions of dollars for, like a high-speed SERDES...

Of course as time goes on even the advanced EUV nodes get cheaper (wafer cost and NRE), but they're never going to get down to the cost of (for example) 12nm today.
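As a back-of-the-envelope sketch of that rule of thumb: the 1:10:100 ratio and the several-hundred-million-dollar revenue figure are from the post above, while the NRE, chip price, and margin numbers below are purely illustrative assumptions, not real project figures.

```python
# Illustrative break-even math for a chip on an advanced node, using the
# 1:10:100 IP:chip:end-product unit-value rule of thumb from the post.
# All dollar figures here are hypothetical assumptions.
nre = 300e6          # assumed design NRE on an advanced node, $
chip_price = 50.0    # assumed selling price per chip, $
chip_margin = 0.40   # assumed gross margin per chip sold

profit_per_chip = chip_price * chip_margin
units_to_break_even = nre / profit_per_chip
revenue_needed = units_to_break_even * chip_price

print(f"Chips to recoup NRE: {units_to_break_even / 1e6:.0f} million units")
print(f"Chip revenue needed: ${revenue_needed / 1e6:.0f} million")
# Per the 1:10:100 ratio, recouping the same NRE by selling IP would need
# roughly 10x the unit volume, while selling end products needs ~10x less.
```

With these made-up numbers you land in the hundreds-of-millions-of-dollars revenue range the post describes, which is why only high-volume products pencil out on leading-edge nodes.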
 
Yeah, we didn't think the idea was super feasible. The closest you can probably get is the Samsung 7nm family, or the 15-20% P&P improvements from intel's performance kickers. A bit of a shame, because on paper you get the best of the TSMC model and hyperscaling. However, you end up with either low-capacity nodes or low-utilization fabs that won't allow you to get a lot of yield learning before the new better thing comes to market. I do wonder what could be done (that is realistic) to blur the line between node and nodelet, or if the current model is as close as we can realistically get.
 
Mr. Ng, please continue your analysis. It is worth doing and fun to consider. I would like to hear it. I think your analysis has to include what Mr. Gunslinger is trying to achieve. His strategy is likely to change as his opponents (TSMC and Samsung) move their chess pieces, perhaps in a different manner than expected.
 
As for what we all know from intel's announcements and actions: intel is moving to a low-risk model similar to TSMC's for process development. Nodes are being conservatively designed in parallel now, with contingency plans, and new features are validated on platforms that are already known quantities. The age of aiming for every node to have a greater than 2x density improvement is over. It seems we are still getting sizable performance kickers from intel nodes. What is uncertain is whether new kickers will come out after the successor makes its way to market.

In addition to a focus on cadence and performance per watt, it seems intel has a strong focus on decreasing design complexity and wafer costs. My evidence for this is that intel 4-18A have EUV single patterning for the BEOL, and intel is trying to be a first mover on high/hyper-NA. Intel 16 being their main DUV foundry offering, rather than 14nm or 10nm, also meshes well with this focus on minimizing design complexity and wafer cost.
 
Bad news for TSMC is that brand new 20nm-28nm capacity will be coming to the market circa 2023-2024.

This is capacity built up on the wave of "chippageddon" fears, and 3rd-tier fabs reaching 28nm. UMC was said to have lined up a multiple of its current capacity for 28nm expansion, and that will be very big.

UMC will benefit enormously if SMIC gets hit harder by US sanctions, as they are a huge legacy-process player.

The UMC vs. SMIC struggle is much more interesting for me to watch than the rather predictable TSMC vs. Intel vs. Samsung one.
 
On a historical note, Apple came to TSMC at 20nm after doing the initial Ax SoCs at Samsung. The 20nm-based iPhone 6 was wildly successful, but the transition from 20nm to 16nm (FinFETs) did not go as planned. 20nm added double patterning, which was a challenge, and then came the 3D transistors. Apple split the iPhone 6s between Samsung 14nm and TSMC N16 due to yield issues. The rest of the iPhones have been TSMC-only, using a new process version every year. Apple has what we call a most-favored-nation agreement with TSMC: Apple is first and gets the best pricing. The first TSMC process versions such as N3 will be Apple-only, specifically tuned to Apple devices, foundation IP, and IP blocks. Density and low power are the keys. TSMC N3E is for the masses.

If you look at the Apple Ax SoCs they go from Samsung 90nm to TSMC N4 with N3 coming out next year. The thing to notice is the die size and transistor count. This has a lot to do with design as well but it is incredible what Apple and TSMC have done with their exclusive relationship. Neither Apple nor TSMC would be the companies they are today if not for that partnership, absolutely.

A12 - TSMC N7 6.9 billion transistors @ 83.27 mm2
A15 - TSMC N5 15 billion transistors @ 107.68 mm2
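A quick sanity check of the whole-die densities implied by those two data points (figures taken from the list above):

```python
# Implied transistor density (million transistors per mm^2) for the
# two Apple SoCs quoted above.
chips = {
    "A12 (TSMC N7)": (6.9e9, 83.27),    # transistors, die area in mm^2
    "A15 (TSMC N5)": (15.0e9, 107.68),
}
for name, (transistors, area_mm2) in chips.items():
    mtr_per_mm2 = transistors / area_mm2 / 1e6
    print(f"{name}: ~{mtr_per_mm2:.0f} MTr/mm^2")
```

Note these are whole-die averages (logic plus SRAM, analog, and I/O), so they sit below the peak logic-library densities the foundries quote; the takeaway is the roughly 1.7x density jump from N7 to N5 at a similar die size.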

Apple wanted a new process version every year, so the TSMC half-node methodology began, which was a critical turning point in semiconductor process development. Yes, it is a cautious path focused on yield learning, but as history has shown it is much better to speak softly and carry a big stick than to shout from the rooftops empty-handed.
 
N28 and i16 have metal pitches that allow for all metals to be single patterned.

As for your point Dan, I totally agree. The proof is in the pudding. While going all-in on one aggressive/innovative node after another worked for a time, the process became unwieldy and inconsistent. In the end the tortoise has beaten the hare, and intel turned a 3-5 year advantage into a 2 year disadvantage. I think the shift to a model more similar to TSMC's is the right move.

Our idea originally grew out of the questions "Can intel out-TSMC TSMC?" and "Would it be possible to make nodes conservative enough and half nodes large enough that the line between the two begins to blur?". From the second question came the problems of where the half-node innovations would come from, how you would develop these innovations every year, and how you would add major architectural changes if the main nodes are more scaled back than in the current TSMC model. This is where the idea of aggressive, long-incubation-time moonshot nodes, with mature process modules backported to the current node, came about. In theory this approach would give an Apple an even better chip every year than the current model does. As I previously stated, though, there are great capital costs and potential risks that would likely make this model inferior to the current approach. Even ignoring costs, there is also the question of whether this model could even deliver greater PPA than the current model.
 
Here is why Intel Foundry or Samsung Foundry will never catch TSMC in practicality. TSMC has a massive ecosystem of collaborative customers, partners, and suppliers that will never be replicated. We are talking about many trillions of dollars in combined R&D to get TSMC where it is today. Never bet against the power of the semiconductor ecosystem because you will lose.

Yes, Intel was first to FinFET, but TSMC won the final FinFET battle. Yes, Samsung was first to GAA, but that battle is far from over. TSMC is the world's most trusted foundry for a reason: delivering the wafers you need when you need them. Nobody has a better track record than TSMC in this regard, and that is critical for the top foundry customers. Does TSMC need to be first to press release? No. Does TSMC need to have the best PowerPoint PPA? No. TSMC just needs to deliver the best customer-designed chips, and that is what the semiconductor ecosystem is all about.
 
For sure. This model wasn't so much conceived as "Intel's master plan to be the world's number one foundry". Rather, it was a thought exercise on whether TSMC, Samsung, or Intel could somehow have the best of both the hyperscaling and TSMC node development schemes. The question of ecosystems is the duty of those companies' foundry business units and is an independent issue from their fabs competing in the "Moore's law" rat race. Of course the two can influence each other, but I feel that is beyond the scope of this thought experiment. For example, GF has been making excellent strides in building up a strong, compelling foundry ecosystem. For many years now UMC has had a compelling foundry ecosystem. However, nobody would say that GF or UMC are "pushing Moore's law forward".
 
As for what we all know from intel's announcements and actions: intel is moving to a low-risk model similar to TSMC's for process development. Nodes are being conservatively designed in parallel now, with contingency plans, and new features are validated on platforms that are already known quantities. The age of aiming for every node to have a greater than 2x density improvement is over. It seems we are still getting sizable performance kickers from intel nodes. What is uncertain is whether new kickers will come out after the successor makes its way to market.

In addition to a focus on cadence and performance per watt, it seems intel has a strong focus on decreasing design complexity and wafer costs. My evidence for this is that intel 4-18A have EUV single patterning for the BEOL, and intel is trying to be a first mover on high/hyper-NA. Intel 16 being their main DUV foundry offering, rather than 14nm or 10nm, also meshes well with this focus on minimizing design complexity and wafer cost.
My understanding is Intel getting high NA first is misleading in its significance at best.
 
My understanding is it's a bit of column A and a bit of column B. The reticle size is much smaller, but with the move to disaggregated architectures and mobile SoCs already being tiny, this isn't the biggest deal in the world (although it does hurt). The resolution issues can and likely will eventually be ironed out (they have to be, unless we want EUV quad patterning). As for deliveries, of course TSMC and Samsung will get their first units right after intel. However, if intel is assisting ASML in getting high/hyper-NA across the finish line, intel will likely be high priority in line for new systems (this exact situation happened with EUV). ASML also has a vested interest in intel's success. If TSMC becomes the only leading-edge logic company then ASML's long-term profits will suffer.

Back to the point at hand, I mentioned high-NA for two reasons. First, single or double patterning with high-NA allows intel to ramp their EUV capacity faster and helps narrow the EUV gap to Samsung and TSMC at a faster rate. Second, for whatever reason it seems intel wants to prioritize BEOL single patterning over density. To me that signals that intel wants high/hyper-NA ASAP, to either avoid or minimize the use of SALELE. My guess is that intel sees this quality-of-life feature as beneficial to IFS given how limited their ecosystem is right now. In their 18A paper they literally said that customers were asking for single patterning.
 
I agree fully. My main point, however, was that TSMC really isn't getting high-NA EUV much later than Intel; I believe TSMC is getting it in 2024, as per this article. Pat has made it out like Intel is getting some huge jump on TSMC, when that narrative is misleading at best, especially considering TSMC has much more experience with mass production using base EUV. If I'm wrong here please correct me, but this is what I've heard. https://www.taipeitimes.com/News/biz/archives/2022/06/18/2003780053
 