
TSMC officially halts SRAM scaling

Surely everyone saw this coming, right? This is a key reason why chiplets are such a good idea: offload the SRAM, which has been scaling horribly for years now. You can even use a cheaper process for the SRAM. Monolithic dies are an endangered species (my opinion).

They will be here for cost-sensitive chips for a very long time. A friend of mine worked on an idea for a cheap die-stacking solution for MCUs, one that didn't exactly involve TSVs, to let them use off-chip SRAM/eFLASH, which had gotten ridiculously expensive on old nodes because of the need to run WiFi/Bluetooth stacks in software.

Nothing came of it, not because of technical infeasibility, but for lack of interest.

A few MB of cache is fine at <16nm, but is a disaster on even not-so-old legacy nodes. Similarly, eFLASH taking up space on expensive nodes is a disaster.

The expense of co-packaged SRAM is not yet a big deal for MCU makers, but it is for CPU makers, which nevertheless have much fatter profit margins. I am puzzled.

I guess AMD's off-die L3 is a test run for much, much bigger SRAMs, rather than an attempt to reduce the cost of current L3 sizes.
 
A separate SRAM would be an interesting return to the days of the Pentium Pro: https://en.wikipedia.org/wiki/Pentium_Pro#Caching
AMD uses vertical integration to get the bandwidth (and yes, they have said it will be on Zen 4 as well, just not the first chips out). Intel has its "Rambo cache". At the L3 level, arguably your SRAM does not need to compete with logic for space on your most expensive node, when you can add a nanosecond and halve the price two nodes behind.
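The "add a nanosecond" trade-off can be sanity-checked with the standard average-memory-access-time (AMAT) recurrence. All latencies and miss rates below are made-up illustrative numbers, not measurements of any real part:

```python
# Toy AMAT comparison: how much does ~1 ns of extra L3 latency
# (e.g. off-die SRAM two nodes behind) actually cost?
# All numbers are assumed for illustration, not real measurements.

def amat_ns(l1_hit, l1_miss, l2_hit, l2_miss, l3_hit, l3_miss, dram):
    """AMAT = L1 hit + L1 miss rate * (L2 hit + L2 miss rate * (L3 hit + L3 miss rate * DRAM))."""
    return l1_hit + l1_miss * (l2_hit + l2_miss * (l3_hit + l3_miss * dram))

on_die  = amat_ns(1.0, 0.10, 4.0, 0.30, 10.0, 0.20, 80.0)
off_die = amat_ns(1.0, 0.10, 4.0, 0.30, 11.0, 0.20, 80.0)  # L3 +1 ns

print(f"on-die L3 AMAT:  {on_die:.3f} ns")
print(f"off-die L3 AMAT: {off_die:.3f} ns")
print(f"penalty: {100 * (off_die / on_die - 1):.1f}%")
```

Because the extra nanosecond is multiplied by the L1 and L2 miss rates before it reaches the core, with these assumed numbers it moves AMAT by only a percent or two, which is the intuition behind tolerating slightly slower but much cheaper SRAM.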

If you look at the authors and designs in the TSMC papers on SRAM, you can see they have a specialist group making a test chip for each node. If you use EDA tools to generate an SRAM block, you will not see anything like that density on your logic chip. AMD may have gone with an SRAM-optimized process for their V-Cache die, which seems denser than the cache on the CPU dies it directly aligns to.

I have seen TSMC using an Arm A72 as a benchmark for the N3 node variants for size, clock speed, and power, which makes sense. Not sure what mix of cache sizes was synthesized for the comparisons.
 
I guess I should have specified high-performance chips. Monolithic dies will absolutely continue to exist for products that don't need to feed the increasingly cache-hungry logic at the forefront of computing. But for said bleeding edge, I just don't see how you could add enough SRAM/cache without getting into the realm of massively oversized dies and the horrifically uneconomic yield issues that entails.
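The oversized-die yield argument can be made concrete with a toy Poisson defect-yield model, Y = exp(-A * D0). The die areas and defect density below are assumptions for illustration, not real process data:

```python
from math import exp

def poisson_yield(area_mm2: float, d0_per_cm2: float) -> float:
    """Toy Poisson defect-yield model: Y = exp(-A * D0)."""
    return exp(-(area_mm2 / 100.0) * d0_per_cm2)  # 100 mm^2 per cm^2

D0 = 0.1  # assumed defect density in defects/cm^2 (illustrative)

mono = poisson_yield(600, D0)     # one hypothetical 600 mm^2 monolithic die
chiplet = poisson_yield(150, D0)  # one of four hypothetical 150 mm^2 chiplets

print(f"600 mm^2 monolithic die yield: {mono:.1%}")
print(f"150 mm^2 chiplet yield:        {chiplet:.1%}")
```

Splitting the same silicon four ways yields dramatically better per die under this model; real chiplet designs give some of that back in packaging/bonding yield and interconnect overhead, so the win is smaller than the toy numbers suggest.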
 
Forksheet and CFET allow NMOS and PMOS to sit closer together, so there is still room to shrink the SRAM bitcell.
These are based on GAA nanosheet FETs, and there are already signs this architecture has some scaling difficulty due to the S/D vertical resistance.

https://intechopen.com/chapters/73506 There are Ids differences noted between the top and bottom nanosheets.

CFET is supposed to go further than Forksheet, with a vertical rather than lateral n-p arrangement, but how do you contact the lower transistor? Probably with a buried line/contact; otherwise, gate pitch will suffer.
 
The implication of FinFET SRAM non-scaling is huge. SRAMs are just regular transistors, with no capacitor. The SRAM test chip is what is used to demonstrate MMP, CPP, and track height, meaning it is descriptive of, and generalizes to, the entire node. So the inescapable conclusion is that FinFET scaling has hit a wall. A wall we all knew was there: physics, our old friend.
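The halt is visible in a back-of-envelope density calculation from the high-density 6T bitcell areas widely reported for recent TSMC nodes. Treat the figures below as approximations from public reporting, not official data:

```python
# Raw 6T bitcell density implied by widely reported TSMC HD bitcell areas.
# These areas are approximations from public reporting, not official data.
bitcell_um2 = {"N7": 0.027, "N5": 0.021, "N3": 0.0199}

for node, area_um2 in bitcell_um2.items():
    # 1e6 um^2 per mm^2, one bit per 6T cell -> Mbit/mm^2 (raw, no macro overhead)
    mbit_per_mm2 = (1e6 / area_um2) / 1e6
    print(f"{node}: {area_um2} um^2/cell -> ~{mbit_per_mm2:.1f} Mbit/mm^2 raw")

n7_to_n5 = bitcell_um2["N5"] / bitcell_um2["N7"]
n5_to_n3 = bitcell_um2["N3"] / bitcell_um2["N5"]
print(f"N7 -> N5 bitcell area: {n7_to_n5:.2f}x (~{1 - n7_to_n5:.0%} shrink)")
print(f"N5 -> N3 bitcell area: {n5_to_n3:.2f}x (~{1 - n5_to_n3:.0%} shrink)")
```

With these reported areas, N7 to N5 shrank the bitcell by roughly a fifth, while N5 to N3 shrank it only a few percent, which is what the thread title is about.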

This helps explain why Samsung went with GAA. Maybe if you want dense SRAMs, you need to be talking to Samsung about 3nm. Memory is the main business there, after all.
 