12FFC has 1.2x the density of 16FF+ with 25% lower power / 10% higher performance, but this is the first time I've heard of 12FFN. What do we know about this 16FF+ optimization?
Since NVDA is already on 16nm, moving to 12nm is not a big challenge. From what I have heard, NVDA will skip 10nm and move directly to 7nm HPC, which they co-developed with TSMC.
In terms of die size and transistor count, NVIDIA is genuinely building the biggest GPU they can get away with: 21.1 billion transistors at a massive 815 mm², built on TSMC's still-green 12nm "FFN" process (the "N" stands for NVIDIA; it's a customized, higher-performance version of 12nm tailored for NVIDIA).
I just got through the NVIDIA collateral and let me say, WOW, what an incredible run they are having. The question I have is who will catch them? Certainly not AMD or Intel.
I am wondering about the inference stuff. NVDA seems to be chasing that market with GPUs, but it seems like overkill to me; FPGAs seem much more suited. I do understand the single-vendor approach to AI, one vendor for both learning and inference, but on a chip-vs-chip level FPGAs seem much better suited. Thoughts?
AI is a target market for TSMC so yes, they will work closely with NVDA. You should also know that NVDA has been working closely with TSMC since the beginning of both companies, 20+ years ago. In fact, Jensen Huang and Morris Chang are very close friends.
Agreed on inference. The trend is toward slimming down neural nets for inference to get much lower power and area: one- to four-bit multiplication and sparse-matrix handling, for example. That motivates more specialized hardware/IP, potentially even for high-volume applications. FPGAs? Maybe, though power remains a concern at edge nodes.
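To make the low-bit point concrete, here's a minimal sketch (plain NumPy, not any particular vendor's inference stack) of symmetric 4-bit weight quantization followed by an integer multiply-accumulate; the layer sizes and scaling scheme are made-up illustration values.

```python
import numpy as np

def quantize_int4(x):
    """Symmetric per-tensor quantization to the 4-bit signed range [-8, 7]."""
    scale = np.max(np.abs(x)) / 7.0           # map the largest magnitude onto +7
    q = np.clip(np.round(x / scale), -8, 7)   # snap to the 4-bit integer grid
    return q.astype(np.int8), scale           # stored in int8; NumPy has no int4 dtype

# Hypothetical fully connected layer: 256 inputs, 64 outputs
rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 256)).astype(np.float32)
activations = rng.standard_normal(256).astype(np.float32)

q_w, s_w = quantize_int4(weights)
q_a, s_a = quantize_int4(activations)

# Integer multiply-accumulate (the part dedicated inference hardware accelerates),
# followed by a single rescale back to floating point per output.
acc = q_w.astype(np.int32) @ q_a.astype(np.int32)
approx = acc * (s_w * s_a)

exact = weights @ activations
print("max abs error vs fp32:", np.max(np.abs(approx - exact)))
```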
As I've said before, GPU computing is ideal for algorithms that can benefit from parallelism, and inference and NNs certainly do. FPGAs are highly suitable for algorithms that benefit from reprogrammability; think search.
I was just at the Eurocrypt conference. The limiting step in breaking RSA public-key encryption is large matrix multiplications. That step is not easily parallelizable; the best known cost for sparse matrices is n squared (number of rows times number of columns), and for non-sparse matrices it is about n^2.8. The matrices are huge. I wonder what special hardware for sparse matrix multiplication would look like.
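For a feel of what "special hardware for sparse matrix multiplication" has to cope with, here is a minimal CSR (compressed sparse row) matrix-vector multiply in plain Python/NumPy; the tiny matrix is arbitrary illustration data. The data-dependent gathers through the column-index array are exactly the irregular memory accesses that dedicated sparse units try to handle better than a generic pipeline.

```python
import numpy as np

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x with A stored in CSR form: only nonzeros are touched."""
    y = np.zeros(len(row_ptr) - 1, dtype=x.dtype)
    for row in range(len(y)):
        start, end = row_ptr[row], row_ptr[row + 1]
        # Gather: the column indices drive irregular reads of x.
        y[row] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

# Tiny example matrix:
# [[10,  0,  0],
#  [ 0, 20, 30],
#  [40,  0, 50]]
values  = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
col_idx = np.array([0, 1, 2, 0, 2])
row_ptr = np.array([0, 1, 3, 5])
x = np.array([1.0, 2.0, 3.0])

print(csr_matvec(values, col_idx, row_ptr, x))  # [ 10. 130. 190.]
```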
Also interesting is work on password-hashing algorithms designed so that password-cracking algorithms (exhaustive search) inherently require maximum memory and maximum non-parallelizable compute time, so that cracking ASICs can't be built.
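As an existing example of that idea (not necessarily the work presented at Eurocrypt), memory-hard key-derivation functions such as scrypt deliberately tie each password guess to a large memory footprint. A quick sketch with Python's standard-library hashlib; the cost parameters are illustrative:

```python
import hashlib
import os

password = b"correct horse battery staple"
salt = os.urandom(16)

# n = CPU/memory cost (power of two), r = block size, p = parallelism.
# These settings force roughly 128 * n * r bytes (~16 MB here) per guess,
# which is what makes purpose-built cracking ASICs expensive.
key = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)
print(key.hex())
```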
Jensen said the 815 mm² size chip was "reticle limited". Working backward from the photos, I'm guessing the mask is around 10cm by 13cm, so indeed you can only fit one on a 6 inch square reticle. Do you people think that will be a trend for processors? What does that mean for yields?
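For a rough sanity check of that guess, here's some back-of-the-envelope arithmetic under the usual assumptions: a maximum scanner exposure field of about 26 mm x 33 mm at the wafer, 4x reduction optics, and a guessed die aspect ratio.

```python
# Rough reticle-limit check. Assumes a 26 mm x 33 mm maximum exposure field
# and 4x reduction lithography; the GV100 die aspect ratio below is a guess.
field_w_mm, field_h_mm = 26.0, 33.0
die_area_mm2 = 815.0

print("max field area (mm^2):", field_w_mm * field_h_mm)  # ~858, so 815 leaves little margin

# On a 4x mask the printed image is 4x larger in each dimension: a die of
# roughly 26 mm x 31 mm prints as about 104 mm x 125 mm, i.e. ~10 cm x 13 cm,
# filling most of a 6-inch (152 mm) reticle plate.
die_w_mm = 26.0                      # assumed to span the full field width
die_h_mm = die_area_mm2 / die_w_mm
print("mask image (mm): %.0f x %.0f" % (die_w_mm * 4, die_h_mm * 4))
```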
Nothing good. Max dice per wafer ~60, yield below 50%. So most likely at most 30 good chips per wafer. Of course, the prime dice are sold for thousands of dollars, so the profit is still there.
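Those numbers are roughly what a standard dies-per-wafer estimate plus a simple Poisson yield model give; the defect density below is an assumed illustrative value, not a published TSMC figure.

```python
import math

die_area_mm2 = 815.0
wafer_diam_mm = 300.0

# Gross dies per wafer (common approximation that accounts for edge loss).
gross = (math.pi * (wafer_diam_mm / 2) ** 2 / die_area_mm2
         - math.pi * wafer_diam_mm / math.sqrt(2 * die_area_mm2))
print("gross dies per wafer:", round(gross))               # ~63

# Poisson yield model Y = exp(-A * D0), with D0 an assumed defect density.
d0_per_cm2 = 0.1
yield_frac = math.exp(-(die_area_mm2 / 100.0) * d0_per_cm2)
print("yield: %.0f%%" % (100 * yield_frac))                # ~44%
print("good dies per wafer:", round(gross * yield_frac))   # ~28
```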
Just wondering: 12nm is 16nm with a 6-track cell instead of 7.5 or 9. But I've heard that reducing the track height also reduces performance, and yet 12nm claims 10% better performance. So how do they do it? Presumably they've changed something else apart from the track height?
That's 12FFC, the mobile node. We do not know the specs of 12FFN. It's possible that 12FFN still uses 7.5 track cells, because if you compare the transistor densities of GP100 and GV100, there's hardly any difference.
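A quick check of that comparison using the published figures (15.3 billion transistors on a 610 mm² die for GP100 versus 21.1 billion on 815 mm² for GV100):

```python
# Transistor density, GP100 (16FF+) vs GV100 (12FFN).
gp100 = 15.3e9 / 610.0   # ~25.1 million transistors per mm^2
gv100 = 21.1e9 / 815.0   # ~25.9 million transistors per mm^2
print("GP100: %.1f MTr/mm^2" % (gp100 / 1e6))
print("GV100: %.1f MTr/mm^2" % (gv100 / 1e6))
print("density gain: %.0f%%" % (100 * (gv100 / gp100 - 1)))  # only ~3%
```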
12FFN wafer pricing isn't public, but for a die this big the manufacturing cost per good chip could easily be a couple of thousand dollars. I heard in the past, for a similar-size chip in a different process, that a "good wafer" was one which yielded *a* working die -- hopefully 12FFN is better than this...