The SIMD part mostly isn't what gets used for AI; it's the matmul hardware, the tensor core part, that gets used. A TPU is just a large matmul accelerator (gross simplification).
You're correct that Blackwells, for example, aren't the (relatively!) simple systolic arrays that GPUs used to be, but SIMD is used actually or conceptually (single instruction stream on multiple threads) in various execution units and dataflows in the chips. I will argue that Nvidia's experience with SIMD concepts and implementations has given them a leg up over companies that have never implemented and productized SIMD or SIMD-like architectures before. The same goes for AMD. And Google has seven generations of getting systolic arrays right in TPUs. I know that you know that theory is easy compared to making an implementation function reliably, with high performance and high efficiency, especially in silicon gates.
AMD has been shipping client GPUs for quite a few generations, so they have the experience. The issue with AMD is not hardware but software: judged purely as hardware, their products are very good, but the software and tooling are not mature. AMD was able to come back in CPUs because its hardware was compatible with the already mature x86_64 ecosystem. If you could somehow drop in AMD hardware and run CUDA and the rest on it, Nvidia would be in a tight spot. As for your last sentence, I wholeheartedly agree.
Hate to say it because Intel is so decimated already, but you may get your wish. Another round of layoffs coming I think, kicked off by Jack Dorsey. That middle layer is red meat.