The weak point of AI hardware is supplying enough memory bandwidth to feed the cores, and it sounds like they aren't using HBM. How can it beat Nvidia without HBM?
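To make the bandwidth point concrete, here's a rough back-of-envelope sketch (the model size and token rate below are illustrative assumptions, not any vendor's specs): during LLM decode, every generated token has to stream essentially all of the model's weights through the memory system, so weight bandwidth rather than raw FLOPS tends to set the ceiling.

```python
def min_bandwidth_gb_s(params_billion: float, bytes_per_param: float,
                       tokens_per_sec: float) -> float:
    """Lower bound on memory bandwidth needed to sustain a decode rate.

    Ignores KV-cache traffic, batching, and any caching effects, so the
    real requirement is higher, not lower.
    """
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return weight_bytes * tokens_per_sec / 1e9

# Hypothetical example: a 70B-parameter model with 8-bit weights,
# decoding at 30 tokens/s for a single user.
print(min_bandwidth_gb_s(70, 1.0, 30))  # ~2100 GB/s -- HBM-class territory
```

Even with generous quantization, a single-user decode rate in the tens of tokens per second lands in the terabytes-per-second range, which is exactly the regime HBM was built for.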
That's what it reads like. The most successful instruction set emulator in software I'm aware of is Apple's Rosetta (now Rosetta 2) for running x86 code on its M-series processors.
Your post made me go back and read about Transmeta... I forgot it was yet another VLIW design that lost out to superscalar architectures. Transmeta's lead founder, Dave Ditzel, later worked at Intel for a while, apparently trying to build a follow-on generation to the Transmeta design, but nothing seems to have come of it.
In my view, any company serious about LLM acceleration hardware has to be talking about circuit improvements through architectural innovation and optimization for full-stack attention / transformer-based inference - not "we make a faster, parallel universal processor that now does FP4."
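As a hedged illustration of why attention-level optimization matters more than raw datapath speed, here is a quick KV-cache sizing sketch. The configuration numbers are hypothetical (roughly a 70B-class model with grouped-query attention), chosen only to show the scale of the problem, not to describe any specific product.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int) -> int:
    """Total K+V cache size for one sequence (2x for keys and values)."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative config: 80 layers, 8 KV heads, head_dim 128, fp16 cache,
# 8K-token context.
cache = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                       seq_len=8192, bytes_per_elem=2)
print(f"KV cache per sequence: {cache / 2**30:.2f} GiB")  # ~2.5 GiB
```

At full context, every decoded token has to stream that entire cache through the attention units on top of the model weights, which is why cache layout, scheduling, and KV quantization matter at least as much as how fast the multiply units run in FP4.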
Thought this talk was eye-opening on the kinds of hardware / software challenges that are the bottlenecks today along with possible solutions.