The beginning of the end of NVIDIA? And nobody is paying attention.

Daniel Nenni

Founder
Staff member
Is NVIDIA the new IBM? In the 1970s, a cheap $25 microprocessor running open software took down monolithic mainframe giants. Today, history is repeating itself. Jim Keller’s open-source RISC-V Wormhole architecture is stripping the power away from proprietary hardware and handing it back to software. The era of the GPU monopoly might be closer to its end than anyone realizes.

"If you want to know the future, look at the past."

Albert Einstein

Intel 4004
The single biggest revolution to completely change the computer industry began in 1971 with the introduction of the first microprocessor, the Intel 4004. Before it, tech giants like IBM, DEC, and Cray ruled the world. Their computers were monolithic beasts built out of massive, expensive, hardwired electronic circuitry running proprietary, specialized software. Specialized hardware was king. They were the NVIDIAs of their era.



Gary Kildall created the world's first operating system for microprocessors, CP/M, altering the course of computing history forever.

The tiny Intel 4004 CPU, built for a lowly calculator, shattered that entire industry. It led to Gary Kildall inventing CP/M, the world’s first operating system for microprocessors. Kildall also invented the BIOS, a software abstraction layer for hardware, and pioneered the open architecture that allowed generic software to replace specialized hardware circuits and closed mainframe code. Kildall's architecture and operating system set the exact blueprint for Microsoft's and IBM's commercialization of the modern PC.


Steve Jobs and Steve Wozniak using Chuck Peddle's $25 MOS 6502 CPU to take on the entire corporate establishment and tech titan IBM.

This shift from specialized hardware to software led to cheap microprocessors like the MOS 6502, sparking the home computer revolution. Computers were suddenly in the hands of common people, not just corporations. At the time, it was unfathomable that anyone could beat a tech giant like IBM (the NVIDIA of its day). Yet, a cheap, toy $25 CPU running open software, in the hands of a tinkering Steve Wozniak, brought those empires to their knees.

History is repeating right now:


  • RISC-V is the new MOS 6502
  • NVIDIA is the new IBM
NVIDIA's specialized, hardwired GPUs are vulnerable to the next software revolution, just like IBM's 1970s mainframes. They are massively complex, proprietary electronics with power-hungry hardware cache controllers, and they require highly specialized, non-portable software: CUDA. They are also fiercely expensive, mimicking the mainframes of yesteryear.

RISC-V is the new $25 MOS 6502

Enter stage right: legendary chip designer Jim Keller's Wormhole architecture and RISC-V CPUs. This is the new Intel 4004 and MOS 6502, the foundation of what could eventually change everything.

With Wormhole, there are no complex hardwired electronics or automated hardware caches like a modern GPU. Instead, there is a grid of tiny RISC-V CPUs and dedicated FPUs. Instead of rigid hardware circuitry, a software compiler choreographs the dataflow across the grid on the fly, giving it the flexibility of an FPGA entirely through software.

The mic drop? It's all open source and completely programmable through general software. Just like in the 1970s, when CPUs running general software and open operating systems ended up replacing highly specialized mainframe hardware, software-defined architectures are poised to win again.

"I didn't know enough to know it was impossible." - Steve Wozniak

Back in 1971, the dominant tech giants were deemed invincible. But unknown oddballs playing with what were considered cheap toys in garages started a software revolution that ended the tech empires of that time.

Possibly, the next Kildalls and Wozniaks are hidden somewhere right now, envisioning a future that others see as impossible and going against the bloat of mainstream trends.

As Alan Kay famously said: "The computer revolution hasn't happened yet."


"Look back over the past, with its changing empires that rose and fell, and you can foresee the future, too." - Marcus Aurelius

If history repeats, the tech juggernaut of proprietary, hardwired GPU dominance will only be a chapter in a history book, barely remembered by the next generation. Who would've thought in 1971 that a simple Intel 4004 CPU for a small calculator would lead to the ultimate dethroning of tech giants like IBM, DEC, and Cray?

...Fate, it seems, is not without a sense of irony: RISC-V targets plain old languages like C and C++ and open dataflow frameworks, not proprietary lock-ins like CUDA.

But that is another story for another time.

 

From Google:

While Tenstorrent's Wormhole architecture introduces highly innovative ideas—like decentralizing workloads and treating a chip cluster as a single large network—it comes with notable weaknesses, trade-offs, and scaling disadvantages compared to traditional GPUs like NVIDIA's. [1]

## 1. The HBM vs. GDDR6 Bottleneck

To keep hardware manufacturing costs low, Wormhole uses standard graphics memory (GDDR6) instead of ultra-premium High Bandwidth Memory (HBM). [2, 3]

* The Disadvantage: While GDDR6 is highly affordable, it offers drastically lower raw memory bandwidth than HBM3 or HBM3e.

* The Impact: Massive LLMs (like Llama 3 70B) are intensely memory-bandwidth bound. Because Wormhole cannot feed data to its compute engines as fast as an HBM-equipped NVIDIA H100 or H200, it faces significant performance bottlenecks during large-scale model inference. [2]
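
The bandwidth argument above can be made concrete with a back-of-the-envelope estimate. The sketch below assumes single-stream decoding in which every weight is read from memory once per generated token, and it ignores KV-cache traffic, batching, and sharding across chips. The function name and the two bandwidth figures are illustrative placeholders, not vendor specifications; substitute datasheet numbers for a real comparison.

```python
def decode_tokens_per_second(num_params, bytes_per_param, mem_bandwidth_bytes_per_s):
    """Rough upper bound on decode rate for a memory-bandwidth-bound model.

    Assumes each generated token requires streaming all weights from
    memory exactly once (no batching, no KV-cache traffic).
    """
    bytes_per_token = num_params * bytes_per_param
    return mem_bandwidth_bytes_per_s / bytes_per_token


if __name__ == "__main__":
    params = 70e9        # a 70B-parameter model, as in the Llama 3 70B example
    bytes_per_param = 1  # assumption: 8-bit weights

    # Hypothetical bandwidths chosen only to show the scaling.
    for label, bandwidth in [("GDDR6-class (assumed 0.5 TB/s)", 0.5e12),
                             ("HBM-class (assumed 3 TB/s)", 3.0e12)]:
        rate = decode_tokens_per_second(params, bytes_per_param, bandwidth)
        print(f"{label}: at most {rate:.1f} tokens/s per stream")
```

Under these assumptions the bound scales linearly with bandwidth, which is why the memory choice matters more than peak compute for this class of workload.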

## 2. Software Maturity & The "CUDA" Gap

NVIDIA's dominance is heavily protected by its mature CUDA ecosystem, which has been optimized by developers for nearly two decades.

* The Disadvantage: Tenstorrent's software stack (like TT-Metalium and TT-NN) is relatively young. Academic benchmarks have noted that while the hardware interconnects are blazing fast, multi-device orchestration layers still experience software-driven performance overhead. [4, 5]

* The Impact: It is rarely a "plug-and-play" experience. Getting top-tier performance out of Wormhole requires compilation optimization, and many mainstream machine learning models require manual tuning to run effectively. [4, 6]

## 3. High Programming Complexity (The Bare-Metal Burden)

Wormhole's dataflow architecture forces a completely different way of thinking about code.

* The Disadvantage: To extract maximum efficiency from a Tensix core, developers must write three separate micro-kernels per core: a reader kernel (to pull data via NoC), a compute kernel (for the math), and a writer kernel (to push data out). These must be manually synced using circular memory buffers in the local SRAM. [7]

* The Impact: This introduces a steep learning curve. While Tenstorrent provides high-level Python wrappers, optimizing performance for custom or cutting-edge AI models requires elite, bare-metal C++ programming skills. [8, 9]
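
The reader / compute / writer split described above is a three-stage producer-consumer pipeline. The following is a host-side analogy only, not TT-Metalium code: Python threads stand in for the three kernels, and bounded `queue.Queue` objects stand in for the circular buffers in SRAM. All function names here are hypothetical.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker; real tiles must not be None


def reader(source, cb_in):
    """Stand-in for the reader kernel: pulls tiles and fills the input buffer."""
    for tile in source:
        cb_in.put(tile)  # blocks while the bounded buffer is full
    cb_in.put(SENTINEL)


def compute(cb_in, cb_out, op):
    """Stand-in for the compute kernel: transforms each tile."""
    while True:
        tile = cb_in.get()  # blocks while the buffer is empty
        if tile is SENTINEL:
            cb_out.put(SENTINEL)
            return
        cb_out.put(op(tile))


def writer(cb_out, sink):
    """Stand-in for the writer kernel: drains results to the destination."""
    while True:
        tile = cb_out.get()
        if tile is SENTINEL:
            return
        sink.append(tile)


def run_pipeline(tiles, op, depth=2):
    """Run the three stages concurrently with buffers holding `depth` tiles."""
    cb_in = queue.Queue(maxsize=depth)
    cb_out = queue.Queue(maxsize=depth)
    sink = []
    stages = [
        threading.Thread(target=reader, args=(tiles, cb_in)),
        threading.Thread(target=compute, args=(cb_in, cb_out, op)),
        threading.Thread(target=writer, args=(cb_out, sink)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return sink


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3, 4], lambda x: x * 2))
```

The blocking `put` and `get` calls do the synchronization here automatically; the point of the summary above is that on the real hardware the programmer has to manage the equivalent handshakes explicitly.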

## 4. Fragmented Local On-Chip Memory

Wormhole relies heavily on keeping data inside the tiny 1.5 MB SRAM pools attached to each individual Tensix core to avoid going to the slower GDDR6 memory. [7]

* The Disadvantage: If an AI workload or matrix layer cannot be neatly diced up into tiny 32 × 32 mathematical tiles that fit inside that 1.5 MB window, the chip must constantly swap data back and forth.

* The Impact: For complex, non-standard neural networks or heavy legacy HPC workloads, managing this fragmented memory manually creates massive developer headaches and can tank execution speeds. [7]
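
A quick way to see when a matrix stops fitting is to count tiles. This is a minimal sketch, assuming the "1.5 MB" figure means 1.5 MiB and that all of it is available for data (in practice some of that SRAM also holds code and buffers). The helper names are hypothetical and the element size is a parameter.

```python
import math

TILE_DIM = 32  # tile edge length, from the 32 x 32 tiles described above

# Assumption: 1.5 MB read as 1.5 MiB, with the whole pool free for tile data.
SRAM_BYTES = int(1.5 * 1024 * 1024)


def tile_bytes(bytes_per_element):
    """Size of one 32 x 32 tile in bytes."""
    return TILE_DIM * TILE_DIM * bytes_per_element


def tiles_for_matrix(rows, cols):
    """Number of tiles needed, padding partial edge tiles up to a full tile."""
    return math.ceil(rows / TILE_DIM) * math.ceil(cols / TILE_DIM)


def fits_in_sram(rows, cols, bytes_per_element, sram_bytes=SRAM_BYTES):
    """True if the whole tiled matrix fits in one core's SRAM pool."""
    return tiles_for_matrix(rows, cols) * tile_bytes(bytes_per_element) <= sram_bytes


if __name__ == "__main__":
    for shape in [(512, 512), (4096, 4096)]:
        verdict = "fits" if fits_in_sram(*shape, bytes_per_element=2) else "does not fit"
        print(f"{shape[0]} x {shape[1]} at 2 bytes per element: {verdict}")
```

Anything that does not fit has to be split across cores or streamed from GDDR6, which is the swapping cost the bullet points above describe.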

## 5. Rapid Generational Obsolescence

Wormhole was built on an older 12nm process node. Because Tenstorrent moves fast, Wormhole is already facing an internal successor. The newer Blackhole architecture (built on a much denser 4nm process) fixes many of Wormhole's raw compute limits and stands as Tenstorrent's flagship focus for standalone computing. This means Wormhole is quickly shifting into a developer/experimental budget platform rather than a cutting-edge production standard. [1, 4, 10, 11]

------------------------------

[1] [https://www.spheron.network](https://www.spheron.network/blog/tenstorrent-vs-nvidia-open-source-ai-hardware/)
[2] [https://www.reddit.com](https://www.reddit.com/r/hardware/comments/aidw7l/hbm_vs_gddr6/)
[3] [https://www.reddit.com](https://www.reddit.com/r/Amd/comments/bbqld9/gddr6_hbm2_tradeoffs/)
[4] [https://www.youtube.com](https://www.youtube.com/watch?v=6NzriALRnz8&t=2)
[5] [https://arxiv.org](https://arxiv.org/html/2605.02744v1)
[6] [https://www.youtube.com](https://www.youtube.com/watch?v=6IZEo5XmHxM&t=3)
[7] [https://github.com](https://github.com/tenstorrent/tt-metal/blob/main/METALIUM_GUIDE.md)
[8] [https://www.eetimes.com](https://www.eetimes.com/tenstorrent-engineers-talk-open-sourced-bare-metal-stack/)
[9] [https://clehaxze.tw](https://clehaxze.tw/gemlog/2025/04-21-programming-tensotrrent-processors.gmi)
[10] [https://www.youtube.com](https://www.youtube.com/watch?v=Fa2GS86_cx8&t=10)
[11] [https://www.youtube.com](https://www.youtube.com/watch?v=Id3enIOAY2Q&t=47)
 
The question for me is which workload they are targeting. The fastest $$ growth area right now is data-center-scale inference and agentic AI, which seemingly requires a bunch of specialized hardware optimizations for token production and agent dispatch and execution, all tightly integrated. The fastest, most efficient token production requires FP4 (NVFP4, MXFP4) math, which Tenstorrent has seemingly removed from Wormhole and Blackhole (it was there in Grayskull).

The real measure will be how well their big-iron does on the new generation of data center AI agent benchmarks from Artificial Analysis and SemiAnalysis.

 
This is a thoughtful article, but there's a giant difference between 1970s IBM and today's Nvidia: the culture. 1970s IBM was the apex of bureaucracy, whereas Nvidia under Jensen Huang is still very nimble given its size. (The IBM PC in 1981 was a great example of IBM breaking its own culture to do something innovative.)

Another big difference is consumer expectations. In the 1970s, there were NO end-user expectations of performance or use cases for a home computer. That meant relatively modest CPU efforts like the 6502 could succeed (and still be sold in the millions more than a decade later). The initial competition for the early 8-bit micros was pen and paper, not the mainframes.

I applaud what Jim Keller is doing, but I think this is a misread on the effect of his products on Nvidia.
 
I think it's apples-oranges: IBM was never really on top of the semiconductor industry the way NEC, Motorola and later Intel were. Nvidia clearly is on top today.

What comes next is a good thing to think about though. You can see the progression from fab-heavy to fab-light to fabless. So based on that path of travel, will the next dominant chip company be even more capital-light than fabless? Like an IP company? ARM perhaps?
 
The question for me is which workload they are targeting. The fastest $$ growth area right now is data-center-scale inference and agentic AI, which seemingly requires a bunch of specialized hardware optimizations for token production and agent dispatch and execution, all tightly integrated. The fastest, most efficient token production requires FP4 (NVFP4, MXFP4) math, which Tenstorrent has seemingly removed from Wormhole and Blackhole (it was there in Grayskull).

The real measure will be how well their big-iron does on the new generation of data center AI agent benchmarks from Artificial Analysis and SemiAnalysis.

They have 4x 800G ports on the Blackhole card (the silicon supports 10x 400G). I am surprised they are not selling it as a DPU. I hope the next generations will go in that direction. Call it AI-native infrastructure for enterprise or something like that.
 
What comes next is a good thing to think about though. You can see the progression from fab-heavy to fab-light to fabless. So based on that path of travel, will the next dominant chip company be even more capital-light than fabless? Like an IP company? ARM perhaps?
The trajectory of leading edge silicon volume drivers had been PCs (400M/year), which then moved to smartphones (1.2B/yr). But now the money is flowing to data center inference which pulls through a whole range of leading edge silicon - CPU servers, GPU/AI accelerators, high-end connectivity/switching, as well as huge volumes of memory and packaging. And like the IBM mainframe days, the demand and greatest efficiency for the "smartest models" seems to lie on the server / data center side, at least for a while. We might see a world in the future where the core AI bounces back to the client side, but that will likely rely on stabilization of "good enough" AI agentic systems that can live on the client side for select "killer apps".
 