Will the Cerebras four trillion transistor chip change the AI game?

Arthur Hanson

Well-known member
Cerebras maintains that having everything on one giant chip saves power and provides superior results to a board full of separate chips. Any thoughts or comments on this are appreciated.

In my books, Cerebras is the real deal - wafer-scale integration gives them some fundamental strengths that translate into huge wins when it comes to hosting large-scale Gen AI models:
* model speed - wafer-scale on-chip memory delivers huge memory bandwidth, and thus super speedy models within a single slot / cabinet (rough arithmetic sketch after this list).
* lower power - wafer-scale also offers incredible on-chip interconnect density. Far shorter signaling wires mean lower power consumption within a slot / cabinet.

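A quick back-of-the-envelope on the speed point: single-stream decode is usually memory-bandwidth bound, so tokens/s is roughly bandwidth divided by the bytes streamed per token. Here is a minimal Python sketch of that estimate - every number in it is purely illustrative, not a vendor spec:

# Rough batch-1 decode estimate: generation is typically memory-bandwidth bound,
# so the upper bound on tokens/s is (memory bandwidth) / (bytes read per token).
# All numbers below are illustrative assumptions, not vendor specifications.

def tokens_per_second(params_billion: float, bytes_per_param: float, bandwidth_tb_s: float) -> float:
    """Upper-bound decode rate when every parameter is streamed once per token."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / bytes_per_token

# A 70B-parameter model in 16-bit weights:
model_b = 70
bpp = 2.0

# Hypothetical HBM-class device bandwidth vs. hypothetical wafer-scale on-chip SRAM bandwidth.
for name, bw_tb_s in [("HBM-class device", 3.0), ("wafer-scale SRAM", 1000.0)]:
    print(f"{name}: ~{tokens_per_second(model_b, bpp, bw_tb_s):,.0f} tokens/s upper bound")

The gap between the two bandwidth assumptions is the whole story of the single-slot speed advantage; it says nothing yet about multi-user capacity, which is where the next point comes in.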
But those advantages might be less clear when looking at large-scale multi-user performance, capacity, and cost/TCO, especially when going beyond a single slot / chassis to rack-level and multi-rack results. In that world, much more depends on how well the hardware and software share the compute resources within the entire rack across many contexts and users. From what I can tell, the things DeepSeek and others have done around disaggregation across hardware and KV caching greatly increase hardware capacity and speed for heterogeneous hardware - AI processors plus HBM. That's why I find the rack-level LLM benchmarking in the second half of this article so interesting - TCO per processor and power per token are far better at the rack level than at the slot level, thanks to these kinds of rack-level optimizations.
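For context on the KV-caching point, here is a toy sketch of the mechanism in Python with NumPy - the class name, head dimension, and toy projections are all illustrative, not DeepSeek's actual implementation. Each user's past keys/values are cached so a decode step only processes the new token, but the cache itself consumes memory that grows with users times context length, which is exactly the capacity pressure that rack-level disaggregation tries to manage:

import numpy as np

# Minimal single-head attention decode step with a per-user KV cache.
# Caching K/V for past tokens means each new token needs only one new
# projection instead of re-running attention over the whole prefix,
# but the cache consumes memory that scales with users x context length.

D = 64  # head dimension (illustrative)

class KVCache:
    def __init__(self):
        self.k = np.zeros((0, D))
        self.v = np.zeros((0, D))

    def append(self, k_new, v_new):
        self.k = np.vstack([self.k, k_new])
        self.v = np.vstack([self.v, v_new])

    def attend(self, q):
        # Attention over all cached tokens: softmax(q . K^T) V
        scores = self.k @ q / np.sqrt(D)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ self.v

# One decode step at a time: project the new token, extend the cache, attend.
rng = np.random.default_rng(0)
cache = KVCache()
for step in range(4):
    x = rng.standard_normal(D)            # stand-in for the new token's hidden state
    cache.append(x[None, :], x[None, :])  # toy projections: K = V = x
    out = cache.attend(x)
    print(f"step {step}: cache holds {cache.k.shape[0]} tokens, output norm {np.linalg.norm(out):.2f}")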