They talk a good game.
Meet Sohu, the fastest AI chip of all time.
With over 500,000 tokens per second of Llama 70B throughput, Sohu lets you build products that are impossible on GPUs.
Sohu is the world’s first specialized chip (ASIC) for transformers (the “T” in ChatGPT). Because we burn the transformer architecture into the chip, it can’t run most traditional AI models. But for generative AI models, like ChatGPT (text), SD3 (images), and Sora (video), Sohu delivers unparalleled performance.
One Sohu server runs over 500,000 Llama 70B tokens per second: >20x more than an H100 server (23,000 tokens/sec), and >10x more than a B200 server.
We recently raised $120M from Primary Venture Partners and Positive Sum, with participation from Two Sigma Ventures, Skybox Datacenters, Hummingbird Ventures, Oceans, Fundomo, Velvet Sea Ventures, Fontinalis Partners, Galaxy, Earthshot Ventures, Max Ventures and Lightscape Partners.
We’re grateful for the support of industry leaders, including Peter Thiel, David Siegel, Thomas Dohmke, Jason Warner, Amjad Masad, Kyle Vogt, Stanley Freeman Druckenmiller, and many more.
We’re on track for one of the fastest chip launches in history:
- Top hardware engineers and AI researchers have left every major AI chip project to join us.
- We’ve partnered directly with TSMC on their 4nm process. We’ve secured HBM and server supply from top vendors and can quickly ramp our first year of production.
- Our early customers have reserved tens of millions of dollars of our hardware.
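Taking those figures at face value, here is a minimal sanity check of the claimed multiple. It assumes nothing beyond the two numbers stated in the post; the B200 comparison is left out because no absolute B200 figure is given.

```python
# Sanity check of the throughput multiple claimed in the post.
# Both figures are taken directly from the announcement; nothing else is assumed.
sohu_tps = 500_000   # claimed Llama 70B tokens/sec for one Sohu server
h100_tps = 23_000    # H100 server figure cited in the post

print(f"Sohu vs H100: {sohu_tps / h100_tps:.1f}x")  # ~21.7x, consistent with the ">20x" claim
```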