They talk a good game.
Meet Sohu, the fastest AI chip of all time.
With over 500,000 tokens per second of Llama 70B throughput, Sohu lets you build products that are impossible on GPUs.
Sohu is the world’s first specialized chip (ASIC) for transformers (the “T” in ChatGPT). Because we burn the transformer architecture into the chip, it can’t run most traditional AI models. But for generative AI models, like ChatGPT (text), SD3 (images), and Sora (video), Sohu delivers unparalleled performance.
One Sohu server runs over 500,000 Llama 70B tokens per second: >20x more than an H100 server (23,000 tokens/sec), and >10x more than a B200 server.
We recently raised $120M from Primary Venture Partners and Positive Sum, with participation from Two Sigma Ventures, Skybox Datacenters, Hummingbird Ventures, Oceans, Fundomo, Velvet Sea Ventures, Fontinalis Partners, Galaxy, Earthshot Ventures, Max Ventures and Lightscape Partners.
We’re grateful for the support of industry leaders, including Peter Thiel, David Siegel, Thomas Dohmke, Jason Warner, Amjad Masad, Kyle Vogt, Stanley Freeman Druckenmiller, and many more.
We’re on track for one of the fastest chip launches in history:
- Top hardware engineers and AI researchers have left every major AI chip project to join us.
- We’ve partnered directly with TSMC on their 4nm process. We’ve secured HBM and server supply from top vendors and can quickly ramp our first year of production.
- Our early customers have reserved tens of millions of dollars of our hardware.
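Taking those figures at face value, here is a minimal sanity check of the claimed multiple. It assumes nothing beyond the two numbers stated in the post; the B200 comparison is left out because no absolute B200 figure is given.

```python
# Sanity check of the throughput multiple claimed in the post.
# Both figures are taken directly from the announcement; nothing else is assumed.
sohu_tps = 500_000   # claimed Llama 70B tokens/sec for one Sohu server
h100_tps = 23_000    # H100 server figure cited in the post

print(f"Sohu vs H100: {sohu_tps / h100_tps:.1f}x")  # ~21.7x, consistent with the ">20x" claim
```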