Intel Says It Won’t Compete With NVIDIA In AI Market, Shifts Focus Towards Bringing Cost-Effective AI Solutions With Gaudi 3

fansink

Well-known member
It will be interesting to see how the AI situation pans out for Intel, since the firm is in desperate need of a lifeline given its current financial condition.

Intel has given up on the AI race with NVIDIA and is now pursuing cost-effective solutions with its new Gaudi 3 AI offerings.

Intel's Gaudi 3 AI GPUs Will Feature Industry-Leading Performance-Per-Dollar Value, But Won't Compete With NVIDIA As Blue Team Shifts Target Market
Well, it seems like Team Blue has finally realized that competing with NVIDIA in the raw "compute power" race isn't a viable path to a sustainable business.

Instead, the firm is now tapping into a relatively unpopulated AI business segment: cost-efficient deployment of AI accelerators, which is likely to appeal to a large part of the industry. Based on a report by CRN, Intel is pitching its newest Gaudi 3 AI GPUs as a value offering, with some of the best price-to-performance ratios available on the market.

Nanduri said while Gaudi 3 is “not catching up” to Nvidia’s latest GPU from a head-to-head performance perspective, the accelerator chip is well-suited to enable economical systems for running task-based models and open-source models on behalf of enterprises, which is where the company has “traditional strengths.”

Intel claims that its Gaudi 3 lineup offers performance equivalent to NVIDIA's popular H100 AI accelerator, particularly in inferencing workloads, which have seen a massive rise following the debut of "reasoning-focused" LLMs. In terms of actual figures, Intel claims that the Gaudi 3 AI accelerator provides 80% better performance-per-dollar than NVIDIA's H100, and when benchmarking on Llama-2, the performance-per-dollar advantage rises to 2x, which is indeed impressive.
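For anyone who wants to sanity-check a claim like this, the underlying arithmetic is simple; here is a minimal sketch, with the throughput and price figures used purely as hypothetical placeholders rather than Intel's or NVIDIA's published numbers.

Code:
# Minimal sketch of a performance-per-dollar comparison. The throughput and
# price figures are hypothetical placeholders, not published numbers.

def perf_per_dollar(tokens_per_second: float, accelerator_price_usd: float) -> float:
    """Inference throughput delivered per dollar of accelerator cost."""
    return tokens_per_second / accelerator_price_usd

# Hypothetical inputs for illustration only.
h100 = perf_per_dollar(tokens_per_second=3_000, accelerator_price_usd=30_000)
gaudi3 = perf_per_dollar(tokens_per_second=2_500, accelerator_price_usd=14_000)

# Relative advantage: values above 1.0 mean better value than the H100 baseline.
print(f"Gaudi 3 perf/$ relative to H100: {gaudi3 / h100:.2f}x")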

The firm is solely pitching the new AI lineup as the best solution for small-scale startups and individuals looking to acquire AI computing power. However, when tested in floating-point operations, the Gaudi 3 AI GPUs do fall short of NVIDIA's alternatives, suggesting that hardcore AI performance isn't Intel's cup of tea for now.

Team Blue has realized that they can't compete with NVIDIA when it comes to hardware dominance. Surprisingly, the firm claims that they aren't looking to capitalize on demand from mainstream market players. In the longer run, they believe that smaller LLMs will see wider adoption once the AI frenzy and the craze behind large-scale data centers fade away.

The world we are starting to see is people are questioning the [return on investment], the cost, the power and everything else. This is where—I don’t have a crystal ball—but the way we think about it is, do you want one giant model that knows it all.

We feel like where we are with the product, the customers that are engaged, the problems we're solving, that's our swim lane. The bet is that the market will open up in that space, and there'll be a bunch of people building their own inferencing solutions.

- Intel's Anil Nanduri via CRN

Intel's Gaudi 3 AI solutions have seen decent adoption from the industry, mainly by IBM Cloud, Hewlett Packard Enterprise, and even Dell in their respective data center products. We can't really say that Gaudi 3 isn't seeing the market spotlight, but Team Blue's bets on the AI market haven't worked out too well: the company's former CEO Pat Gelsinger previously dismissed NVIDIA's CUDA moat, yet CUDA has turned out to be the dominant compute stack.

 
I think the whole AI world is moving past the "cool new models" phase into the solutions phase. That's why you see AMD buying integrated solutions providers like Silo.ai.


And NVIDIA offering customizable advanced models and GenAI enterprise solutions infrastructure far beyond CUDA (NIM and NeMo).


Not sure how Intel makes a go of it, except at the cost-sensitive periphery. Their problem is they spent too much time and energy trying to cook up/buy too many flavors of AI solutions, unfortunately all centered around Xeon.
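One concrete illustration of the "far beyond CUDA" point: NIM microservices are typically consumed through an OpenAI-compatible HTTP endpoint, so the client code never touches CUDA at all. A minimal sketch follows, assuming a locally deployed NIM container; the base URL and model identifier are placeholders for whatever is actually running.

Code:
# Minimal sketch: querying a self-hosted NVIDIA NIM microservice through its
# OpenAI-compatible endpoint. The base_url and model name are placeholders
# for whatever NIM container is actually deployed; no CUDA code appears here.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM deployment
    api_key="not-needed-locally",         # local NIMs typically ignore the key
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",      # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
    max_tokens=200,
)
print(response.choices[0].message.content)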
 
That is based on Llama 3, which can be run on Gaudi 3. If you distill and quantize it, it can be run on an AI PC. Using the model has very little to do with CUDA; a model is simply weights.
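As a minimal sketch of that "distill, quantize, and run it anywhere" point, assuming a GGUF-quantized Llama checkpoint is already on disk (the file path is a placeholder), llama-cpp-python will run it on a laptop CPU or iGPU with no CUDA dependency:

Code:
# Minimal sketch: running a quantized (GGUF) Llama checkpoint locally with
# llama-cpp-python. The model path is a placeholder; any 4-bit quantized
# checkpoint that fits in the machine's RAM will do. No CUDA is required.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3-8b-instruct.Q4_K_M.gguf",  # assumed local file
    n_ctx=4096,    # context window size
    n_threads=8,   # CPU threads to use
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantization in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])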
 
Yes, and no. You're kind of missing the end intent - it's not just about running a model. It has indeed gotten relatively easy to transfer, run, and tune performance of existing models on specific hardware without needing CUDA. Frameworks like PyTorch and compilers like Triton make "porting" models between hardware much easier.

But a model is just a model, not a full solution. If you're an enterprise and you want to build a RAG-based system using your in-house source data, and you want to really boost accuracy for your specific needs via human alignment and hallucination reduction, you're going to need a truly open model like NVIDIA is offering, plus all the new kinds of flexible GenAI system-building infrastructure they have delivered over the past couple of years (NeMo and NIM), or you're going to need a solutions-provider consulting team like the one AMD just bought. Getting a model running, then benchmarking, on your particular hardware is just step 1. And quite honestly, inference performance and latency today mostly depend on whether a model fits into the GenAI processor's local memory or has to be spread across multiple substrates. But linking to corporate data, cleaning and filtering that data, building the entire system, and tuning for accuracy and safety are much more challenging.
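To make the "a model is just a model, not a full solution" point concrete, here is a minimal retrieval sketch with TF-IDF similarity standing in for a real embedding model; the documents and question are invented placeholders, and a production system would still need all the data cleaning, access control, and accuracy/safety tuning described above.

Code:
# Minimal RAG retrieval sketch: index a few in-house documents, retrieve the
# most relevant one for a question, and assemble a grounded prompt for
# whatever LLM backend is available. TF-IDF stands in for a real embedding
# model; the documents and question are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Refunds must be issued within 30 days of purchase per policy FIN-12.",
    "Gaudi 3 nodes in the benchmarking lab are reserved for inference tests.",
    "All customer PII must be masked before it enters the analytics pipeline.",
]
question = "How long do we have to issue a refund?"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([question])

# Rank documents by similarity to the question and keep the best match.
scores = cosine_similarity(query_vector, doc_vectors)[0]
best_doc = documents[scores.argmax()]

# The retrieved context is what grounds the LLM's answer; the generation call
# itself depends on the serving stack (NIM, OPEA, llama.cpp, etc.).
prompt = f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}"
print(prompt)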
 
Intel supports:
I am sure it can use Nvidia's version of Llama 3.

For enterprises, I don't think there is one solution that fits all. The main difficulty for enterprises is the variety of regulations they have to comply with.

Also, not everyone needs a large model. Andrej Karpathy once said a 1B model might be sufficient for most cases. You don't need a Ram 1500 if a Camry can do the same job (a rough memory-sizing sketch is below).

"Marc Andreessen says AI may be a race to the bottom, where selling intelligence turns out to be like selling rice, because it could turn out that anyone can make an LLM and no-one has any moat"
 
What about neuromorphic computing? A quick Google search shows that it is possible to run an LLM on such a chip efficiently; link. Competing head-to-head with Nvidia is probably futile, but offering an alternative that is much more cost-effective for the user might be worth it. Currently Intel only offers Loihi 2, which is a tiny experimental chip, but offering something more advanced might turn some heads. I wonder if they plan a successor to Loihi 2.
 
Breaking into the AI hardware space dominated by Nvidia is no easy feat, and it's going to require a long-term approach:
- Targeting Gaudi 3 for enterprise solutions
- Strengthening software support
- Falcon Shores launch by 2025
- Data center adoption of Falcon Shores
- etc.
 
I am sure it can use Nvidia's version of Llama 3.
OPEA is a basic reference implementation, not a productized GenAI platform. Andrej's right - not everyone needs a big model. But enterprises need big systems that can flexibly scale up/out to deliver model-plus services to many, many users or customers. Don't expect that to be delivered via client-side models and AI PCs.
 
I would suggest Intel set and hit a goal of 5% share anywhere with its AI... when you have 1% share after years of work, it is not really productive to claim a new focus and target. As I said before, it is not clear Intel is even a top-3 company in DCAI. Let's get some revenue... then we can focus. Just an opinion.
 
OPEA is a basic reference implementation, not a productized GenAI platform. Andrei’s right - not everyone needs a big model. But enterprises need big systems that can flexibly scale up / out to deliver model+ services to many, many users or customers. Don‘t expect that to be delivered via client-side models and AI-PCs.
I think Intel's solution (via OPEA or not) can scale. I briefly had a look at this course:

 