Intel Says It Won’t Compete With NVIDIA In AI Market, Shifts Focus Towards Bringing Cost-Effective AI Solutions With Gaudi 3

fansink

Well-known member
It will be interesting to see how the AI situation pans out for Intel, since the firm is in desperate need of a lifeline given its current financial condition.

Intel has given up on the AI race with NVIDIA and is now pursuing cost-effective solutions with its new Gaudi 3 AI offerings.

Intel's Gaudi 3 AI GPUs Will Feature Industry-Leading Performance-Per-Dollar Value, But Won't Compete With NVIDIA As Blue Team Shifts Target Market
Well, it seems like Team Blue has finally realized that competing with NVIDIA in the raw "compute power" race isn't a viable path to a sustainable business.

Instead, the firm is now tapping into a relatively unpopulated AI business segment: cost-efficient deployment of AI accelerators, which is likely to appeal to a large part of the industry. Based on a report by CRN, Intel is pitching its newest Gaudi 3 AI GPUs as a value offering, with some of the best price-to-performance ratios available on the market.

Nanduri said while Gaudi 3 is “not catching up” to Nvidia’s latest GPU from a head-to-head performance perspective, the accelerator chip is well-suited to enable economical systems for running task-based models and open-source models on behalf of enterprises, which is where the company has “traditional strengths.”

Intel claims that its Gaudi 3 lineup offers performance equivalent to NVIDIA's popular H100 AI accelerator, particularly in inferencing workloads, which have seen a massive rise following the debut of "reasoning-focused" LLMs. In terms of actual figures, Intel claims that the Gaudi 3 AI accelerator provides 80% better performance-per-dollar than NVIDIA's H100, and when benchmarking on Llama-2, the performance-per-dollar advantage rises to 2x, which is indeed impressive.
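For anyone who wants to sanity-check a claim like this, the underlying arithmetic is simple; here is a minimal sketch, with the throughput and price figures used purely as hypothetical placeholders rather than Intel's or NVIDIA's published numbers.

Code:
# Minimal sketch of a performance-per-dollar comparison. The throughput and
# price figures are hypothetical placeholders, not published numbers.

def perf_per_dollar(tokens_per_second: float, accelerator_price_usd: float) -> float:
    """Inference throughput delivered per dollar of accelerator cost."""
    return tokens_per_second / accelerator_price_usd

# Hypothetical inputs for illustration only.
h100 = perf_per_dollar(tokens_per_second=3_000, accelerator_price_usd=30_000)
gaudi3 = perf_per_dollar(tokens_per_second=2_500, accelerator_price_usd=14_000)

# Relative advantage: values above 1.0 mean better value than the H100 baseline.
print(f"Gaudi 3 perf/$ relative to H100: {gaudi3 / h100:.2f}x")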

The firm is solely pitching the new AI lineup as the best solution for small-scale startups and individuals looking to acquire AI computing power. However, when tested in floating-point operations, the Gaudi 3 AI GPUs do fall short of NVIDIA's alternatives, suggesting that hardcore AI performance isn't Intel's cup of tea for now.

Team Blue has realized that they can't compete with NVIDIA when it comes to hardware dominance. Surprisingly, the firm claims that they aren't looking to capitalize on demand from mainstream market players. In the longer run, they believe that smaller LLMs will see wider adoption once the AI frenzy and the craze behind large-scale data centers fade away.

The world we are starting to see is people are questioning the [return on investment], the cost, the power and everything else. This is where—I don’t have a crystal ball—but the way we think about it is, do you want one giant model that knows it all.

We feel like where we are with the product, the customers that are engaged, the problems we're solving, that's our swim lane. The bet is that the market will open up in that space, and there'll be a bunch of people building their own inferencing solutions.

- Intel's Anil Nanduri via CRN

Intel's Gaudi 3 AI solutions have seen decent adoption from the industry, mainly by IBM Cloud, Hewlett Packard Enterprise, and even Dell in their respective data center products. We can't really say that Gaudi 3 isn't seeing the market spotlight, but Team Blue's bets on the AI market haven't worked out too well: the company's former CEO Pat Gelsinger previously dismissed NVIDIA's CUDA moat, yet CUDA has turned out to be the dominant compute stack.

 
I think the whole AI world is moving past the "cool new models" phase into the solutions phase. That's why you see AMD buying integrated solutions providers like Silo.ai.


And NVIDIA offering customizable advanced models and GenAI enterprise solutions infrastructure far beyond CUDA (NIM and NeMo).


Not sure how Intel makes a go of it, except at the cost-sensitive periphery. Their problem is they spent too much time and energy trying to cook up/buy too many flavors of AI solutions, unfortunately all centered around Xeon.
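One concrete illustration of the "far beyond CUDA" point: NIM microservices are typically consumed through an OpenAI-compatible HTTP endpoint, so the client code never touches CUDA at all. A minimal sketch follows, assuming a locally deployed NIM container; the base URL and model identifier are placeholders for whatever is actually running.

Code:
# Minimal sketch: querying a self-hosted NVIDIA NIM microservice through its
# OpenAI-compatible endpoint. The base_url and model name are placeholders
# for whatever NIM container is actually deployed; no CUDA code appears here.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM deployment
    api_key="not-needed-locally",         # local NIMs typically ignore the key
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",      # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
    max_tokens=200,
)
print(response.choices[0].message.content)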
 
That is based on Llama 3, which can be run on Gaudi 3. If you distill and quantize it, it can be run on an AI PC. Using the model has very little to do with CUDA; a model is simply weights.
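As a minimal sketch of that "distill, quantize, and run it anywhere" point, assuming a GGUF-quantized Llama checkpoint is already on disk (the file path is a placeholder), llama-cpp-python will run it on a laptop CPU or iGPU with no CUDA dependency:

Code:
# Minimal sketch: running a quantized (GGUF) Llama checkpoint locally with
# llama-cpp-python. The model path is a placeholder; any 4-bit quantized
# checkpoint that fits in the machine's RAM will do. No CUDA is required.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3-8b-instruct.Q4_K_M.gguf",  # assumed local file
    n_ctx=4096,    # context window size
    n_threads=8,   # CPU threads to use
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantization in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])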
 
Yes, and no. You're kind of missing the end intent - it's not just about running a model. It has indeed gotten relatively easy to transfer, run, and tune performance of existing models on specific hardware without needing CUDA. Frameworks like PyTorch and compilers like Triton make "porting" models between hardware much easier.

But a model is just a model, not a full solution. If you're an enterprise and you want to build a RAG-based system using your in-house source data, and you want to really boost accuracy for your specific needs via human alignment and hallucination reduction, you're going to need a truly open model like NVIDIA is offering, plus all the new kinds of flexible GenAI system-building infrastructure they have delivered over the past couple of years (NeMo and NIM), or you're going to need a solutions-provider consulting team like the one AMD just bought. Getting a model running, then benchmarking, on your particular hardware is just step 1. And quite honestly, inference performance and latency today mostly depend on whether a model fits into the GenAI processor's local memory or has to be spread across multiple substrates. But linking to corporate data, cleaning and filtering that data, building the entire system, and tuning for accuracy and safety are much more challenging.
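To make the "a model is just a model, not a full solution" point concrete, here is a minimal retrieval sketch with TF-IDF similarity standing in for a real embedding model; the documents and question are invented placeholders, and a production system would still need all the data cleaning, access control, and accuracy/safety tuning described above.

Code:
# Minimal RAG retrieval sketch: index a few in-house documents, retrieve the
# most relevant one for a question, and assemble a grounded prompt for
# whatever LLM backend is available. TF-IDF stands in for a real embedding
# model; the documents and question are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Refunds must be issued within 30 days of purchase per policy FIN-12.",
    "Gaudi 3 nodes in the benchmarking lab are reserved for inference tests.",
    "All customer PII must be masked before it enters the analytics pipeline.",
]
question = "How long do we have to issue a refund?"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([question])

# Rank documents by similarity to the question and keep the best match.
scores = cosine_similarity(query_vector, doc_vectors)[0]
best_doc = documents[scores.argmax()]

# The retrieved context is what grounds the LLM's answer; the generation call
# itself depends on the serving stack (NIM, OPEA, llama.cpp, etc.).
prompt = f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}"
print(prompt)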
 
Intel supports:
I am sure it can use Nvidia's version of Llama 3.

For enterprises, I don't think there is one solution that fits all. The main difficulty for enterprises is the variety of regulations they have to comply with.

Also, not everyone needs a large model. Andrej Karpathy once said a 1B model might be sufficient for most cases. You don't need a Ram 1500 if a Camry can do the same job (a rough memory-sizing sketch is below).

"Marc Andreessen says AI may be a race to the bottom, where selling intelligence turns out to be like selling rice, because it could turn out that anyone can make an LLM and no-one has any moat"
 
What about neuromorphic computing? A quick Google search shows that it is possible to run an LLM on such a chip efficiently; link. Competing head-to-head with Nvidia is probably futile, but offering an alternative that is much more cost-effective for the user might be worth it. Currently Intel only offers Loihi 2, which is a tiny experimental chip, but offering something more advanced might turn some heads. I wonder if they plan a successor to Loihi 2.
 
Breaking into the AI hardware space dominated by Nvidia is no easy feat, and it's going to require a long-term approach:
- Targeting Gaudi 3 for enterprise solutions
- Strengthening software support
- Falcon Shores launch by 2025
- Data center adoption of Falcon Shores
- etc.
 
I am sure it can use Nvidia's version of Llama 3.
OPEA is a basic reference implementation, not a productized GenAI platform. Andrej's right - not everyone needs a big model. But enterprises need big systems that can flexibly scale up/out to deliver model-plus services to many, many users or customers. Don't expect that to be delivered via client-side models and AI PCs.
 
I would suggest Intel set and hit a goal of 5% share anywhere with its AI... when you have 1% share after years of work, it is not really productive to claim a new focus and target. As I said before, it is not clear Intel is even a top-3 company in DCAI. Let's get some revenue... then we can focus. Just an opinion.
 
OPEA is a basic reference implementation, not a productized GenAI platform. Andrei’s right - not everyone needs a big model. But enterprises need big systems that can flexibly scale up / out to deliver model+ services to many, many users or customers. Don‘t expect that to be delivered via client-side models and AI-PCs.
I think Intel's solution (via OPEA or not) can scale. I briefly had a look at this course:

 