Behind the plot to break Nvidia’s grip on AI by targeting software

Daniel Nenni

SAN FRANCISCO, March 25 (Reuters) - Nvidia (NVDA.O) earned its $2.2 trillion market cap by producing artificial-intelligence chips that have become the lifeblood powering the new era of generative AI, relied on by developers from startups to Microsoft (MSFT.O), OpenAI and Google parent Alphabet (GOOGL.O).

Almost as important as its hardware is the company’s nearly 20 years' worth of computer code, which helps make competition with the company nearly impossible. More than 4 million developers worldwide rely on Nvidia's CUDA software platform to build AI and other apps.

Now a coalition of tech companies that includes Qualcomm (QCOM.O), Google and Intel (INTC.O) plans to loosen Nvidia’s chokehold by going after the chip giant’s secret weapon: the software that keeps developers tied to Nvidia chips. They are part of an expanding group of financiers and companies hacking away at Nvidia's dominance in AI.

"We're actually showing developers how you migrate out from an Nvidia platform," Vinesh Sukumar, Qualcomm's head of AI and machine learning, said in an interview with Reuters.

Starting with a piece of technology developed by Intel (INTC.O) called OneAPI, the UXL Foundation, a consortium of tech companies, plans to build a suite of software and tools that will be able to power multiple types of AI accelerator chips, executives involved with the group told Reuters. The open-source project aims to make computer code run on any machine, regardless of what chip and hardware powers it.
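
As a concrete illustration of what "run on any machine" means in practice, here is a minimal kernel in generic SYCL 2020, the open multi-vendor specification that OneAPI builds on. This is a sketch written for this post, not code from the UXL project; the point is that the same source compiles for whatever accelerator the runtime finds - an Intel, AMD or Nvidia GPU, or a plain CPU.

```cpp
// Generic SYCL 2020 vector add - the same code targets any supported device.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
    sycl::queue q;  // default selector: picks the best device available
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    const size_t n = 1024;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    {
        // Buffers manage host<->device data movement automatically.
        sycl::buffer<float> A(a.data(), sycl::range<1>(n));
        sycl::buffer<float> B(b.data(), sycl::range<1>(n));
        sycl::buffer<float> C(c.data(), sycl::range<1>(n));
        q.submit([&](sycl::handler& h) {
            sycl::accessor pa(A, h, sycl::read_only);
            sycl::accessor pb(B, h, sycl::read_only);
            sycl::accessor pc(C, h, sycl::write_only, sycl::no_init);
            h.parallel_for(sycl::range<1>(n),
                           [=](sycl::id<1> i) { pc[i] = pa[i] + pb[i]; });
        });
    }  // buffer destructors wait for the kernel and copy results back
    std::cout << "c[0] = " << c[0] << "\n";  // prints 3
    return 0;
}
```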

"It's about specifically - in the context of machine learning frameworks - how do we create an open ecosystem, and promote productivity and choice in hardware," Google's director and chief technologist of high-performance computing, Bill Hugo, told Reuters in an interview. Google is one of the founding members of UXL and helps determine the technical direction of the project, Hugo said.

UXL's technical steering committee is preparing to nail down technical specifications in the first half of this year. Engineers plan to refine the technical details to a "mature" state by the end of the year, executives said. These executives stressed the need to build a solid foundation to include contributions from multiple companies that can also be deployed on any chip or hardware.

Beyond the initial companies involved, UXL will court cloud-computing companies such as Amazon.com (AMZN.O) and Microsoft's Azure, as well as additional chipmakers.

Since its launch in September, UXL has already begun to receive technical contributions from third parties, including foundation members and outsiders keen on using the open-source technology, the executives involved said. Intel's OneAPI is already usable, and the second step is to create a standard programming model of computing designed for AI.

UXL plans to put its resources toward addressing the most pressing computing problems dominated by a few chipmakers, such as the latest AI apps and high-performance computing applications. Those early plans feed into the organization's longer-term goal of winning over a critical mass of developers to its platform. In the long run, UXL aims to support Nvidia hardware and code as well.

When asked about the open source and venture-funded software efforts to break Nvidia’s AI dominance, Nvidia executive Ian Buck said in a statement: "The world is getting accelerated. New ideas in accelerated computing are coming from all across the ecosystem, and that will help advance AI and the scope of what accelerated computing can achieve."

NEARLY 100 STARTUPS

The UXL Foundation's plans are one of many efforts to chip away at Nvidia's hold on the software that powers AI. Venture financiers and corporate dollars have poured more than $4 billion into 93 separate efforts, according to custom data compiled by PitchBook at Reuters’ request.

The interest in unseating Nvidia through a potential weakness in software has ramped up in the last year: startups aiming to poke holes in the company's leadership gobbled up just over $2 billion in 2023, compared with $580 million a year earlier, according to the data from PitchBook.

Success in the shadow of Nvidia's grip on AI data crunching is something few of the startups will achieve. Nvidia's CUDA is a compelling piece of software on paper, as it is full-featured and consistently growing through both Nvidia's contributions and the developer community's.

"But that's not what really matters," said Jay Goldberg, chief executive of D2D Advisory, a finance and strategy consulting firm. "What matters is the fact that people have been using CUDA for 15 years, they built code around it."

 
Given the AI software advances NVIDIA announced for enterprise developers last week, above the CUDA level, Intel and Qualcomm had to do something. When you don’t have the full goods you start a new standard. Sometimes that works, sometimes it doesn’t. AMD is notably absent from this effort.
 
I'm waiting for the government to step in to break up the Nvidia monopoly. They are doing Nvidia a favor by giving the illusion that there is an alternative. Hopefully it materializes, but consortiums are challenging in themselves.
 
Hopefully it materializes, but consortiums are challenging in themselves.
Absolutely. Generally these things are hard because of the dynamics of so many players. My experience tells me that winning with a "standard" takes three things.

* The players involved have to have "the goods" - technology pieces that can match the incumbent. Having Intel supply the base is both a blessing (a large chunk of well-targeted code and architecture) and a curse (how much is OneAPI tainted by Intel's view of what's needed?).

* All the players have to be able to assemble all the pieces into a cogent architecture that mostly levels the playing field. That means committing their best system/software architects to this project over their originally planned approach. That's a hard pill to swallow, and it can't be contracted out, since NVIDIA certainly had their best people on it.

* All the players have to adopt it as their mainstream platform, and they have to create critical mass. That means two things: 1) Each participant has to put their best product developers on using the standard as their primary platform - they can't have a second team building an alternate approach, and 2) They have to have the second-place player onboard and committed - I don't see this "consortium" reaching critical mass unless AMD is 100% in.

But this consortium also relieves NVIDIA of some of the "monopoly" claims - now there's credible industry competition, even though nothing real is available yet.
 
The UXL Foundation is the second I-Hate-Nvidia-Club to get started lately. The first was the Ultra Ethernet Consortium, which aims to displace Nvidia's InfiniBand implementation (the only viable one available). It's sort of amusing. The Ultra Ethernet Consortium is led by AMD, while UXL seems to be dominated by Intel. Intel is a member of both, AMD is not a member of UXL, as far as I can tell. Nvidia is a member of neither one. 😏

I can't get excited about either of these, especially UEC. (I wonder how UEC will get IEEE 802.1/3 and the IETF on board with their "enhancements".) UXL needs a strong GPU supplier to be a member, and doesn't have one, unless they think Intel will get competitive in GPUs. OneAPI's success so far seems to be in FPGAs, although I admit I don't watch it that closely. AMD appears wedded to PyTorch, which is easier to use than either OneAPI or CUDA, so I'm having trouble seeing AMD getting on board with UXL.
 
Absolutely. Generally these things are hard because of the dynamics of so many players. My experience tells me that winning with a "standard" takes three things.

* The players involved have to have "the goods" - technology pieces that can match the incumbent. Having Intel supply the base is both a blessing (a large chunk of well-targeted code and architecture) and a curse (how much is OneAPI tainted by Intel's view of what's needed?).

* All the players have to be able to assemble all the pieces into a cogent architecture that mostly levels the playing field. That means committing their best system/software architects to this project over their originally planned approach. That's a hard pill to swallow, and it can't be contracted out, since NVIDIA certainly had their best people on it.

* All the players have to adopt it as their mainstream platform, and they have to create critical mass. That means two things: 1) Each participant has to put their best product developers on using the standard as their primary platform - they can't have a second team building an alternate approach, and 2) They have to have the second-place player onboard and committed - I don't see this "consortium" reaching critical mass unless AMD is 100% in.

But this consortium also relieves NVIDIA of some of the "monopoly" claims - now there's credible industry competition, even though nothing real is available yet.

It's hard to believe Intel is a good fit for each principle/requirement you mentioned.
 
Absolutely. Generally these things are hard because of the dynamics of so many players. My experience tells me that winning with a "standard" takes three things.

* The players involved have to have "the goods" - technology pieces that can match the incumbent. Having Intel supply the base is both a blessing (a large chunk of well-targeted code and architecture) and a curse (how much is OneAPI tainted by Intel's view of what's needed?).
OneAPI is based on SYCL, which is a multi-vendor industry specification.


* All the players have to be able to assemble all the pieces into a cogent architecture that mostly levels the playing field. That means committing their best system/software architects to this project over their originally planned approach. That's a hard pill to swallow, and it can't be contracted out, since NVIDIA certainly had their best people on it.
OneAPI and CUDA are based on some similar concepts, such as C++ and manual memory management. CUDA supports C, C++, and Fortran, while OneAPI only appears to support Intel's open source Data Parallel C++ compiler. PyTorch is of course based on Python, so it uses dynamic runtime memory management. Being a dinosaur myself, I prefer manual memory management for the better performance.
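
For the curious, here is what that manual memory management looks like in SYCL's unified shared memory (USM) style - a minimal sketch of my own, not Intel sample code, with the rough CUDA equivalents noted in the comments:

```cpp
// SYCL 2020 unified shared memory: explicit allocate/copy/free, much like CUDA.
#include <sycl/sycl.hpp>
#include <vector>

int main() {
    sycl::queue q;  // whatever device the runtime selects
    const size_t n = 1 << 20;
    std::vector<float> host(n, 1.0f);

    float* dev = sycl::malloc_device<float>(n, q);           // ~ cudaMalloc
    q.memcpy(dev, host.data(), n * sizeof(float)).wait();    // ~ cudaMemcpy H2D
    q.parallel_for(sycl::range<1>(n),
                   [=](sycl::id<1> i) { dev[i] *= 2.0f; }).wait();
    q.memcpy(host.data(), dev, n * sizeof(float)).wait();    // ~ cudaMemcpy D2H
    sycl::free(dev, q);                                      // ~ cudaFree
    return 0;
}
```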

All of this techno-gibberish aside, neither PyTorch nor OneAPI has the richness, breadth, completeness and maturity of CUDA. CUDA also has zillions more trained users than any other alternative. I can see why AMD chose PyTorch, and they are definitely investing substantially to improve their implementation, but since CUDA is already a winner I can't see nuts-and-bolts CUDA programmers being willing to take PyTorch jobs.
* All the players have to adopt it as their mainstream platform, and they have to create critical mass. That means two things: 1) Each participant has to put their best product developers on using the standard as their primary platform - they can't have a second team building an alternate approach, and 2) They have to have the second-place player onboard and committed - I don't see this "consortium" reaching critical mass unless AMD is 100% in.
AMD clearly isn't.
But this consortium also relieves NVIDIA of some of the "monopoly" claims - now there's credible industry competition, even though nothing real is available yet.
That doesn't seem to bother Lina Khan's FTC. If she goes after Nvidia, I doubt UXL will sway her at all. I think what might save Nvidia is that GPUs are so deeply technical that she won't be able to drum up popular support, which she appears to crave more than lawful competition.
 
I'm waiting for the government to step in to break up the Nvidia monopoly. They are doing Nvidia a favor by giving the illusion that there is an alternative. Hopefully it materializes, but consortiums are challenging in themselves.
Why should they?

Creating a monopoly by fair competition isn't illegal, is it? Don't you actually need to prove anti-competitive behaviour or establish that Nvidia has actually done something wrong - other than just being very good at their job? Should TSMC be broken up for being too successful?
 
Why should they?

Creating a monopoly by fair competition isn't illegal, is it? Don't you actually need to prove anti-competitive behaviour or establish that Nvidia has actually done something wrong - other than just being very good at their job? Should TSMC be broken up for being too successful?

Because the US Government does stupid things. It doesn't really matter how the monopoly was created; what matters is if you act like one and piss off the government. I was being sarcastic when I originally said this, but I was thinking about onshoring semiconductor manufacturing. Isn't that an anti-monopoly move? Call it geopolitics or supply chain management, but it also seems a bit like anti-monopoly policy, especially now that the US Government is getting involved.
 
OneAPI and CUDA are based on some similar concepts, such as C++ and manual memory management. CUDA supports C, C++, and Fortran, while OneAPI only appears to support Intel's open source Data Parallel C++ compiler. PyTorch is of course based on Python, so it uses dynamic runtime memory management. Being a dinosaur myself, I prefer manual memory management for the better performance.
OneAPI, CUDA, and ROCm are intermediate levels. PyTorch and TensorFlow are modelling languages (though TF has its own modelling layer when compiling to TPUs, it uses CUDA or ROCm as intermediates too). No one builds models in intermediate languages anymore, though they do import vendor-specific libraries from them into Python when PyTorch can't figure out the compilation well enough.

If Modular or others succeed in building better graph compilers from PyTorch to the intermediate languages, in principle the need to import libraries specific to a target will disappear. The algorithms are already pretty high-level, generic mathematics, so leaving it to a compiler is attractive. Modular have a deep bench and their stuff is already pretty good. Some of the modelling houses are already using some of the Mojo code generation tools.

At the chip level it is mostly vector/array/tensor MACs, some non-linear transforms, packing and unpacking standardized formats, and moving data around. CUDA got its start because graphics GPUs were weird and some strange libraries were needed to make them effective for AI - but current competitors were built only to do AI, no strangeness needed. A lot of the useful transforms are things an APL programmer from 50 years ago would recognize, so it is questionable how much of a moat there really will be around "do big math fast" when all the archaic graphics clutter is out of the way. Certainly for inference chips there seem to be a lot of competent solutions in the market and plenty of economic incentive to optimize for them.
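
To make the "do big math fast" point concrete: the workhorse operation is just a multiply-accumulate loop. Below is a naive SYCL sketch (a toy for illustration only; production libraries add tiling, vectorization and hardware MAC/tensor instructions, which is where the real vendor tuning lives).

```cpp
// Naive matrix multiply - the tensor MAC at the heart of AI workloads.
#include <sycl/sycl.hpp>

// A, B, C are device-resident USM pointers to N x N row-major matrices.
void matmul(sycl::queue& q, const float* A, const float* B, float* C, size_t N) {
    q.parallel_for(sycl::range<2>(N, N), [=](sycl::id<2> idx) {
        const size_t i = idx[0], j = idx[1];
        float acc = 0.0f;
        for (size_t k = 0; k < N; ++k)
            acc += A[i * N + k] * B[k * N + j];  // multiply-accumulate
        C[i * N + j] = acc;
    }).wait();
}
```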
 
OneAPI, CUDA, and ROCm are intermediate levels. PyTorch and TensorFlow are modelling languages (though TF has its own modelling layer when compiling to TPUs, it uses CUDA or ROCm as intermediates too). No one builds models in intermediate languages anymore, though they do import vendor-specific libraries from them into Python when PyTorch can't figure out the compilation well enough.
I see what you're saying, but OneAPI doesn't really fit with CUDA and ROCm, because it is not GPU-specific like the other two. Intel is addressing the targeted-application problem with OneAPI libraries (e.g. DDN, VINO, etc.), and I agree that it doesn't offer a framework tool like PyTorch. They talk a lot about OneAPI supporting PyTorch on their website, but clarity about exactly what they're doing eludes me.

I'm not fully understanding your point about vendor-specific libraries. PyTorch already does that with Nvidia. Or are you making a different point?
If Modular or others succeed in building better graph compilers from PyTorch to the intermediate languages, in principle the need to import libraries specific to a target will disappear. The algorithms are already pretty high-level, generic mathematics, so leaving it to a compiler is attractive. Modular have a deep bench and their stuff is already pretty good. Some of the modelling houses are already using some of the Mojo code generation tools.
How do you think this will be the case with so many emerging AI processors which are not GPUs or TPUs?
At the chip level it is mostly vector/array/tensor MACs, some non-linear transforms, packing and unpacking standardized formats, and moving data around. CUDA got its start because graphics GPUs were weird and some strange libraries were needed to make them effective for AI - but current competitors were built only to do AI, no strangeness needed. A lot of the useful transforms are things an APL programmer from 50 years ago would recognize, so it is questionable how much of a moat there really will be around "do big math fast" when all the archaic graphics clutter is out of the way. Certainly for inference chips there seem to be a lot of competent solutions in the market and plenty of economic incentive to optimize for them.
"A lot of competent solutions" looks problematic for broad deployment, assuming they're all different at the chip level. I wonder how many alternatives the market will support? (I doubt more than two or three.)

APL was still in use in the late 1980s. :) I know someone who misses it.
 
Nvidia and TSMC are almost WINTEL, but…
They have market capitalization similar to WINTEL's, but what is different is that in those days we didn't have the infrastructure and business conditions, nor the companies, that make it so easy and competitive to do customized AI in-house today.

My prediction: Nvidia's leadership will be far more short-lived than WINTEL's time of dominance.
 
Why should they?

Creating a monopoly by fair competition isn't illegal, is it? Don't you actually need to prove anti-competitive behaviour or establish that Nvidia has actually done something wrong - other than just being very good at their job? Should TSMC be broken up for being too successful?

I think this is a new, 21st-century approach.

Several incumbent monopolies suddenly found out that while they were enjoying enormous profits and luxury vacations, something happened. A stupid company had actually been working stupidly for the past 20 years and built a product and ecosystem that people really want to buy, even at a price of $40,000 a piece. Let's just call that stupid company Nvidia (or some other name, if you feel that's not respectful).

The incumbent monopoly group came up with a game plan to save the innocent public from the potential damages that will be caused by Nvidia. The potential damages include things like lack of choice, price gouging, slow technology innovation, and anti-competitive behaviors. The incumbent monopoly group feel they know it first hand, because they have been there and done that, with excellent track records.

The incumbent monopoly group, along with several reputable and inspiring companies, formed the UXL Foundation Steering Committee.

BTW, the UXL Foundation is "hosted" by the Linux Foundation's Joint Development Foundation (JDF). When you pay the UXL membership fees, a certain portion of that will go to the Linux Foundation. Don't ask me the meanings of "hosted" and the relationship between the Linux Foundation, JDF, and UXL. I just copied those words from the UXL Foundation website.

The UXL Steering Committee is leading this open-source/open-standard AI effort to help define or develop software and specifications to run on non-Nvidia chip platforms.

The current members of the UXL Steering Committee are:

Some information about the member companies on the UXL Steering Committee:

Rod Burns, Chairperson of UXL Steering Committee, VP of Codeplay Software.
My comments: You may not know Rod Burns and Codeplay Software. That's OK, because Codeplay is a great UK software company. They are so great that Intel bought them in 2022.

Robert Cohn, Senior Principal Engineer, Intel
My comments: Is Intel a monopoly? I turned this question over to my trusted and AI-enabled Google Search. Here is the Google Search result:

"Yes, Intel is considered a monopoly because it is the largest chip manufacturer and has over 70% of the market share in PC processors. Intel has built a brand around a commodity, and in 1991, they convinced manufacturers to put the "Intel inside" logo in their advertising and on their products. This became the first trademark of the computer industry."

Penporn Koanantakool, Senior Software Engineer, Google
My comments: Is Google a monopoly? Again, I asked this important question of my trusted and AI-enabled Google Search. I know I am in good hands with Google. Here is the Google Search result:

"Google is not a monopoly, but it is part of an oligopoly, which is a market where a few companies control the entire market. Google is the leader in the search engine market, with over 90% of internet users using Google for their online searches. The US Department of Justice has filed a suit against Google, alleging that its control of the online search market violates antitrust regulations. The Justice Department claims that Google's exclusive agreements with phone makers, like Apple and Samsung, and web browsers, like Mozilla, allow Google to be the default search engine on most devices. The Justice Department alleges that this position has allowed Google to box out smaller rivals."

Andrew Wafaa, Senior Director Software Communities & Fellow, Arm
My comments: Arm is a great company. I have no bad impression of them.

Dr. Priyanka Sharma, Director of Software Engineering, Fujitsu Research
Masahiro Doteguchi, Senior Engineering Manager, Fujitsu
My comments: It must be an honor for them to work for a company that bears the same name as another great software company, also called Fujitsu. Between 1999 and 2015, over 900 British Post Office workers were convicted of theft, fraud and false accounting with the help of software provided by Fujitsu.

Reference: https://en.wikipedia.org/wiki/British_Post_Office_scandal

Dave Murray, AI Software Product Management, Imagination Technologies
My comments: Imagination Technologies is still there! I'm surprised. Actually, Imagination has a very reputable and strong backer. I learned it from Microsoft Copilot Search, powered by ChatGPT:

"Imagination Technologies is currently owned by Canyon Bridge Capital Partners, a private equity fund based in Beijing that is ultimately owned by the Chinese government1. The company, known for its semiconductor and software design, including PowerVR mobile graphics processors, was acquired by Canyon Bridge in November 2017".

Dr. Vinesh Sukumar, Head of AI/ML Product Management, Qualcomm
My comments: Is Qualcomm a monopoly? Microsoft Copilot gave me the following answer:

"Qualcomm has faced legal challenges regarding its business practices. A U.S. district court judge ruled that Qualcomm had wrongfully suppressed competitors in the wireless chip market and used its dominant position to force unnecessary licensing fees, thus acting as a monopoly1. However, the case is complex, and Qualcomm has sought to appeal the decision1. Additionally, opinions on the matter can vary, and different regulatory bodies may have their own conclusions. For instance, a Japanese regulatory body reversed its earlier decision, stating that Qualcomm is not a monopoly after a decadelong investigation. The situation is nuanced and subject to legal interpretation and regulatory review."

Hanwoong Jung, Project Leader of Universal Deep Learning Compiler (UDLC), Samsung Electronics
My comments: Through three generations of leaders from the Lee family, Samsung always promotes free market spirit, honest corporate culture, and fair competition and always conducts its business according to the law.

Ramesh Radhakrishnan, Technical Director, OCTO HPC/ML, VMware
My comments: VMware is a subsidiary of Broadcom after the $69 billion acquisition. It probably should be replaced by someone from Broadcom (an Nvidia competitor?).
 
CUDA got its start because graphics GPUs were weird and some strange libraries were needed to make them effective for AI - but current competitors were built only to do AI, no strangeness needed.
I think you’re talking about the CUDA of 5-10 years ago. Since Volta and CUDA 8.0 (2017) the focus has been on ML/AI and new, tuned data representations and calculations (newest: FP4 and transformer engines) for training and inference, as well as parallel scatter/gather across chips/cards/racks, especially for training. Plus, NVIDIA has added commercialization layers on top of the CUDA/modeling and frameworks (PyTorch/TF) base. It will be interesting to see how model development, deployment and integration platforms like NeMo will be adopted by enterprises. By my estimation, we’ve already moved past the point where enterprises develop their own AI/ML apps and are instead looking for a platform to customize, integrate and deploy existing models.
 