
Pat Gelsinger: AI is a Moral Risk

I was an honest believer in Pat myself, but he also had a lot of execution issues that shouldn't have happened:

- 13th and 14th gen CPU reliability issues: why did it take so long to acknowledge the problem to customers, and a year from initial detection to any kind of resolution?

- Arrow Lake launch: underperforming, with reviews not matching internal data, and fixes only coming 5 months after launch.

- 20A cancellation (if his products were selling well enough then 20A would have been justified)

- Intel Arc: terrible launch (though it was already under way as Pat was joining, and Raja fumbled it hard), and Battlemage is a bit too slow a burn vs. the potential market

A lot of brand tarnishing here unfortunately, so even I've gone from "love Pat" to "uhh... you should have done a lot better". I think he should get credit for fixing the server product pipeline ("not broken anymore") and for getting Intel successfully WELL past the 10nm debacle, but there's a lot of pain too.

Except for the Intel 20A cancellation (which affects the direction of the company), the other stuff you mentioned is usually dealt with by the heads of those verticals. The CEO is there to set the strategy of the company and drive the company toward that goal; they don't deal with day-to-day execution. All of this is why I believe Michelle (the current product CEO, who was running CCG under Pat) should be the one who needs to be fired. Of all the criticism of Pat, I agree that he was too nice and should have fired a bunch of people for mishandling the things you highlighted.


But the 13th/14th gen issue was a very difficult one to root cause, based on what I have read, and the tech media jumped on it and over-sensationalized it. GN especially, with the nonsense about via oxidation, etc. Again, CCG!

IMO, ARL is not a bad product; it is competitive vs. vanilla Zen 5. Releasing Arrow Lake without an X3D competitor was a blunder. Again, CCG! Now there are rumors they are going to launch an Arrow Lake refresh this year (some rumors indicate only K SKUs and HX SKUs will be refreshed). If there is no X3D competitor and no fix for the cache/memory latencies, they are going to fail again in gaming. But ARL should sell well in notebooks (consumer + commercial) and desktops (commercial) due to the perf/watt improvement over RPL.

I don't have an answer for the cancellation of 20A. Intel claims they saved money with this decision. After that, Intel 18A got a bit nerfed on performance in Intel's slides (previously 20A was 1.15x Intel 3 perf/watt and Intel 18A was 1.1x 20A, which changed to Intel 18A being up to 1.15x Intel 3), and people claimed that 20A actually became 18A and the OG 18A became 18A-P. Who really knows? The important thing is that when Pat took over, Intel was on 10nm and AMD was on TSMC N5/N4. Now, this year, because of Pat's focus on 18A and process node leadership, Intel will be on 18A and AMD will be on N4/N3. That is a huge achievement if 18A's paper performance shows up in real-world product performance.
Also, let's not forget Lunar Lake's success; Lunar Lake kind of saved x86 against Qualcomm's Windows on Arm threat for now, and Battlemage is a big success too. So let's not forget the wins.

Look, I am not saying Pat Gelsinger was the perfect CEO. He was not; he was dealt a bad hand, did the best he could, and made a bunch of mistakes and achievements along the way. The dude had a good thing going at VMware as CEO after turning things around there; he was voted Best CEO during his tenure. He could have comfortably retired there. But he chose to come back to Intel to help turn things around, believing God chose him for this role. He had a bold strategy and did a good job (imho, even though he got shit from investors along the way), as I firmly believe Foundry is the future for Intel (not the design house, as x86 is being disrupted on all fronts). That's admirable imho.
 
No doubt that Intel was a mess when Pat Gelsinger took over. Also agree Michelle should have been fired long ago, along with a few other ladies (the most famous of whom was in charge of DCAI and then was moved to Altera). They were pure DEI personnel without any meaningful background or achievement to fit their roles. That is why I applauded DJT when he started the fight against DEI. Intel is a perfect example of how DEI could destroy a great company (don't forget the king of diversity here).

I like Pat Gelsinger for his boldness in pursuing technology leadership. I still think his strategy is the only way to save Intel (and maybe US semiconductor manufacturing). But the problem is the execution. I think there were a few missteps, maybe because of his over-optimism and kindness as you said (which I tentatively disagree with). 1) It was too late to stop the dividend payment; he should have stopped it immediately when he took over. 2) He should have fired a lot of VPs, including the ladies and Mr. GPU guy, as quickly as possible. Those ladies do not have the necessary education background and experience to make any decisions. Mr. GPU guy is very good at talking but always lacks delivery. Intel needs to trim a lot of underperforming engineers and managers. When you have so many DEI hires and many people only work from 10 AM to 2 PM, you are not going to have a productive team. Pat Gelsinger probably never realized that, and he was forced to lay off 15% of employees only after the disastrous earnings report last August. And that layoff did not go well, since I heard many capable people left with big packages and many incapable people stayed.

Pat's even bigger issue was that he is very cocky, in a way that I sometimes feel he is like another DJT. He seems very reluctant to recognize/acknowledge other people's achievements. There have been a lot of discussions already, so I won't repeat them here. He is not good at pushing people, the opposite of DJT, who always does. Finally, Pat's biggest problem is lack of financial discipline. I understand Intel needs heavy investment to play the catch-up game with the 5N4Y plan. But don't you need to adjust capex accordingly with shrinking revenue/income, or do a reorg/layoff to save cost? I have not seen any proactive plan from him for that (maybe the CFO should take the blame too). He failed the CEO role completely on this.
 
Failed to take advantage of Gaudi? Care to explain how? Isn't Gaudi's incompatibility with OneAPI and popular software stacks, and the lack of a future roadmap, the reason for Intel's failure in AI?
OneAPI isn't especially relevant for AI. OneAPI exists so you can write a program in a high-level language, like DPC++, and have it run on various hardware processing architectures, like FPGAs, CPUs, or GPUs, without hardware-specific modifications. While OneAPI is very interesting for many uses, the key interface for Gaudi in AI is PyTorch, which it does support.
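For what it's worth, this is roughly what the PyTorch path on Gaudi looks like, going by Habana's public documentation; a minimal sketch, assuming the habana_frameworks PyTorch bridge (part of the SynapseAI / Gaudi software suite) is installed, with the "hpu" device string and mark_step() call taken from those docs rather than anything exotic:

    import torch
    import habana_frameworks.torch.core as htcore  # Gaudi PyTorch bridge (SynapseAI)

    device = torch.device("hpu")  # Gaudi cards show up as the "hpu" device in PyTorch

    model = torch.nn.Linear(128, 10).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(32, 128, device=device)
    target = torch.randint(0, 10, (32,), device=device)

    loss = torch.nn.functional.cross_entropy(model(x), target)
    loss.backward()
    htcore.mark_step()   # flush the lazily built graph to the device (Gaudi's default mode)
    optimizer.step()
    htcore.mark_step()

At this level it looks like ordinary PyTorch with a different device string; whatever friction people run into sits below this layer, in the graph compiler and kernel coverage.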

As for proof that Intel failed to take advantage of Gaudi: total 2024 Gaudi revenue was less than $500 million. Pathetic, for what is really some nice silicon.
 
OneAPI isn't especially relevant for AI.
OneAPI is Intel's direct comp for CUDA from Nvidia and ROCm from AMD. Like I said, not having support for it in Gaudi 2 & 3 is being cited as the reason Gaudi has not had any success outside of fringe cases.

Another issue is that there is no future roadmap for the product after Gaudi 3. The next gen was going to be Falcon Shores, which is a GPU that was presumed to support OneAPI.
From what I read, even PyTorch does not work out of the box on Gaudi; I heard only SynapseAI can be used to program Gaudis.
So there is a lot of porting back and forth, which is why Gaudi is not working out for Intel.
This is what I have read online and, full disclosure, I am not an AI software guy, so I could be missing many things here.

Also, the fact that Gaudi could not even muster $500 million shows it is in fact not some "nice silicon" for that specific market!

But the point I was making was that any investor banking on billions of AI revenue for Intel in 2023 & 2024, or even 2025, did not do due diligence and should not be investing money in individual stocks. I would recommend they put their money into a low-cost index fund rather than expecting apologies from CEOs.
 
There are also Intel employees who hold Intel shares.
 
OneAPI is Intel's direct comp for CUDA from Nvidia and ROCm from AMD. Like I said, not having support for it in Gaudi 2 & 3 is being cited as the reason Gaudi has not had any success outside of fringe cases.
Not correct. Not correct that OneAPI is directly comparable to CUDA, and not correct that lack of OneAPI support is the reason why Gaudi has not been successful. OneAPI, while innovative and useful for making multi-platform code practical, is not a huge success on its own either.
Another issue is that there is no future roadmap for the product after Gaudi 3. The next gen was going to be Falcon Shores, which is a GPU that was presumed to support OneAPI.
From what I read, even PyTorch does not work out of the box on Gaudi; I heard only SynapseAI can be used to program Gaudis.
This is the latest Gaudi software architecture:


[attachment: Gaudi software architecture diagram]

Also, the fact that Gaudi could not even muster $500 million shows it is in fact not some "nice silicon" for that specific market!
I'm basing my opinion on my analysis of Gaudi's microarchitecture.
But the point I was making was that any investor banking on billions of AI revenue for Intel in 2023 & 2024, or even 2025, did not do due diligence and should not be investing money in individual stocks. I would recommend they put their money into a low-cost index fund rather than expecting apologies from CEOs.
I don't know of any investors who made an AI bet with Intel. PG supported the bet on AI accelerators in x86 CPUs, like AMX and AVX-512. Typical PG.
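To illustrate what that CPU bet looks like from the software side (just a sketch of the common pattern, not any specific Intel recipe): running inference under bfloat16 autocast on a recent Xeon lets oneDNN dispatch the matrix math to AMX tiles where the hardware has them, falling back to AVX-512/AVX2 kernels otherwise.

    import torch

    # Plain CPU inference; under bf16 autocast, oneDNN can use AMX tile instructions
    # on Sapphire Rapids-class Xeons, and AVX-512/AVX2 kernels on older parts.
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 1024),
        torch.nn.ReLU(),
        torch.nn.Linear(1024, 10),
    ).eval()

    x = torch.randn(64, 1024)
    with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        out = model(x)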
 
Finally, Pat's biggest problem is lack of financial discipline. I understand Intel needs heavy investment to play the catch-up game with the 5N4Y plan. But don't you need to adjust capex accordingly with shrinking revenue/income, or do a reorg/layoff to save cost? I have not seen any proactive plan from him for that (maybe the CFO should take the blame too).
Capital expenditures have a direct correlation to future growth. That is exactly what Bob Swan did: reduce growth capex to support stock buybacks, and look where that took Intel in CPUs. In industries like semiconductors, reducing investment during a downturn is a recipe for losing competitive advantages. Ask Morris Chang; he is a big believer in investing during downturns.

Also, you are talking as if Intel did not adjust investments under PG. He did reduce headcount in 2022 & 2023 (a 15% cut, nearly $4B in savings for 2025) and capex (by nearly $1.75 billion) in 2024. Intel says they saved a billion dollars by not ramping 20A; if true, that is a cost-saving initiative too. The CFO had already hinted at investment conferences under PG that capex would come down in 2025, which is currently guided at $18 billion (previously $20B). And he managed to lobby hard to get CHIPS grants as well, to offset some capex.

Even all Lip-Bu Tan could do is reduce opex by $500 million (changed guidance from $17.5B opex to $17B) and capex by $2 billion this year (was $20B, now $18B).
 
Not correct that OneAPI is directly comparable to CUDA
"Starting with a piece of technology developed by Intel (INTC.O), opens new tab called OneAPI, the UXL Foundation, a consortium of tech companies, plans to build a suite of software and tools that will be able to power multiple types of AI accelerator chips, executives involved with the group told Reuters. "
I don't know of any investors who made an AI bet with Intel.
@blueone "Not nonsense to me. Lack of winning strategies in datacenter AI and datacenter networking,"
Isn't that one of the reasons you cited for PG to apologize to investors?
 
"Starting with a piece of technology developed by Intel (INTC.O), opens new tab called OneAPI, the UXL Foundation, a consortium of tech companies, plans to build a suite of software and tools that will be able to power multiple types of AI accelerator chips, executives involved with the group told Reuters. "
Do you understand what CUDA does versus what OneAPI does? (Getting computer architecture analysis from Reuters is not a good idea.)
@blueone "Not nonsense to me. Lack of winning strategies in datacenter AI and datacenter networking,"
Isn't that one of the reasons you cited for PG to apologize to investors?
I didn't say PG should apologize to investors; I just "liked" @XYang2023's post that did. Investors actually cheered, apparently, after PG was appointed CEO, because the share price went up considerably. Though I thought PG was a poor candidate for CEO, I bought a considerable block of stock at the time anyway, because I figured investors would fall for the PG hype and the share price would go up. I wasn't prescient enough to sell at the peak; I thought the PG enthusiasm (however misguided) would last longer, but I still did pretty well. Pat was always a great sales guy to the non-experts.
 
I don't know of any investors who made an AI bet with Intel. PG supported the bet on AI accelerators in x86 CPUs, like AMX and AVX-512. Typical PG.
Uh, here is one. AMD at the time was still making GPUs optimized for double-precision compute, and Intel had this Gaudi thing, which looked much better on paper as a challenger to Nvidia.

It's perplexing why it is so hard to get the software to work. I worked with Google TPUs to train and serve all kinds of models (traditional and LLM) using TensorFlow. It is a bit more trouble than using an Nvidia GPU, but nothing a few lines of config won't fix. The TPU software doesn't take that many people to support, either.
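For reference, this is roughly the "few lines of config" I mean for TPUs in TensorFlow; the exact TPUClusterResolver argument depends on the environment (Cloud TPU VM vs. Colab, etc.), so treat "local" here as an assumption:

    import tensorflow as tf

    # Resolve and initialize the TPU, then build the model inside a TPUStrategy
    # scope so its variables are placed on the TPU cores.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="local")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(784,)),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(10),
        ])
        model.compile(
            optimizer="adam",
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        )
    # model.fit(...) then works the same as it would on a GPU.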

I also read some blog on how to use Intel Arc with LLaMA. It's contrived, but it did work reasonably in the end (better than AMD's consumer GPU in tokens per second). In my head, that process would take no more than half a year to fix for a capable software engineer, and Intel might have a thriving local LLM ecosystem.

Somewhat confusingly, both seem to go nowhere.

I have no knowledge of how Intel's OneAPI works, but it's easy to pick the accelerator to use in PyTorch and TensorFlow, so I'm not sure why some other interface layers are needed.
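To be concrete about "easy to pick the accelerator": from the user's side it is a couple of lines in either framework. The "xpu" branch below assumes an Intel GPU backend is present (a recent PyTorch build or intel_extension_for_pytorch); the rest is stock API.

    import torch
    import tensorflow as tf

    # PyTorch: the device string is effectively the whole interface from the user's side.
    if torch.cuda.is_available():
        device = torch.device("cuda")   # Nvidia; AMD ROCm builds also expose "cuda"
    elif getattr(torch, "xpu", None) is not None and torch.xpu.is_available():
        device = torch.device("xpu")    # Intel GPUs, if an XPU backend is installed
    else:
        device = torch.device("cpu")
    x = torch.randn(8, 8, device=device)

    # TensorFlow: list what is visible and pin ops to it.
    print(tf.config.list_physical_devices())
    with tf.device("/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"):
        y = tf.random.normal((8, 8))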
 
Do you understand what CUDA does versus what OneAPI does? (Getting computer architecture analysis from Reuters is not a good idea.)
I don't, and I only relied on reading articles like https://www.intel.com/content/www/u...news/unified-acceleration-uxl-foundation.html and reading what people in the industry said online. So, what is Intel's version of CUDA, or what are they promoting instead of CUDA? AMD has ROCm; what is Intel's, if you know, please? Do they even have their own GPU programming language?

Even AI misled me!
[attachment: screenshot of the AI's answer]

I didn't say PG should apologize to investors; I just "liked" @XYang2023's post that did.
Then what are we debating here 🤔! You responded to my response to @XYang2023's comment, thereby implying that it made sense for PG to apologize to investors. Anyway, we will have to agree to disagree here then.
 
But the 13th/14th gen issue was a very difficult one to root cause, based on what I have read, and the tech media jumped on it and over-sensationalized it. GN especially, with the nonsense about via oxidation, etc. Again, CCG!

Not that hard to root cause. But very hard to publicly admit because of the liability issues. I'm sure Intel understands this (and the via issue) very well internally.
 
Not that hard to root cause. But very hard to publicly admit because of the liability issues. I'm sure Intel understands this (and the via issue) very well internally.
Well, they ended up taking on the liability of extending the warranty by an additional 2 years and replacing/refunding all the damaged CPUs in RMAs. But they specifically said via oxidation was not the root cause of this; it just generates clicks and likes on YouTube. The whole saga definitely hurt Intel's image.

But here is Jon Masters, who despises Intel & x86 (based on his Twitter posts), congratulating the Intel engineers who debugged this issue.
[attachment: screenshot of Jon Masters' post]
 

Intel first blamed their users ("overclocking gamers"), then changed the story and blamed their board manufacturing partners for the problem. A couple of months later they admitted that it was their problem (only after they were getting cooked by independent information on the internet and YouTube videos describing the problem).

When they released their software "fix", they clearly stated that if users already had the problem, the software would not fix it. Clearly, Intel understood the root cause of the problem well by then.

Intel does extensive system-based (functional) testing of components (both before shipping and for RMAs). This testing would have clearly shown the problem.

The via oxidation issue was earlier. Intel has never come clean about which lots and specific parts were affected by that issue (even though they implemented a fix, and it was simple for them to identify the parts that had this problem).
 
I don't, and I only relied on reading articles like https://www.intel.com/content/www/u...news/unified-acceleration-uxl-foundation.html and reading what people in the industry said online. So, what is Intel's version of CUDA, or what are they promoting instead of CUDA? AMD has ROCm; what is Intel's, if you know, please? Do they even have their own GPU programming language?

Even AI misled me!
[attachment: screenshot of the AI's answer]
First of all, your post is an excellent example of why using an LLM trained on the general internet to answer pretty much any question is risky, unless you have enough expertise to know when they're hallucinating. I recently used Google's own LLM (Gemini) to point me to the differences between a Google Nest Thermostat Gen 3 and Gen 4, and I got hallucinations in return. On a Google LLM about Google products! I'm a believer in the future potential of AI to change the world, but current LLMs could make me want to short the entire AI industry.

A few months ago a friend asked me essentially the same question you've asked above, and my bullet point outline of the differences between Intel's GPU MAX software stack and Nvidia's for CUDA on their datacenter GPUs looked like the table of contents for a fairly long book. Let me think about a concise answer for a bit.
 
These are the two software stacks, for CUDA and for GPU MAX.

[attachments: CUDA and GPU MAX software stack diagrams]

These two diagrams do not have equivalent functionality included; Intel's includes management software and development & tuning tools. Nvidia has these, but chose to keep their diagram more focused on the application run-time path.

Both use open specification software layers to abstract the SIMD parallel processing, and the proprietary instruction sets, that GPUs use. Intel uses the SYCL-based OneAPI compilers and run-time software, while CUDA uses the similar but different OpenCL layer. Both allow the use of C++, but OpenCL allows the use of C also. OneAPI is built to use a variation of C++ Intel developed called Data Parallel C++ (DPC++). OpenCL defines a user-managed memory model, while OneAPI provides memory management via run-time libraries which run in a Linux driver. The OpenCL strategy can provide better performance for sophisticated programmers, but the OneAPI strategy eases application development, and reduces the probability of bugs due to application memory corruption.

CUDA also supports direct use of the CUDA driver via C applications, with user-controlled parallelism and memory management. Intel does this too; that's what the OneAPI Level Zero interface is, but Intel seems to hide the capability in obscurity rather than make it a well-defined capability like Nvidia does. It is thought that much of DeepSeek's performance improvements over other LLMs were due to reduced memory copies and networking-stack improvements via custom programming against the CUDA driver interface, bypassing OpenCL.

Also note that Intel talks about their expertly written performance libraries for various common math and other functions. Nvidia doesn't mention the equivalents in the documentation I've seen. (Perhaps I just haven't looked thoroughly enough.)


So the biggest difference between GPU MAX and CUDA GPUs is that Intel wants you to use DPC++ and custom libraries to write applications and then use the OneAPI runtime and tools to run them. While CUDA has a similar capability with OpenCL, the real richness of their interface is programming to the CUDA driver directly, in C.

Gaudi does not use a GPU architecture, nor is it a sea of programmable logic like an FPGA, so I suspect the Gaudi architecture would be more difficult to fit into OneAPI or OpenCL, if it's practical at all. Gaudi's development and runtime environment is another discussion altogether, and includes hardware-specific VLIW compilers and specialized development tools and libraries for their unique accelerator architecture.

[attachment: Gaudi development and runtime environment diagram]

I've probably put everyone to sleep by now...
 