
AMD at Computex 2024: AMD AI and High-Performance Computing with Dr. Lisa Su

hist78

Well-known member
The Future of High-Performance Computing in the AI Era

Join us as Dr. Lisa Su delivers the Computex 2024 opening keynote and shares the latest on how AMD and our partners are pushing the envelope with our next generation of high-performance PC, data center and AI solutions.

 
Zen 5 looks great on the power efficiency improvements front, but a little underwhelming for overall performance gains over Zen 4. AMD announcing support for AM5 to 2027+ is excellent though.

I did see AMD also put Apple and Qualcomm in their comparison charts. That was interesting marketing.
 
Anyone notice, neither desktop nor laptop Zen5 got any clock increase? Looks like AMD didn't get much performance out of the N4 node
 
You'll probably hear this a lot, but the trade show known as Computex, in this year 2024, is a big one. A number of key companies in our industry are holding official keynotes (and a few unofficial ones) to talk about their latest and greatest end-to-end AI solution. There's so much AI here, I'm not sure I recognize them as letters of the alphabet anymore. Nonetheless, there's substantial innovation going on! At this year's Computex, AMD's CEO Dr. Lisa Su held the official keynote of the event.

 
Anyone notice, neither desktop nor laptop Zen5 got any clock increase? Looks like AMD didn't get much performance out of the N4 node
I think they used the node performance to help with the significant implied perf/watt gain. The 12 core went from 170W TDP to 120W and both 6 and 8 cores dropped from 105W to 65W.

They did manage to maintain or slightly increase clocks while doing this (+100 for 12 and 16 cores). Likely we will also see higher sustained under load clocks in cooling limited scenarios.

Zen is the foundation for servers where perf/watt is way more important than peak clocks.
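
For anyone who wants the TDP numbers above as percentages, here is a rough sketch. It assumes rated TDP stands in for real power draw (which it may not), and the `perf_ratio` parameter is a hypothetical placeholder for relative performance, not a measured value.

```python
# TDP figures quoted in the post above, in watts: (previous part, new part).
tdp_watts = {
    "12-core": (170, 120),
    "8-core": (105, 65),
    "6-core": (105, 65),
}


def tdp_reduction_pct(old_w, new_w):
    """Percentage drop in rated TDP going from old_w to new_w."""
    return (old_w - new_w) / old_w * 100


def perf_per_watt_gain_pct(old_w, new_w, perf_ratio=1.0):
    """Implied perf-per-TDP-watt gain.

    ASSUMPTION: performance changes by perf_ratio between generations
    (1.0 means equal performance); this is an illustrative input only.
    """
    return (perf_ratio * old_w / new_w - 1) * 100


for part, (old_w, new_w) in tdp_watts.items():
    print(
        f"{part}: TDP down {tdp_reduction_pct(old_w, new_w):.1f}%, "
        f"perf/W up {perf_per_watt_gain_pct(old_w, new_w):.1f}% at equal performance"
    )
```

Any actual performance uplift would push the perf/W figure higher; pass a `perf_ratio` above 1.0 to see the effect.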
 
There's so much AI here, I'm not sure I recognize them as letters of the alphabet anymore. Nonetheless, there's substantial innovation going on! At this year's Computex, AMD's CEO Dr. Lisa Su held the official keynote of the event.


Stock price growth is related to the number of times a CEO says "AI". Plus AI wrote the presentation and AI is very self-centered.

My dog's food is now improved based on AI .... so it must be healthy.

But it's not overhyped .... REALLY IT ISN'T [Copilot forced me to write this or it said it would kidnap the aforementioned dog] :ROFLMAO:
 
I think they used the node performance to help with the significant implied perf/watt gain. The 12 core went from 170W TDP to 120W and both 6 and 8 cores dropped from 105W to 65W.

They did manage to maintain or slightly increase clocks while doing this (+100 for 12 and 16 cores). Likely we will also see higher sustained under load clocks in cooling limited scenarios.

Zen is the foundation for servers where perf/watt is way more important than peak clocks.
More architectural complexity also needs more Xtors to drive it resulting in more total leakage and more dynamic cap increasing power @iso process. Given how small the difference is between N5P-HPC to N4P-HPC, and with the clocks being where they are (TDP is a "we'll see" since intel and AMD always play loose with what a watt actually means), I think AMD prob did a good job to get the clock/power results they did with the enhancements they did to the core.

Zen 5 looks great on the power efficiency improvements front, but a little underwhelming for overall performance gains over Zen 4. AMD announcing support for AM5 to 2027+ is excellent though.

I did see AMD also put Apple and Qualcomm in their comparison charts. That was interesting marketing.
I have but one area of concern, and that is AMD closing the door on themselves. Given the historical cadence of one new arch every 2 years, and being slow on node transitions ever since they moved to TSMC, AMD is stuck on N4P until what, late 2026, when they move to N3P? In the meantime they have to deal with intel, NVIDIA, and CSP chips on N3E/P and 18A(P) (to say nothing of the possibility of maybe having to deal with an M6 macbook in early 2026 on N2).

If AMD continues to insist on not putting in the correct pre-pays to have the volumes to take significant MSS or use the best node available, then I think the current rate of innovation might be too low. A couple examples of what I mean:

TGL vs Zen+ mobile and low-volume Zen2 mobile: Intel had a clean win vs the main volume part, and Zen2 mobile evened up performance but lost when it came to power consumption despite iso process.

ADL vs Zen3 + mostly Zen2 mobile: Intel mobile power draw was a disaster (gen on gen regression) despite superior process. On DT intel offered iso perf at iso power for lowend/midrange parts and higher perf at higher power for highend parts with a slight process lead.

RPL vs Zen4: Intel at same perf at higher power due to far superior TSMC process.

If even intel's design teams can often match or beat AMD at iso process, what happens as AMD falls far behind on process? Intel 3 is a superior process to N4P (for this application), intel is using lower pJ/bit packaging tech, and intel has core count parity/superiority. I wouldn't doubt that AMD has the better core, but it would seem that the comical lead AMD had on server looks like it is evaporating. And that is just the attack from above, the attack from below by the CSPs I think is an even bigger threat. AMD seems to have always been more dependent on DC than intel, and to make matters worse most of AMD's wins are from the CSPs. Presumably AMD has more to lose from system company chips than intel who has presumably already lost much of that market.

My opinion is that AMD thinks too small and relies too much on a "we'll make a good core with a slow volume ramp and the sales will come to us" approach (rather than taking the Intel-in-laptop or Nvidia-in-DC strategy of co-system design with many partners). To be absolutely clear, this is NOT a sink-the-company-level problem, but I do think that a good chip will only carry you so far, and I think this mindset will inhibit their growth (especially with the wave of systems companies who want more than "just a good chip").

Party pooping over: As a consumer these things will be great if they are priced like the current Zen 4 prices, and having an NPU lead over LNL/QCOM might be a big differentiator especially when combined with their leading iGPU IP. The new laptop chip names are vile though. My jaw literally dropped when I saw it. I didn't think Core ULTRA could be topped, but here we...
 
I think they used the node performance to help with the significant implied perf/watt gain. The 12 core went from 170W TDP to 120W and both 6 and 8 cores dropped from 105W to 65W.

They did manage to maintain or slightly increase clocks while doing this (+100 for 12 and 16 cores). Likely we will also see higher sustained under load clocks in cooling limited scenarios.

Zen is the foundation for servers where perf/watt is way more important than peak clocks.
Oh yeah, I didn't notice the TDP improvement. Good catch
 
More architectural complexity also needs more Xtors to drive it resulting in more total leakage and more dynamic cap increasing power @iso process. Given how small the difference is between N5P-HPC to N4P-HPC, and with the clocks being where they are (TDP is a "we'll see" since intel and AMD always play loose with what a watt actually means), I think AMD prob did a good job to get the clock/power results they did with the enhancements they did to the core.


I have but one area of concern, and that is AMD closing the door on themselves. Given the historical cadence of one new arch every 2 years, and being slow on node transitions ever since they moved to TSMC, AMD is stuck on N4P until what, late 2026, when they move to N3P? In the meantime they have to deal with intel, NVIDIA, and CSP chips on N3E/P and 18A(P) (to say nothing of the possibility of maybe having to deal with an M6 macbook in early 2026 on N2).

If AMD continues to insist on not putting in the correct pre-pays to have the volumes to take significant MSS or use the best node available, then I think the current rate of innovation might be too low. A couple examples of what I mean:

TGL vs Zen+ mobile and low-volume Zen2 mobile: Intel had a clean win vs the main volume part, and Zen2 mobile evened up performance but lost when it came to power consumption despite iso process.

ADL vs Zen3 + mostly Zen2 mobile: Intel mobile power draw was a disaster (gen on gen regression) despite superior process. On DT intel offered iso perf at iso power for lowend/midrange parts and higher perf at higher power for highend parts with a slight process lead.

RPL vs Zen4: Intel at same perf at higher power due to far superior TSMC process.

If even intel's design teams can often match or beat AMD at iso process, what happens as AMD falls far behind on process? Intel 3 is a superior process to N4P (for this application), intel is using lower pJ/bit packaging tech, and intel has core count parity/superiority. I wouldn't doubt that AMD has the better core, but it would seem that the comical lead AMD had on server looks like it is evaporating. And that is just the attack from above, the attack from below by the CSPs I think is an even bigger threat. AMD seems to have always been more dependent on DC than intel, and to make matters worse most of AMD's wins are from the CSPs. Presumably AMD has more to lose from system company chips than intel who has presumably already lost much of that market.

My opinion is that AMD thinks too small and relies too much on a "we'll make a good core with a slow volume ramp and the sales will come to us" approach (rather than taking the Intel-in-laptop or Nvidia-in-DC strategy of co-system design with many partners). To be absolutely clear, this is NOT a sink-the-company-level problem, but I do think that a good chip will only carry you so far, and I think this mindset will inhibit their growth (especially with the wave of systems companies who want more than "just a good chip").

Party pooping over: As a consumer these things will be great if they are priced like the current Zen 4 prices, and having an NPU lead over LNL/QCOM might be a big differentiator especially when combined with their leading iGPU IP. The new laptop chip names are vile though. My jaw literally dropped when I saw it. I didn't think Core ULTRA could be topped, but here we...
I don't think the Lion Cove core in Lunar Lake and Arrow Lake will have as big an IPC increase as Zen 5. Also, Lunar Lake and Arrow Lake are on N3B / 20A, not N3E / 18A. On the server side Intel is using the Redwood Cove core in Granite Rapids, so AMD should increase the performance gap with Intel even more.
 
I don't think the Lion Cove core in Lunar Lake and Arrow Lake will have as big an IPC increase as Zen 5.
No clue.
Also, Lunar Lake and Arrow Lake are on N3B / 20A, not N3E / 18A.
Yes but what about NVIDIA R100, the rumored client chip, intel pantherlake and CWF, Graviton 6, TPU, etc? These products all come out like a year before AMD will have a next gen arch. And when they do get there they might be stuck on N3P vs 2/1.4 "nm". My concern is that AMD takes too long on these products leading to them struggling vs industry best when AMD's internal volumes eventually cross over with their prior gen. Also N3 and 20A are already superior on PPA vs N4P.
On the server side Intel is using the Redwood Cove core in Granite Rapids, so AMD should increase the performance gap with Intel even more.
How do you arrive at that math? Even if Xeon 6 has a 0% IPC uplift there is no way for intel to not close the gap with SRF/GNR without intel 3 being a large regression on intel 4. Either way I would still be more worried about their customers coming out with N3E/P ASICs/domain specific GP compute with a lower TCO than AMD's N4P GP solution. Maybe the best business model for AMD going forward would be to focus on semi-custom chips and be an enabler for CSPs?
 
I have but one area of concern, and that is AMD closing the door on themselves. Given the historical cadence of one new arch every 2 years, and being slow on node transitions ever since they moved to TSMC, AMD is stuck on N4P until what, late 2026, when they move to N3P? In the meantime they have to deal with intel, NVIDIA, and CSP chips on N3E/P and 18A(P) (to say nothing of the possibility of maybe having to deal with an M6 macbook in early 2026 on N2).

If AMD continues to insist on not putting in the correct pre-pays to have the volumes to take significant MSS or use the best node available, then I think the current rate of innovation might be too low. A couple examples of what I mean:
Agree with your points overall, and that's a worry I share. Though the recently announced "Turin Dense" (Zen 5C - compact) is TSMC N3:


But you're right - they're taking a conservative approach on desktop and mobile, though making choices that allow them to fight a price war if needed. They also left a lot of market share on the table when it was Zen 3 vs 10th/11th gen by not having enough fab capacity marked out, but I think they're worried about cash flow since the Radeon business seems to be slowing a bit.

On a side note - it looks like Lunar Lake could offer similar TOPS to the Strix APU, but at significantly lower power:

(Another tweet from Hallock said that MTL and LNL will co-exist, also hinting that it's going to occupy lower power segments than MTL today).

 
I don't think the Lion Cove core in Lunar Lake and Arrow Lake will have as big an IPC increase as Zen 5. Also, Lunar Lake and Arrow Lake are on N3B / 20A, not N3E / 18A. On the server side Intel is using the Redwood Cove core in Granite Rapids, so AMD should increase the performance gap with Intel even more.

FWIW - Intel has a LOT more transistors to play with going from Raptor Lake to Arrow Lake, especially compared to Zen 4 to Zen 5.

Intel 7 --> TSMC N3/Intel 20A vs. TSMC N5 --> N4.
 
No clue.

Yes but what about NVIDIA R100, the rumored client chip, intel pantherlake and CWF, Graviton 6, TPU, etc? These products all come out like a year before AMD will have a next gen arch. My concern is that AMD takes too long on these products leading to them struggling vs industry best when AMD's internal volumes eventually cross over with their prior gen. Also N3 and 20A are already superior on PPA vs N4P.

How do you arrive at that math? Even if Xeon 6 has a 0% IPC uplift there is no way for intel to not close the gap with SRF/GNR without intel 3 being a large regression on intel 4. Either way I would still be more worried about their customers coming out with N3E/P ASICs/domain specific GP compute with a lower TCO than AMD's N4P GP solution. Maybe the best business model for AMD going forward would be to focus on semi-custom chips and be an enabler for CSPs?
Nvidia R100 uses an off-the-shelf ARM core. IPC is not going to come close to Zen 5 or Lion Cove. Also, most Windows code is x86, and you lose performance in the emulation layer. I think Panther Lake won't have any IPC improvement over Lion Cove as it comes out only 6 months after Arrow Lake, but because it's on 18A it should have some clock improvement.
Xeon 6 does have a 0% IPC uplift, because it's the same Redwood Cove core as Meteor Lake. So that's the maths: 0% IPC compared to 16% for Zen 5. Intel 3 would need to perform more than 16% better than TSMC N4P for it to come close to Zen 5/Turin.
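
The arithmetic behind that last sentence can be sketched as below. It assumes the usual first-order model of per-core performance, perf = IPC x clock, and ignores core count, memory bandwidth, cache and power limits. The function name is made up for illustration, and the result only describes the gen-on-gen change, not where either side starts from.

```python
def required_clock_ratio(rival_ipc_uplift, own_ipc_uplift=0.0):
    """Clock ratio needed to match a rival's gen-on-gen gain.

    ASSUMPTION: per-core performance = IPC * clock, nothing else changes.
    Uplifts are fractions, e.g. 0.16 for 16%.
    """
    return (1 + rival_ipc_uplift) / (1 + own_ipc_uplift)


# Figures from the post: 16% IPC uplift for Zen 5, 0% for Xeon 6.
ratio = required_clock_ratio(0.16, 0.0)
print(f"Clock uplift needed to offset the IPC difference: {(ratio - 1) * 100:.0f}%")
```

Under this model any IPC gain on the second argument lowers the clock uplift needed, so the 16% figure is the upper bound for a core with no IPC change.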
 
Nvidia R100 uses an off-the-shelf ARM core.
R100 is a GPU though. As for the PC CPU, who cares if it is an off-the-shelf ARM CPU? What matters is power. Nvidia's GPU IP has always been more efficient and they will be on a better node to boot. The only question mark is who is better at designing an SoC. Given how AMD doesn't really play in the sub-15W space I would bet on Nvidia if I was forced to guess.
Xeon 6 does have a 0% IPC uplift, because it's the same Redwood Cove core as Meteor Lake. So that's the maths: 0% IPC compared to 16% for Zen 5. Intel 3 would need to perform more than 16% better than TSMC N4P for it to come close to Zen 5/Turin.
Being "better" is not what I said though... What I said was that the gap should be closed (i.e. narrowed/made smaller). Even if we assume intel 3 ~ N4P, by your math GNR being "only" 16% weaker is a HUGE improvement to where intel is, and where they've been ever since Rome. N4P would need to allow clocks to be like double what GNR on intel 3 can achieve for AMD to open an even wider lead than what they have now (which is what you originally wrote).
AMD should increase the performance gap with Intel even more.
 
R100 is a GPU though. As for the PC CPU, who cares if it is an off-the-shelf ARM CPU? What matters is power. Nvidia's GPU IP has always been more efficient and they will be on a better node to boot. The only question mark is who is better at designing an SoC. Given how AMD doesn't really play in the sub-15W space I would bet on Nvidia if I was forced to guess.

Being "better" is not what I said though... What I said was that the gap should be closed (i.e. narrowed/made smaller). Even if we assume intel 3 ~ N4P, by your math GNR being "only" 16% weaker is a HUGE improvement to where intel is, and where they've been ever since Rome. N4P would need to allow clocks to be like double what GNR on intel 3 can achieve for AMD to open an even wider lead than what they have now (which is what you originally wrote).
I didn't say GNR is 16% weaker. I said the Zen 5 core in Turin has 16% more IPC than the Zen 4 in Genoa. And currently Emerald Rapids is way behind Genoa in performance. GNR will implement 12-channel memory, but that's not enough to close the gap IMHO.

If it's an off-the-shelf ARM core, its performance will not be up to scratch. That's why Apple spent so many billions developing their own core. The node won't make up for the difference in performance. Also, the efficiency goes out the window if all the code is running in an emulation layer.
 
I didn't say GNR is 16% weaker. I said the Zen 5 core in Turin has 16% more IPC than the Zen 4 in Genoa. And currently Emerald Rapids is way behind Genoa in performance. GNR will implement 12-channel memory, but that's not enough to close the gap IMHO.

If it's an off-the-shelf ARM core, its performance will not be up to scratch. That's why Apple spent so many billions developing their own core. The node won't make up for the difference in performance. Also, the efficiency goes out the window if all the code is running in an emulation layer.

It's hard to say. Redwood Cove seems to have certain regressions due to design choices to be conservative and make it essentially RPL ported to Intel 4. That in and of itself led to IPC regressions. I suspect Intel will not simply port Redwood Cove on Intel 4 to Intel 3 - I believe they would make some minor adjustments to improve performance, based on everything they learned and know from MTL.

I don't believe GNR will be better - but I believe it will close the gap significantly and make Intel competitive again.
 
It's hard to say. Redwood Cove seems to have certain regressions due to design choices to be conservative and make it essentially RPL ported to Intel 4. That in and of itself led to IPC regressions.
FWIW the IPC regressions (small) seem to be mostly related to splitting out the chip from monolithic to chiplet, as separation of I/O (memory) and the CPU (compute) tiles add some latency to accessing main memory.
 