That’s why I question folks proposing lower level standards - I can understand it for HPC problems, but not for GenAI inference. I think the big competitive battle going on right now is going to be about inference cost / power per token at the data center level, for every leading model. The good news is that Llama has been added to MLPerf 5.0. The bad news is that the focus is still on performance, so they aren’t looking at cost/power per token yet.
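To make the cost/power-per-token point concrete, here is a minimal sketch of how such a metric could be computed from a node's power draw and serving throughput. All numbers and function names are hypothetical placeholders, not figures from MLPerf or any vendor.

```python
# Hedged sketch: turning power and throughput figures into per-token metrics.
# The 10 kW / 5,000 tokens/s node below is an invented example.

def joules_per_token(system_power_w: float, tokens_per_sec: float) -> float:
    """Energy per generated token (J/token)."""
    return system_power_w / tokens_per_sec

def dollars_per_million_tokens(system_power_w: float,
                               tokens_per_sec: float,
                               usd_per_kwh: float) -> float:
    """Electricity cost per million output tokens."""
    kwh_per_token = system_power_w / tokens_per_sec / 3.6e6  # joules -> kWh
    return kwh_per_token * usd_per_kwh * 1e6

# Hypothetical 10 kW inference node serving 5,000 tokens/s at $0.10/kWh
energy = joules_per_token(10_000, 5_000)                   # 2.0 J/token
cost = dollars_per_million_tokens(10_000, 5_000, 0.10)     # ~$0.056 per Mtoken
```

A benchmark that reported these two numbers alongside raw performance would make cross-vendor comparisons at the data-center level far more meaningful.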
AMD has thrown so much hardware at the problem that it's hilarious to still lose to the H100. As for low level: you know that part of DeepSeek's success was low-level PTX assembly; they utilized the hardware fully.
Who knows? Random commissioned benchmarks are a bit meaningless, especially without full disclosure of the comparative environments. I think MLPerf 5.0 is a far more reliable, trustworthy, and transparent comparison. Unfortunately, Intel has only done the hard work for Granite Rapids, not Gaudi 3 (yet), for the new Llama LLM benchmarks.
Yeah, but most of DeepSeek's efficiency magic can be duplicated via smarter data center orchestration that does GPU planning, prefill/decode disaggregation, smart KV cache management/routing, and efficient communication between GPUs. Guess who has figured out how to make that work (hint: a 30x tokens/sec improvement on DeepSeek-R1)?
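For readers unfamiliar with prefill/decode disaggregation, here is a toy sketch of the idea: the compute-bound prompt pass (prefill) and the memory-bound generation pass (decode) run in separate stages, handing off the KV cache between them. Every class and function name here is invented for illustration; real orchestration layers are far more involved.

```python
# Hedged sketch of prefill/decode disaggregation with a KV-cache handoff.
# KVCacheStore, prefill, decode, and serve are hypothetical stand-ins.
from collections import deque

class KVCacheStore:
    """Toy KV-cache registry keyed by request id."""
    def __init__(self):
        self._cache = {}
    def put(self, req_id, kv):
        self._cache[req_id] = kv
    def get(self, req_id):
        return self._cache.pop(req_id)

def prefill(prompt):
    """Stand-in for the compute-bound prompt pass; returns fake KV state."""
    return {"tokens": prompt.split(), "layers": 32}

def decode(kv, max_new_tokens=3):
    """Stand-in for the memory-bound token-by-token generation pass."""
    return ["tok%d" % i for i in range(max_new_tokens)]

def serve(requests):
    store = KVCacheStore()
    decode_queue = deque()
    # Phase 1: prefill workers process prompts and publish KV caches
    for req_id, prompt in requests:
        store.put(req_id, prefill(prompt))
        decode_queue.append(req_id)
    # Phase 2: decode workers pull KV caches and generate tokens
    results = {}
    for req_id in decode_queue:
        results[req_id] = decode(store.get(req_id))
    return results

out = serve([(1, "hello world"), (2, "semiwiki forum post")])
```

The payoff of splitting the two phases is that each worker pool can be sized and scheduled for its own bottleneck (FLOPs for prefill, memory bandwidth for decode), which is where much of the claimed throughput gain comes from.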
Maybe latency is not the right word. The throughput of CPU inference is limited compared to accelerators, but a CPU can produce the same outputs given enough time.