Intel blog post comparing AI performance of Xeon vs Epyc



AMD released some benchmarks showing its next-gen Epyc chip (Turin) beating the 5th-gen Xeon (the current “Intel 7” offering) by ~5.4x in AI benchmarks. Based on the notes below, the claim comes from AMD's Computex keynote.

Intel then ran its own benchmarks and claims that the 64-core Intel 7 product is in fact slightly ahead of the upcoming 128-core N3 Epyc chip. Of course, the blog also touts Xeon 6, with even higher performance, as coming soon.

I saw some speculation elsewhere (Tom's Hardware) that the gap may come from AMD’s internal runs on the Xeon not using AMX instructions, while Intel’s own runs did use them.
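For anyone who wants to sanity-check the AMX angle on their own box, here is a minimal sketch, assuming a Linux system where the kernel lists CPU features in /proc/cpuinfo. The helper name `amx_flags` is made up for illustration. Note that it only shows whether the CPU and kernel expose AMX; it says nothing about whether a particular benchmark run actually used those instructions, which depends on the software stack.

```python
from pathlib import Path

# AMX-related feature flags as the Linux kernel reports them in /proc/cpuinfo.
AMX_FLAGS = {"amx_tile", "amx_bf16", "amx_int8"}


def amx_flags(cpuinfo_text: str) -> set[str]:
    """Return the AMX flags found in the first 'flags' line of cpuinfo text.

    Hypothetical helper for illustration; not taken from either vendor's scripts.
    """
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags : fpu vme ... amx_tile ..."
            _, _, value = line.partition(":")
            return AMX_FLAGS & set(value.split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # Fall back to empty text on systems without /proc/cpuinfo.
    text = path.read_text() if path.exists() else ""
    found = amx_flags(text)
    print("AMX flags:", sorted(found) if found else "none reported")
```

On a CPU without AMX (which, per the speculation above, would include the Epyc side), the set should come back empty.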

The AI benchmark wars have begun…


Some fine print from Intel’s blog:

Product and Performance Notes

1 5x latency reduction on LLM optimizations

1-node, 2x Intel® Xeon® Platinum 8480+ processor, 56 cores, Intel® Hyper-Threading Technology on, turbo on, NUMA 2, total memory 512 GB (16x64 GB DDR5 4800 MT/s [4800 MT/s]), AMI BIOS, microcode 3A06, 2x Ethernet Controller X710 for 10GBASE-T, 1x 1.8T Samsung* SSD 970 EVO Plus, CentOS* Stream9, 5.14.0-378.el9.x86_64, text generation on GPT-J 6B, GCC* 12.3, CPU for Intel Extension for PyTorch and PyTorch 2.1. BF16, BS1 CPI 56, input token size = 1016, output token size = 32. Tests by Intel as of September 27, 2023. Results may vary.

2 Figure 1. 5th gen Intel Xeon processor with Llama2-7B performance versus the competition's next-generation processor: on 5th gen Intel® Xeon® Scalable processor (formerly code named Emerald Rapids) using: 2x Intel Xeon Platinum 8592+ processor, 64 cores, Intel Hyper-Threading Technology on, turbo on, 1024 GB (16x64 GB DDR5 5600 MT/s [5600 MT/s]), SNC disabled, Ubuntu* 22.04.4 LTS, kernel-6.5.0-35-generic, models run with PyTorch v2.3.0 and Intel Extension for PyTorch. Model instances-2, batch size 10 (128/128), batch size 2 (2048/2048), batch size 7 (1024/128). Precision: int4 (weight-only quantization, GPTQ algorithm). Tested by Intel on June 9, 2024. Scripts to reproduce the performance are on GitHub.

3 The source for AMD's claim of Turin 128C and Intel Xeon 8592+ processor: Computex 2024 Keynote Video.