Apple introduces M2 Ultra

Only temporarily. It is a good bet that Nvidia wants to grow into server markets including workstations, and the heavily video-oriented Mac Pro market is a logical place to start. I'm always looking ahead.
Maybe. I was just commenting on the current state of both M2 and GH.
 
DDR5 DIMMs have 1 channel per 8 chips. LPDDR5 has a channel for each chip. DRAM core arrays are much the same in each generation from each manufacturer; the whole point of DDR, LPDDR, GDDR, and HBM is the different interfaces wrapped around those arrays, with different tradeoffs when interfacing to the host. Apple has doubled down on extracting the performance potential of LPDDR, which explains a lot of their power/perf advantage.
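To put rough numbers on that interface tradeoff, here is a quick back-of-the-envelope sketch in Python; the transfer rates, channel widths, and the 1024-bit-wide LPDDR configuration are illustrative assumptions, not figures from this thread:

    # Rough peak-bandwidth comparison of DRAM interface choices.
    # All figures are illustrative assumptions, not measurements.
    def peak_gbs(bus_width_bits, transfer_rate_mts):
        """Peak bandwidth in GB/s for a given bus width and transfer rate."""
        return bus_width_bits / 8 * transfer_rate_mts * 1e6 / 1e9

    ddr5_dimm   = peak_gbs(2 * 32, 5600)   # one DDR5-5600 DIMM, two 32-bit subchannels
    lpddr5_chan = peak_gbs(16, 6400)       # one LPDDR5-6400 x16 channel
    wide_lpddr5 = peak_gbs(1024, 6400)     # assumed 1024-bit aggregate LPDDR5 bus

    print(ddr5_dimm, lpddr5_chan, wide_lpddr5)   # ~44.8, ~12.8, ~819.2 GB/s

The exact figures matter less than the shape: the width of the aggregate LPDDR interface, not a faster core array, is what drives the headline bandwidth.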

Servers typically have around 8GB per dual-thread core. The M2 Ultra has 8GB per single-thread core, though it is sharing that with a GPU (like future servers will need to do, as inferencing becomes ubiquitous). The remaining difference is RAS. I expect the next generation of LPDDR will go beyond single-bit correction and provide better data integrity, at which point DDR may lose a lot of its market share.
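As a quick sanity check on that ratio (the server configuration below is an assumed example; the M2 Ultra numbers are its published 24-core / 192GB maximums):

    # Memory per core: assumed server box vs. the M2 Ultra's unified memory.
    server_dram_gb, server_cores = 512, 64        # e.g. a 2-socket server with SMT-2 cores
    m2_ultra_dram_gb, m2_ultra_cores = 192, 24    # unified memory, shared with the GPU

    print(server_dram_gb / server_cores)          # 8.0 GB per dual-thread core
    print(m2_ultra_dram_gb / m2_ultra_cores)      # 8.0 GB per single-thread core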
I've wondered if other CPU designers would follow Apple into the DRAM-in-package realm. Perhaps Intel is, sort of. There's an Intel Fellow I saw on LinkedIn recently with the position description of "Director, New In-Package Memory Technologies".
 
Servers typically have around 8GB per dual-thread core. The M2 Ultra has 8GB per single-thread core, though it is sharing that with a GPU (like future servers will need to do, as inferencing becomes ubiquitous). The remaining difference is RAS. I expect the next generation of LPDDR will go beyond single-bit correction and provide better data integrity, at which point DDR may lose a lot of its market share.
Servers have widely varying DRAM requirements, though. In-memory database servers, for example, often have a terabyte of DRAM or more, and I hear from a friend at AMD that Genoas are in high demand for these applications because they can support up to 6TB of physical memory (they kick Sapphire Rapids' butt in memory capacity). Genoa also has 52-bit physical memory addressing. I haven't yet read what Sapphire Rapids' physical address length is, but I do remember Ice Lake was at a surprisingly low 46-bit physical address length.
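For scale, the arithmetic behind those address widths (46-bit is the Ice Lake figure above; the 52-bit Sapphire Rapids figure comes up later in the thread):

    # Physical memory reachable with a given physical address width.
    for bits in (46, 52):
        print(f"{bits}-bit physical address -> {2**bits / 2**40:,.0f} TiB")
    # 46-bit -> 64 TiB; 52-bit -> 4,096 TiB (4 PiB)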
 
The Mac Studio seems to be a great use case for the M2 Ultra.

The Mac Pro OTOH seems fairly disappointing:

- 192GB max non-expandable RAM vs. the 1,536GB (1.5TB) of the fairly old Intel Mac Pro (Cascade Lake-X)
- they’re using multiple PCIe switches to split out the PCIe slots. The M2 Ultra SoC only seems to have 16-20 total PCIe lanes based on sources I’ve seen.

This doesn’t make the product terrible, but at a $6,999 starting price it’s not very impressive.

I hope we see some TSMC N3 silicon in Apple products this fall.
 
The Mac Studio seems to be a great use case for the M2 Ultra.

The Mac Pro OTOH seems fairly disappointing:

- 192GB max non-expandable RAM vs. the 1,536GB (1.5TB) of the fairly old Intel Mac Pro (Cascade Lake-X)
- they’re using multiple PCIe switches to split out the PCIe slots. The M2 Ultra SoC only seems to have 16-20 total PCIe lanes based on sources I’ve seen.

This doesn’t make the product terrible, but at a $6,999 starting price it’s not very impressive.

I hope we see some TSMC N3 silicon in Apple products this fall.
Same. Once we see a MacBook Pro with an M3 on N3, I’m finally going to upgrade my 2015 MacBook. It’s served me well, and I’m excited to see how much things have improved.
 
I hear from a friend at AMD that Genoas are in high demand for these applications because they can support up to 6TB of physical memory (they kick Sapphire Rapids' butt in memory capacity). Genoa also has 52-bit physical memory addressing. I haven't yet read what Sapphire Rapids' physical address length is, but I do remember Ice Lake was at a surprisingly low 46-bit physical address length.
I'm pretty sure no one has built a Genoa with 6TB of DDR5. They have 12 channels, and the typical layout is 1 DIMM slot per channel if you want to run at DDR5-5200 or faster. So, to do 3TB with 12 channels you need 256GB DIMMs. Expensive and rare. Does someone do that? Probably. "High demand"? Seems a stretch.
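Spelling out that capacity math (the DIMM size is the assumption here; the 12 channels are Genoa's published count):

    # Genoa memory-capacity arithmetic: 12 DDR5 channels per socket.
    channels, dimm_gb = 12, 256          # 256GB RDIMMs assumed: expensive and rare
    print(channels * 1 * dimm_gb)        # 3072 GB (3TB) at 1 DIMM per channel
    print(channels * 2 * dimm_gb)        # 6144 GB (6TB) at 2 DIMMs per channel,
                                         # typically at reduced transfer rates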

SPR has 52 bit physical, 57 bit virtual. 52 bits is 4 petabytes.

Sapphire Rapids scales to 8 sockets without glue, and there are systems with more sockets built with "glue". I expect huge memory systems will remain mostly Intel territory. Both companies scale at 1 memory channel for roughly 8 cores, and with Intel multisocket they can get to over 400 cores in a coherent address space without glue, then more if you want to pay for the specialists to build you a really big box.
 
- they’re using multiple PCIe switches to split out the PCIe slots. The M2 Ultra SoC only seems to have 16-20 total PCIe lanes based on sources I’ve seen.
Sources? The speculation I have seen is based on how many PCIe lanes are in other Apple SKUs with the M2 Max chip, but that ignores the possibility that Apple may choose not to use all the lanes available from the chip.
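Either way, a rough fan-out sketch shows why switches enter the picture at all; the slot widths below are assumptions loosely modeled on the 2023 Mac Pro's slot layout, and the 16-20 lane figure is the speculation quoted above:

    # Why a handful of SoC PCIe lanes needs switches to feed several slots.
    soc_lanes = 20                        # upper end of the 16-20 lane speculation
    slot_widths = [16, 16, 8, 8, 8, 8]    # assumed electrical widths of six open slots

    print(sum(slot_widths), soc_lanes)    # 64 downstream lanes vs. 20 upstream lanes:
                                          # switches multiplex them, so upstream bandwidth is shared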
 
I'm pretty sure no one has built a Genoa with 6TB of DDR5. They have 12 channels, and the typical layout is 1 DIMM slot per channel if you want to run at DDR5-5200 or faster. So, to do 3TB with 12 channels you need 256GB DIMMs. Expensive and rare. Does someone do that? Probably. "High demand"? Seems a stretch.
I can't say.
SPR has 52 bit physical, 57 bit virtual. 52 bits is 4 petabytes.
I finally found that on the Intel website last night. Thanks.
Sapphire Rapids scales to 8 sockets without glue, and there are systems with more sockets built with "glue". I expect huge memory systems will remain mostly Intel territory. Both companies scale at 1 memory channel for roughly 8 cores, and with Intel multisocket they can get to over 400 cores in a coherent address space without glue, then more if you want to pay for the specialists to build you a really big box.
This is a nit, but for the 8-socket scalable version it looks like Intel supports only 48 cores per socket, so that's "only" 384 cores. :) There are a bewildering number of SKUs with varying core counts, UPI port counts, and clock frequencies that are available in single socket, 2S capable, or 8S capable configurations. For those who haven't seen this list of Sapphire Rapids Xeon CPUs, it is interesting:


The numerous workstation versions are also interesting to compare to the M2 Ultra. Making an application-level comparison from the specs alone is impossible, though. It'll be interesting to see if there are any side-by-side comparisons that go beyond the low-level benchmarks.
 
One thing that does confuse me about the M2 Ultra... is all of the DRAM in-package as with the other M-series CPUs, or does it use external DRAM? I'm having trouble believing 192GB of DRAM and all of that Apple silicon fits in a single package. Perhaps I'm just lacking imagination.

Looks like the 8 stack LPDDR, so 8 chips, with 8 dies
 
Looks like the 8 stack LPDDR, so 8 chips, with 8 dies
12 chips per stack. 2GB per chip, 96 chips total. That is a whole lot of parallel activity, with 16 banks per chip, 1440 commands in flight at any one moment.
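The capacity arithmetic there checks out against the 192GB maximum (just reproducing the figures from the post):

    # M2 Ultra in-package LPDDR capacity, using the figures above.
    packages, dies_per_package, gb_per_die = 8, 12, 2
    print(packages * dies_per_package * gb_per_die)   # 192 GB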
 