A few months ago, I posted a piece about PLDA expanding its support for two emerging protocol standards: CXL™ and Gen-Z™. The Compute Express Link (CXL) specification defines a set of three protocols that run on top of the PCIe PHY layer. The current revision of the specification, CXL 2.0, runs on the PCIe 5.0 PHY layer at a maximum link rate of 32 GT/s per lane. There are many parts to this specification and multiple implementation options, so a comprehensive support package will significantly help adoption. This is why PLDA brings flexible CXL support to SoC and FPGA designers.
The three previously mentioned protocols that make up CXL are:
- CXL.io: which is very similar to traditional PCIe and handles discovery, configuration, and the other functions that PCIe is responsible for
- CXL.cache: which gives CXL devices coherent, low latency access to shared host memory
- CXL.mem: which gives the host processor access to shared device memory
CXL defines three types of devices that leverage different combinations of these protocols depending on the use case.
As shown in Figure 1 above, a Type 1 device combines CXL.io + CXL.cache channels. Typical Type 1 devices may include PGAS NICs (NICs supporting a partitioned global address space) or NICs with atomics. Figure 2 illustrates a Type 2 device combining all three channels. Type 2 devices may include accelerators with memory, such as GPUs and other dense computation devices. Figure 3 shows a Type 3 device with CXL.io and CXL.mem channels. A typical Type 3 device may be used for memory bandwidth expansion or memory capacity expansion with storage-class memory.
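The mapping from device type to protocol channels can be summarized in a short sketch. This is purely illustrative (the dictionary and function names are invented, not part of any CXL software interface):

```python
# Illustrative only: map each CXL device type to the protocol
# channels it combines, per the descriptions above.
CXL_DEVICE_TYPES = {
    # Type 1: caching devices without device-attached memory (e.g. PGAS NICs)
    1: {"CXL.io", "CXL.cache"},
    # Type 2: accelerators with device-attached memory (e.g. GPUs)
    2: {"CXL.io", "CXL.cache", "CXL.mem"},
    # Type 3: memory expanders (bandwidth or capacity expansion)
    3: {"CXL.io", "CXL.mem"},
}

def channels_for(device_type: int) -> set:
    """Return the set of CXL protocol channels a device type uses."""
    return CXL_DEVICE_TYPES[device_type]
```

Note that CXL.io is present in every device type: discovery and configuration are always required, while CXL.cache and CXL.mem are included only where the use case demands coherent caching or device-memory access.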
The goal of CXL is to maintain memory coherency between the CPU memory space and memory on attached devices, which improves performance and lowers complexity and cost. CXL.cache and CXL.mem support this strategy. To implement CXL into a complex SoC, an interface will be required to transfer packets between the user application and the protocol controller. Various interconnect technologies are available:
AMBA AXI: a high-performance, synchronous, high-frequency, multi-master, multi-slave parallel communication interface designed mainly for on-chip communication. It has been widely used across the industry on many projects, and the AMBA® AXI™ protocol is typically chosen to reduce time to market and ease integration.
CXL-cache/mem Protocol Interface (CPI): a lightweight, low-latency protocol defined by Intel with a publicly accessible specification that aligns closely with the CXL specification. CPI is designed for CXL.cache and CXL.mem and allows both to be mapped onto the same physical wires.
AMBA CXS: a streaming protocol that enables high-bandwidth transmission of packets between the user application and the protocol controller. Via the CXS interconnect, the designer can bypass the controller's transaction layer, which can reduce latency. The CXS specifications have been designed by Arm to integrate seamlessly with Arm-based System-on-Chip solutions.
Each of these interfaces has its own benefits and use cases.
Below are some implementation examples:
- Option 1 (Figure 4): The designer chooses CPI for the cache and mem channels. This is the most generic option, providing the lowest latency and highest flexibility. It allows designers to implement custom memory and cache management that may be independent of the CPU architecture
- Option 2 (Figure 5): The designer chooses CPI for the cache channel and AMBA AXI for the mem channel. This option allows for custom cache management while configuration and memory management are managed by the CPU subsystem via the NoC. It can be an interesting option for prototyping CXL.mem on an SoC or FPGA with built-in AMBA AXI interconnect
- Option 3 (Figure 6): The designer chooses CXS. This option is specific to Arm-based SoCs and allows seamless connection to the Arm CoreLink Coherent Mesh Network interconnect and Arm CPU subsystems. It allows support for coherent communication via CXL (to the CPU), and CCIX (to the accelerators)
PLDA has designed a highly flexible IP to meet all the needs of CXL implementation scenarios in a complex SoC or FPGA. Flexibility is a fundamental part of the DNA of PLDA, and the company has deep domain expertise in PCIe. So, XpressLINK-SOC naturally fits in the roadmap to support designers who need to implement CXL in a complex design. This parameterized soft IP product supports all the device types and multiple interconnect options:
- The AMBA AXI Protocol Specification for CXL.io traffic
- Either the Intel CXL-cache/mem Protocol Interface (CPI), the AMBA CXS Interface, or the AMBA AXI Protocol Specification for CXL.mem traffic
- Either the Intel CPI or the AMBA CXS Protocol Specification for CXL.cache traffic
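These per-channel choices can be expressed as a simple validity check. The following sketch is hypothetical (the function and table names are invented, not XpressLINK-SOC parameter names); it encodes only the channel-to-interface combinations listed above:

```python
# Illustrative only: which interfaces the list above permits per channel.
ALLOWED_INTERFACES = {
    "io": {"AXI"},            # CXL.io traffic uses AMBA AXI
    "mem": {"CPI", "CXS", "AXI"},   # CXL.mem: CPI, AMBA CXS, or AMBA AXI
    "cache": {"CPI", "CXS"},        # CXL.cache: CPI or AMBA CXS
}

def validate_config(config: dict) -> None:
    """Raise ValueError if a channel maps to an unsupported interface."""
    for channel, interface in config.items():
        if interface not in ALLOWED_INTERFACES.get(channel, set()):
            raise ValueError(
                f"{interface} is not a supported interface for CXL.{channel}"
            )
```

For example, a configuration pairing CPI on the cache channel with AXI on the mem channel (Option 2 above) passes the check, while mapping AXI onto the cache channel does not.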