What is Compute Express Link?
CXL is a cache-coherent open interconnect standard for high-speed CPU connection to memory and other devices. Compute Express Link leverages the standard PCIe 5.0 physical layer and runs as a supported alternate protocol. By creating a common memory space for connected devices, the CXL standard brings performance advantages for hyperscalers and other advanced applications.
- Compute Express Link uses a flexible processor port that can operate in either PCIe 5.0 or CXL mode. Both modes run at 32 GT/s per lane, or up to 64 GB/s in each direction over 16 lanes.
- The CXL Consortium was founded in 2019 by nine industry-leading organizations to develop technical specifications, support emerging use case models, and advance CXL technology development and adoption.
- Artificial intelligence (AI), machine learning, and cloud infrastructure are among the applications that benefit most from the extremely low latency and coherent memory access provided by the CXL interface.
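The 64 GB/s figure can be checked with a little arithmetic. The sketch below is illustrative only: the function name `link_bandwidth_gbytes` is hypothetical, and the 128b/130b line encoding it applies is the scheme used by the PCIe 5.0 physical layer rather than something stated in this article.

```python
def link_bandwidth_gbytes(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Approximate one-direction bandwidth of a serial link in GB/s.

    gt_per_s : transfer rate per lane in GT/s (one bit per transfer per lane)
    lanes    : number of lanes in the link
    encoding : fraction of transferred bits left after line-encoding overhead
    """
    bits_per_second = gt_per_s * 1e9 * lanes * encoding
    return bits_per_second / 8 / 1e9  # bits -> bytes, then bytes -> GB


# Raw rate for a x16 link at 32 GT/s, ignoring encoding overhead.
raw = link_bandwidth_gbytes(32, 16, encoding=1.0)

# The same link with 128b/130b encoding applied (slightly below the raw rate).
effective = link_bandwidth_gbytes(32, 16)

print(f"raw: {raw:.1f} GB/s, after encoding: {effective:.2f} GB/s")
```

The raw value works out to 64 GB/s per direction, which is why the figure is usually quoted as "up to" 64 GB/s. Protocol overhead above the physical layer reduces usable throughput further and is not modeled here.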
How Does CXL Work?
The Compute Express Link framework establishes coherency between the memory of the CPU and the memory of each connected device. This allows memory resources to be pooled and shared efficiently while reducing software stack complexity. To enable memory pooling, both the host and peripheral device(s) must be CXL-enabled. Data transfer is completed using 528-bit flow control units or “flits”.
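To make the 528-bit flit size concrete, the following sketch splits a flit into its parts. It assumes the commonly described layout of four 16-byte slots followed by a 2-byte CRC; that layout, the constant names, and the `split_flit` helper are assumptions for illustration, not details taken from this article.

```python
# Assumed flit layout: 4 slots x 16 bytes + 2-byte CRC = 66 bytes = 528 bits.
SLOT_BYTES = 16
SLOTS_PER_FLIT = 4
CRC_BYTES = 2

FLIT_BYTES = SLOT_BYTES * SLOTS_PER_FLIT + CRC_BYTES
FLIT_BITS = FLIT_BYTES * 8


def split_flit(flit: bytes) -> tuple[list[bytes], bytes]:
    """Split a raw flit into its slots and trailing CRC (hypothetical helper)."""
    if len(flit) != FLIT_BYTES:
        raise ValueError(f"expected {FLIT_BYTES} bytes, got {len(flit)}")
    slots = [
        flit[i * SLOT_BYTES:(i + 1) * SLOT_BYTES]
        for i in range(SLOTS_PER_FLIT)
    ]
    crc = flit[SLOTS_PER_FLIT * SLOT_BYTES:]
    return slots, crc


# Example with dummy data: bytes 0..65 stand in for a captured flit.
slots, crc = split_flit(bytes(range(FLIT_BYTES)))
print(FLIT_BITS, [len(s) for s in slots], len(crc))
```

A fixed-size unit like this keeps link-level framing simple, since the receiver always knows where each slot and the CRC begin.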
- Single level switching allows the host to fan out to multiple devices while maintaining high throughput in each direction. Resources including accelerators and any available CXL storage can be dynamically re-assigned as the server workload changes.
- The CXL 2.0 specification also includes a standardized fabric manager. This ensures a seamless user experience with consistent configurations and error reporting regardless of the pooling type, host, or usage model.
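The idea of dynamically re-assigning pooled resources can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the `MemoryPool` class, its methods, and the host names are all hypothetical, and a real fabric manager works through the interfaces defined in the CXL specification rather than anything resembling this code.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryPool:
    """Toy model of pooled memory capacity, tracked in GiB (hypothetical)."""
    capacity_gib: int
    assigned: dict[str, int] = field(default_factory=dict)

    @property
    def free_gib(self) -> int:
        return self.capacity_gib - sum(self.assigned.values())

    def assign(self, host: str, gib: int) -> None:
        """Give `gib` of free capacity to `host`."""
        if gib <= 0:
            raise ValueError("gib must be positive")
        if gib > self.free_gib:
            raise ValueError(f"only {self.free_gib} GiB free")
        self.assigned[host] = self.assigned.get(host, 0) + gib

    def release(self, host: str, gib: int) -> None:
        """Return `gib` of capacity held by `host` to the pool."""
        held = self.assigned.get(host, 0)
        if gib <= 0 or gib > held:
            raise ValueError(f"{host} holds {held} GiB")
        if held == gib:
            del self.assigned[host]
        else:
            self.assigned[host] = held - gib


pool = MemoryPool(capacity_gib=1024)
pool.assign("host-a", 512)
pool.assign("host-b", 256)

# The workload shifts: host-a gives capacity back and host-b picks it up.
pool.release("host-a", 256)
pool.assign("host-b", 512)
print(pool.assigned, pool.free_gib)
```

The point of the model is that capacity moves between hosts without being physically re-provisioned, which is what reduces the need to overprovision each server.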
Common Compute Express Link Use Cases
As the interface has evolved, unique use cases and applications have led the CXL Consortium to define three discrete device types.
- Type 1 Devices: Accelerators and other devices that lack local memory and therefore must rely on CPU memory are classified as Type 1. The CXL.io and CXL.cache protocols enable these devices to communicate with the host processor and access its memory more efficiently.
- Type 2 Devices: Products that include their own data storage capabilities but also leverage CPU memory are known as Type 2. All three CXL protocols combine to promote coherent memory sharing between these devices and the CPU.
- Type 3 Devices: Memory expanders or devices designed to augment existing CPU memory are classified as Type 3. The CXL.io and CXL.memory protocols enable the CPU to access these external sources with improved bandwidth and latency performance.
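The relationship between device types and protocols described above can be written down as a small lookup table. The names `Protocol`, `DEVICE_PROTOCOLS`, and `device_type_for` are hypothetical and exist only for this illustration.

```python
from enum import Enum
from typing import Iterable, Optional


class Protocol(Enum):
    IO = "CXL.io"
    CACHE = "CXL.cache"
    MEMORY = "CXL.memory"


# Protocols used by each device type, as listed in the text above.
DEVICE_PROTOCOLS = {
    1: frozenset({Protocol.IO, Protocol.CACHE}),
    2: frozenset({Protocol.IO, Protocol.CACHE, Protocol.MEMORY}),
    3: frozenset({Protocol.IO, Protocol.MEMORY}),
}


def device_type_for(protocols: Iterable[Protocol]) -> Optional[int]:
    """Return the device type whose protocol set matches exactly, else None."""
    wanted = frozenset(protocols)
    for device_type, required in DEVICE_PROTOCOLS.items():
        if required == wanted:
            return device_type
    return None


# A memory expander speaks CXL.io and CXL.memory, so it maps to Type 3.
print(device_type_for([Protocol.IO, Protocol.MEMORY]))
```

Note that CXL.io appears in every set, while CXL.cache and CXL.memory are what distinguish the three types.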
Compute Express Link Benefits
By streamlining connectivity and resource sharing, CXL technologies provide numerous enhancements that improve high-capacity workload performance while reducing system complexity and cost. These attributes become increasingly valuable as next-generation data centers and emerging technologies drive demand for faster data processing and lower total cost of ownership (TCO).
- Coherency enables CXL memory pools to remain consistent with respect to data validity. This allows for faster and more efficient resource sharing between devices and processors.
- Heterogeneous architecture, combining processors of varying types and generations, is fully accommodated by the CXL standard. This is especially useful for complex AI neural networks and machine learning systems as elements of the infrastructure evolve.
- Lower latency is the result of strategically pooled persistent memory, improved CXL switching efficiency, and standardized memory management. Reduced latency is considered a key enabler of next generation use cases and future PCIe 6.0 adoption.
CXL Protocols and Standards
The release of the CXL 1.0 standard in 2019 was a significant milestone marked by CPU access to shared accelerator device memory. Compute Express Link protocols and standards have continued to improve and expand since this successful debut.
- CXL 1.1 improved compliance and interoperability aspects of the original standard while maintaining backwards compatibility with release 1.0.
- CXL 2.0 added switching capabilities for fan-out configurations, resource pooling, and persistent memory support while minimizing the need to overprovision resources. Link-level Integrity and Data Encryption (CXL IDE) were also incorporated to improve security.
- Sub-protocols developed for CXL specification 1.0 have remained consistent throughout the Compute Express Link lifecycle:
  - CXL.io is based on the PCIe 5.0 protocol and is used for discovery, configuration, and register access functions. CXL.io must be supported by all Compute Express Link devices in order to function.
  - CXL.cache manages interactions between the CPU (host device) and other Compute Express Link enabled devices. This sub-protocol supports the efficient, low-latency caching of host memory and direct device access to CPU memory using a request and response process.
  - CXL.memory provides modes of access for the host to provision attached device memory using load and store commands. In this configuration, the CPU acts as a master with the Compute Express Link device(s) acting as subordinates.
Impact of CXL on Storage
The heterogeneous, open computing model defined by the CXL protocol specification creates a more flexible storage landscape with efficient data movement contributing to lower latency and cost. Beyond the implicit value of pooled coherent memory, the reduction of proprietary memory interconnects complements the diversity of emerging technology devices.
This shift in storage dynamics will also increase the size of cached memory pools, helping to meet the temporary storage needs of hyperscale data centers and other large computing enterprises. The additional capacity created through CXL memory pooling can be called upon as needed for high-volume workloads. This storage paradigm shift is consistent with the overall trend towards data center disaggregation and open architecture.
CXL and PCIe
PCI Express (PCIe) has become the de facto high-speed serial bus architecture over the past two decades, with point-to-point topology providing high-speed links to connected devices. Despite the proficiency of PCIe for bulk data transfer, shortcomings become obvious in larger data center applications. Memory pools remain isolated from one another, which makes significant resource sharing nearly impossible and adds to the latency deficit for new connected devices.
- PCIe 5.0 is the latest backwards compatible generation of the Peripheral Component Interconnect Express standard. Released in 2019, the 5th generation included a requisite doubling of throughput along with support for alternate protocol deployment that is now being utilized by the CXL interface.
- Operating over the PCIe 5.0 physical layer, CXL protocols build upon the versatility of standard PCIe architecture by integrating new memory sharing functionality within the transaction layer.
- Standard PCIe devices and CXL software can be supported over the same link. A flexible processor port can quickly negotiate either standard PCIe or alternate protocol CXL interconnect transactions.
- PCIe 6.0 will continue the convention of doubling throughput with each generation. This makes the melding of device and system memory an essential consideration for latency reduction and accelerator performance.
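The outcome of the flexible port negotiation described in this list can be sketched as a simple decision. This is a toy model only: `PortCaps` and `negotiate_mode` are hypothetical names, and the real negotiation takes place in hardware while the link is being brought up rather than through any API like this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortCaps:
    """Capabilities advertised by one end of the link (hypothetical)."""
    supports_pcie: bool = True
    supports_cxl: bool = False


def negotiate_mode(host: PortCaps, device: PortCaps) -> str:
    """Pick the link mode both ends can run.

    CXL is chosen only when both the host and the device are CXL-enabled;
    otherwise the link falls back to standard PCIe.
    """
    if host.supports_cxl and device.supports_cxl:
        return "CXL"
    if host.supports_pcie and device.supports_pcie:
        return "PCIe"
    raise ValueError("no common protocol")


cxl_host = PortCaps(supports_cxl=True)
print(negotiate_mode(cxl_host, PortCaps(supports_cxl=True)))  # CXL-capable device
print(negotiate_mode(cxl_host, PortCaps()))                   # plain PCIe device
```

Because the fallback is standard PCIe, a CXL-capable slot can still host an ordinary PCIe card, which is the practical benefit of sharing the same physical layer.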
CXL vs CCIX
Rather than acting as a direct rival technology to CXL, Cache Coherent Interconnect for Accelerators (CCIX®) takes an alternate approach to memory pooling.
- The CCIX Consortium: The stated mission of the CCIX Consortium is to develop and promote adoption of an industry standard specification to enable coherent interconnect technologies between general-purpose processors and acceleration devices for efficient heterogeneous computing.
- CCIX Specification: The open-source specification defines a peer-to-peer, symmetrical mode of memory coherence and has been developed in parallel (chronologically) with the CXL specification. Storage capacity from multiple devices is pooled together using a non-uniform memory access (NUMA) architecture. Both the host device and accelerator(s) must support CCIX operation.
- Alternate Protocols: CCIX and Compute Express Link each leverage the PCIe 5.0 physical layer and its support for alternate protocols. Using Flex Bus connectivity, CXL technology defaults to the CPU for cache consistency, while CCIX does not establish a device hierarchy. This distinction has led to measurable speed and latency advantages for the Compute Express Link architecture.
VIAVI CXL Products
VIAVI Xgig Analyzer solutions for PCIe 5.0 support CXL.cache/memory transactions and triggers. The Xgig captures valuable real time metrics and performs detailed analytics over multiple traces simultaneously.
VIAVI PCIe 5.0 interposers like the Xgig PCIe 5.0 16-lane CEM Interposer can be used to capture CXL traffic running on a PCIe physical layer. The interposer creates a bi-directional interface between the protocol analyzer and system under test.
VIAVI and The CXL Consortium
The CXL Consortium is an open industry standard group formed to develop technical specifications that facilitate breakthrough performance for emerging usage models while supporting an open ecosystem for data center accelerators and other high-speed enhancements.
The Consortium was formally incorporated by the nine founding organizations in 2019 and now includes approximately 170 member companies. This includes 15 leading computer industry organizations that comprise the board of directors.
VIAVI is a proud member of the CXL Consortium, contributing decades of high-speed serial bus test and validation expertise to new working groups and specification development as the unique test requirements evolve. VIAVI initiated essential validation workflows for Compute Express Link decodes in early 2020.
The History of Compute Express Link
CXL was first introduced to the world in March of 2019 with the release of the CXL 1.0 specification. Encompassing only minor updates, CXL 1.1 was released just three months later. The significant upgrade represented by version 2.0 in November of 2020 included link switching, link security, and other important features that directly addressed system performance and fast-tracked adoption. Hot-plug flows were also defined to add or remove resources reliably for applications like CXL over Ethernet.
- The first Compute Express Link capable products made commercially available were the Intel® family of Agilex™ FPGAs, in April of 2019.
- Additional CXL product introductions have included memory expansion modules by Samsung and processors from many leading manufacturers.
- Multiple Compute Express Link devices are currently in development, underscoring the need for reliable CXL protocol analysis and validation solutions.
- CXL 2.0 is available today. CXL 3.0, which will be adapted to the PCIe 6.0 physical layer, is expected to be released in 2022.