We have learned from nature that two characteristics help with success: diversity and adaptability. The same has been shown to be true for computing systems. Things have come a long way from the days when CPU-centric computing was the only choice. Much of the heavy lifting these days is done by GPUs, ASICs, and FPGAs, with CPUs in a support and coordination role. This is happening in networking, big data, machine learning, and other applications. Naturally, edge and cloud data center operators now have numerous choices about which hardware to use to fill their racks. When buying hardware, they must anticipate the kinds of workloads that will be handled and even where they will run. A wrong choice can mean wasted money and resources. What’s needed is processing that is adaptable, that scales, and that can meet diverse and changing workloads.
An emerging trend to address these new workloads is the use of distributed accelerator cards. In many cases they fit the power dissipation requirements of their target data centers. They offer scalability to meet rapidly growing demand. They can also incorporate high-bandwidth connectivity so that throughput is not limited. Accelerator cards can be fitted with a wide variety of computing engines. However, FPGAs appear to have many desirable characteristics that make them the more appealing choice.
FPGAs are extremely adaptable because they can be reprogrammed as workloads change. They also offer extremely high parallelism, which the tasks they are applied to often require. So it might seem that the problem is solved and that’s all there is to it. However, the specific architecture of the FPGA and the details of the accelerator card it is placed on make a big difference.
In a recent white paper, Achronix makes the case that several aspects of the FPGA and accelerator card architecture determine how well an accelerator card performs in demanding applications. The paper points to the features of the VectorPath S7t-VG6 accelerator card, recently released by Achronix and BittWare, which uses the Achronix Speedster 7t FPGA. BittWare, a Molex company, has a long history of producing FPGA-based accelerator cards. This particular card comes well equipped, with 8 GB of GDDR6 memory that can operate at 4 Tbps. It also has 4 GB of DDR4. It offers 400GbE and 200GbE connectivity, as well as PCIe Gen3 x16 that can be upgraded to Gen4 and Gen5.
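To put the memory figures in more familiar units, here is a small back-of-the-envelope sketch. It assumes the 4 Tbps figure is an aggregate peak rate in decimal units (1 Tb = 1000 Gb) and ignores protocol overhead, so real numbers would be lower. The function names are made up for this illustration.

```python
# Figures quoted in the article for the VectorPath S7t-VG6 card.
GDDR6_BANDWIDTH_TBPS = 4.0   # aggregate GDDR6 bandwidth, terabits per second
GDDR6_CAPACITY_GB = 8.0      # GDDR6 capacity, gigabytes


def tbps_to_gbytes_per_s(tbps: float) -> float:
    """Convert terabits/s to gigabytes/s (decimal units, 8 bits per byte)."""
    return tbps * 1000.0 / 8.0


def full_sweep_ms(capacity_gb: float, bandwidth_gbytes_per_s: float) -> float:
    """Idealized time, in milliseconds, to read the whole memory once at peak rate."""
    return capacity_gb / bandwidth_gbytes_per_s * 1000.0


bandwidth = tbps_to_gbytes_per_s(GDDR6_BANDWIDTH_TBPS)
sweep = full_sweep_ms(GDDR6_CAPACITY_GB, bandwidth)
print(f"Peak GDDR6 bandwidth: {bandwidth:.0f} GB/s")
print(f"Idealized full-memory read: {sweep:.1f} ms")
```

Under those assumptions, 4 Tbps works out to roughly 500 GB/s, so the entire 8 GB of GDDR6 could in principle be streamed in about 16 ms.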
However, there are some really interesting features in the Speedster 7t that give this accelerator card a significant edge. It contains machine learning processors (MLPs) that are optimized for machine learning applications. Each MLP can perform up to 32 multiply/accumulate operations per cycle. Another interesting addition is a 2D Network on Chip (NoC) that supports data movement at 2 GHz between the external IO interfaces, FPGA fabric, external GDDR6 memory interfaces, and MLPs. A big advantage of the NoC is that it eliminates the need to use precious FPGA gates to manage data flow to and from high-speed interfaces. The NoC handles this instead, freeing up more of the FPGA core for application-related uses. There are also direct clock and GPIO interfaces, as well as OCuLink, which make it possible to combine accelerator cards or interface with legacy equipment.
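The 32 multiply/accumulate figure only becomes a throughput number once it is combined with an MLP count and a clock rate, neither of which is given here. The sketch below shows the arithmetic; the values passed for `num_mlps` and `clock_mhz` are hypothetical placeholders chosen for round numbers, not Speedster 7t specifications, and `peak_tmacs` is a name invented for this example.

```python
MACS_PER_MLP_PER_CYCLE = 32  # upper bound quoted in the article


def peak_tmacs(num_mlps: int, clock_mhz: float,
               macs_per_cycle: int = MACS_PER_MLP_PER_CYCLE) -> float:
    """Theoretical peak in tera-MACs per second, assuming every MLP
    issues its maximum number of MACs on every clock cycle."""
    macs_per_second = num_mlps * macs_per_cycle * clock_mhz * 1e6
    return macs_per_second / 1e12


# Hypothetical inputs for illustration only; substitute real device figures.
example = peak_tmacs(num_mlps=1000, clock_mhz=500)
print(f"Theoretical peak: {example:.1f} TMAC/s")
```

A figure computed this way is a ceiling. Sustained throughput depends on keeping the MLPs fed with data, which is where the NoC and the GDDR6 bandwidth come into play.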
For customers who want a turnkey solution, BittWare even offers ready-to-go Dell or HPE servers fitted with up to 16 VectorPath PCIe cards, along with development software that lets customers rapidly deploy this new technology. The white paper also hints strongly at a forthcoming IP offering of the Speedster 7t FPGA fabric, which would allow customers to build their own ASIC-based accelerators.
The Achronix white paper makes interesting reading. It includes a summary of the accelerator board market and its future growth potential. It also dives into the specifics of the BittWare offering and the details of the Achronix Speedster 7t FPGA. I suggest going to the Achronix website to download the document.