Today Xilinx announced SDAccel, an initiative for the data-center. This is the second in a series of software-defined development initiatives for various markets, the first being SDNet, which is targeted at building networking applications. One challenge that a company like Xilinx faces is that as the scale of designs moves up to entire systems, they are increasingly dealing with software engineers who know little about hardware. They need a design methodology that is targeted at programmers who don’t even know what RTL is and have no intention of finding out.
SDAccel is aimed at building co-processors to accelerate certain functions in the data-center such as encryption, search, speech recognition, image recognition and so on. The basic architecture has a board of UltraScale FPGAs communicating with the server CPU cores via PCIe, and with access to memory too. The reason for doing this is that you can get a 20-25X improvement in performance per watt compared to using the CPU or GPU directly, along with a 50-75X reduction in latency versus a pure software solution. Since power is the limiting factor in many data-centers, this is significant.
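To make the performance-per-watt figure concrete, here is a back-of-the-envelope sketch. The baseline efficiency and the power budget are invented purely for illustration; only the 20-25X range comes from the announcement.

```python
# Illustrative arithmetic only: the baseline numbers below are hypothetical,
# and the 20x-25x range is the figure quoted in the announcement.

def throughput_at_power(ops_per_sec_per_watt: float, watts: float) -> float:
    """Throughput achievable within a fixed power budget."""
    return ops_per_sec_per_watt * watts


def power_for_throughput(ops_per_sec: float, ops_per_sec_per_watt: float) -> float:
    """Power needed to sustain a given throughput."""
    return ops_per_sec / ops_per_sec_per_watt


BASELINE_EFFICIENCY = 1_000.0  # hypothetical CPU ops/s per watt
POWER_BUDGET_W = 100.0         # hypothetical per-server power budget

baseline_ops = throughput_at_power(BASELINE_EFFICIENCY, POWER_BUDGET_W)

for gain in (20, 25):
    fpga_efficiency = BASELINE_EFFICIENCY * gain
    same_power_ops = throughput_at_power(fpga_efficiency, POWER_BUDGET_W)
    same_work_watts = power_for_throughput(baseline_ops, fpga_efficiency)
    print(f"{gain}x: {same_power_ops:,.0f} ops/s at {POWER_BUDGET_W:.0f} W, "
          f"or {same_work_watts:.1f} W for the baseline workload")
```

Read either way, the gain is the same: more work inside a fixed power envelope, or the same work for a fraction of the power.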
Another feature is required to make this workable, because the workload is not fixed and neither are the algorithms. The FPGAs are big enough to contain multiple accelerators, so run-time reconfigurability is required: some of the accelerators can be replaced on the FPGA while other accelerators are in use, and without taking down the always-on interfaces such as Ethernet or PCIe. This is totally different from the normal use of an FPGA, which is typically configured just once at system boot.
SDAccel consists of a software development environment for C, C++ and OpenCL. It has x86 emulation, hardware models and so on. Under the hood it uses Xilinx’s high-level synthesis (HLS) and the Vivado place and route engine, but that is not talked about, since software engineers don’t know anything about it and are scared off by being forced to learn too much about hardware. The idea is to give engineers the same experience on FPGAs as they are used to in CPU/GPU environments.
Xilinx has also worked with partners to create boards that just plug into the server PCIe slots, since it is not reasonable to expect data-center owners to design their own boards. So this is a complete solution. You plug the boards into the servers, use the software development environment to develop and analyze the accelerators, and then during operation they are dynamically loaded into the FPGAs on demand, rather like paging in virtual memory.
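The paging analogy can be sketched with a toy model. This is not the SDAccel runtime: the class, its methods and the least-recently-used replacement policy are all assumptions made for illustration, since the announcement does not say how accelerators are chosen for replacement.

```python
from collections import OrderedDict


class AcceleratorSlots:
    """Toy model of an FPGA with a fixed number of reconfigurable regions.

    Every name here is hypothetical. Least-recently-used eviction is an
    assumed policy, chosen only because it mirrors virtual-memory paging.
    """

    def __init__(self, num_slots: int):
        if num_slots < 1:
            raise ValueError("need at least one reconfigurable region")
        self.num_slots = num_slots
        # Maps accelerator name to use count, least recently used first.
        self._resident = OrderedDict()
        self.loads = 0  # number of (simulated) reconfigurations

    def run(self, accelerator: str) -> str:
        """Make sure `accelerator` is resident, then report what happened."""
        if accelerator in self._resident:
            # Already on the FPGA: just mark it as most recently used.
            self._resident.move_to_end(accelerator)
            self._resident[accelerator] += 1
            return "hit"
        evicted = None
        if len(self._resident) >= self.num_slots:
            # No free region: replace the least recently used accelerator.
            # The other resident accelerators are left untouched.
            evicted, _ = self._resident.popitem(last=False)
        self._resident[accelerator] = 1
        self.loads += 1
        return f"loaded (evicted {evicted})" if evicted else "loaded"

    def resident(self) -> list:
        """Names of resident accelerators, least recently used first."""
        return list(self._resident)


fpga = AcceleratorSlots(num_slots=2)
for job in ["encryption", "search", "encryption", "speech", "search"]:
    print(job, "->", fpga.run(job))
```

In this model, a request for a resident accelerator costs nothing extra, while a miss triggers a reconfiguration of one region only, which is the property that makes the approach workable when workloads shift.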
The results are almost as good as hand-coded RTL, which is obviously the golden benchmark. They are also 3-5X better than competitive FPGA OpenCL solutions (and we all know who “competitive FPGA” means).
So the summary is that it is a solution that is as easy to program as a CPU or GPU but with much, much better performance per watt. SDAccel is available now.
More information is available here.
More articles by Paul McLellan…