Xilinx has delivered not only “the biggest FPGA on the planet”, but what it claims is currently the world’s largest integrated circuit: the Virtex UltraScale VU440, with 19 billion transistors fabbed in TSMC 20nm. The list of first customers to receive parts says a lot about the state of SoC design today, and the vital role FPGA-based prototyping and hardware-aware synthesis play.
Fabs love massive FPGAs to prove out sophisticated process nodes. Around the periphery, there is certainly some magic with high-speed SERDES transceivers and other interfaces. The interior is where scale lives. Laying down a sea of equally sized logic cells in an interconnect fabric is ideal territory for fabs to show their process can maintain planar consistency across space and environmental variables.
The VU440 has 4,432,680 of those logic cells, along with 88.6Mb of block RAM, 1404 single-ended HP and 52 single-ended HR I/Os, three 100G Ethernet ports, and 48 GTH 16.3 Gb/s transceivers, among other features. Its package is a marvel unto itself: a 55x55mm flip chip BGA with 2892 pins, with 3D techniques including stacked silicon and 600,000 micro bumps. Getting that all in a part, and getting that part reliably attached to any board, is just plain amazing.
What does one do with something this big? Xilinx cites some top-of-pyramid applications: digital array radar, LTE-A wireless, 8K Ultra HD broadcast video, and 1Tb/sec optical transport networks. Performance is welcome; however, not all of these applications need this many logic cells. Currently, six smaller UltraScale parts with similar features fill those requirements just fine. Single board computing vendors, who used to be the prove-out platform for large FPGAs, are probably going to be content at smaller capacities and lower price points.
Where capacity requirements can still outstrip the FPGA state of the art is SoC design. The VU440’s 4.4M logic cells translate to roughly 50M ASIC gates. That fits a lot of things. In an introductory video, Xilinx shows just such an example: a Xilinx AFX board with a single VU440 holding a cluster of 10 ARM Cortex-A9 cores. It is filled to the brim: 94% of CLBs and 77% of LUTs are used. The ten CPU cores are running at 50 MHz. Although it is a demo system, it illustrates the potential.
Among other improvements, Xilinx spent a lot of time completely revising the clocking scheme in UltraScale, making it more ASIC-like with respect to how slice boundaries are dealt with. They also improved the routing architecture and logic cell packing. Even with these improvements and the massive resources of the VU440, current SoC designs are often far bigger than 50M gates, and RTL has to be partitioned across multiple FPGAs.
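To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses only the two figures quoted above (4,432,680 logic cells and roughly 50M ASIC gates per VU440). The function names and the `usable_fraction` parameter are hypothetical, introduced only for illustration. The fraction is an assumed planning margin, not a Xilinx or Synopsys figure, and the estimate ignores I/O and interconnect limits between FPGAs.

```python
import math

# Figures quoted in this article for a single VU440.
VU440_LOGIC_CELLS = 4_432_680
VU440_ASIC_GATES = 50_000_000  # rough equivalence, not a precise spec


def gates_per_logic_cell() -> float:
    """Implied ASIC gates per logic cell from the two figures above."""
    return VU440_ASIC_GATES / VU440_LOGIC_CELLS


def fpgas_needed(design_gates: int, usable_fraction: float = 0.6) -> int:
    """Estimate how many VU440s a design of `design_gates` would span.

    `usable_fraction` is an assumed share of each FPGA's gate capacity
    that a partition can realistically fill; it is a guess for planning,
    not a vendor number.
    """
    if design_gates <= 0:
        raise ValueError("design_gates must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_gates = VU440_ASIC_GATES * usable_fraction
    return math.ceil(design_gates / usable_gates)


if __name__ == "__main__":
    print(f"{gates_per_logic_cell():.2f} gates per logic cell")
    # A hypothetical 200M-gate SoC at the assumed 60% fill per FPGA.
    print(fpgas_needed(200_000_000), "FPGAs")
```

Capacity alone is only a first-order estimate. Where a design can actually be cut, and how many signals cross each cut, is what the partitioning tools discussed next have to work out.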
Synopsys was among the first to receive VU440 parts, and is working on an even bigger version of the HAPS prototyping platform. As we explored in a recent post on prototyping the Imagination PowerVR Series6XT, having a big enough gate pile to hold logical partitions of a design is the start. Synopsys ProtoCompiler performs hardware-aware synthesis, using constraints defining FPGA and board-level resources such as interconnect and multiplexing. When they get their arms around the UltraScale architecture, and leverage the VU440 capability fully in synthesis, designers will have incredible capability.
Of course, there is still the old school of “manual” tool-assisted partitioning. Some designs are cleanly separable. The capability of the Xilinx Vivado Design Suite for tasks like partitioning continues to improve. Some designers like the control, and the challenge. Another FPGA-based prototyping system vendor, The Dini Group, also has VU440 parts in house. In an homage to just how far we have come in being able to cut large and unruly ASIC designs into manageable pieces, they have proudly dubbed their newest prototyping engine “Godzilla’s Butcher on Steroids.”
For more information on the VU440, and to launch the full video, see the Xilinx press release:
Scalability, as the name implies, is central to the Xilinx UltraScale strategy. The shipment of the VU440 is a stunning accomplishment, one likely to be unmatched for a while. We seem to have arrived at a point where the practice of FPGA-based prototyping is ready for prime time. The costs of committing to silicon without complete hardware and software co-verification, and rapid changes leading to retesting, are too big to risk. The capacity of the VU440 applied to FPGA-based prototyping should bring in more developers.