The internet keeps adding users and connected devices. According to the numbers in a white paper from Achronix, by 2022 there will be 4.8 billion internet users and 28.5 billion connected devices. Internet traffic will reach 275 exabytes per month. Of this, a staggering 83 percent will be video traffic. Moving video from creators to consumers, editing it, and preparing it for machine learning applications all require large amounts of video processing. The Achronix white paper, titled “FPGAs for Advanced Video Processing Solutions”, examines each of these tasks and the type of data processing required.
We have all seen the message “Processing your video” when uploading videos to YouTube or Facebook. Converting video from one format to another for use on other platforms and devices is a critical step in sharing content. Originally this work was done on CPUs or sometimes on GPUs. While ASICs might also be an attractive processing solution, they are limited when it comes to the proprietary compression methods that are often used in transcoding.
In the case of video editing and content creation, desktop computers with GPU acceleration have been a mainstay. However, with the dawn of 4K and 8K video, these platforms are underpowered for the task. This work has been moving to the cloud, but using traditional processors in the cloud has its limits.
Lastly, AI applications need images, not video, to perform inference. This entails converting H.264 or H.265 video streams into JPEG or PNG images that can be used by the AI processors. The conversion to an image file may also include changing the image resolution or other processing to help the AI application.
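To make the resolution-conversion step concrete, here is a minimal sketch of downscaling one decoded frame before it is handed to an inference engine. It assumes the frame is already decoded into a list of rows of pixel values and uses nearest-neighbor sampling, which is just one simple choice. The function name `resize_nearest` is hypothetical and does not come from the white paper.

```python
def resize_nearest(frame, out_w, out_h):
    """Resize a decoded frame (list of rows of pixel values) by nearest-neighbor sampling.

    Assumption: the frame has already been decoded from the H.264/H.265 stream;
    decoding and JPEG/PNG encoding are outside the scope of this sketch.
    """
    in_h, in_w = len(frame), len(frame[0])
    return [
        # Map each output pixel back to the source pixel it samples from.
        [frame[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Tiny 4x4 "frame" whose pixel value encodes its position (row * 4 + column).
frame = [[row * 4 + col for col in range(4)] for row in range(4)]
print(resize_nearest(frame, 2, 2))  # [[0, 2], [8, 10]]
```

Every output pixel is computed independently of the others, which is the kind of work that maps naturally onto parallel hardware.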
Achronix makes the case that FPGAs, especially their Speedster7t, are well suited to all of these tasks. Both GPUs and FPGAs offer parallel processing, but FPGAs often come up as the preferred choice because of their power advantage over GPUs.
The Achronix white paper looks at each type of activity to analyze the effectiveness of their Speedster7t FPGA. When streaming and transcoding H.264 video, many of the tasks are easily handled by CPUs. Yet one task in the process, motion estimation, has been profiled at around 21% of the entire processing load on CPUs. This is a task that can be moved to an FPGA for a big improvement in throughput.
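To see why motion estimation is such a heavy load, consider a minimal sketch of exhaustive block matching using the sum of absolute differences (SAD). This is a textbook formulation, not necessarily the method used in the white paper or in any particular encoder. The function names, the block size, and the search range are all assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n=4, search=2):
    """Try every displacement within +/- `search` pixels and keep the cheapest."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic 12x12 frames: the current frame shows the reference content
# shifted by 2 pixels horizontally and 1 pixel vertically.
ref = [[10 * y + x for x in range(12)] for y in range(12)]
cur = [[10 * (y + 1) + (x + 2) for x in range(12)] for y in range(12)]
print(best_motion_vector(cur, ref, bx=4, by=4))  # ((2, 1), 0)
```

Even this toy search evaluates up to 25 candidate positions of 16 pixels each for a single block, and a real frame contains thousands of blocks. Each candidate is independent of the others, so the comparisons can run side by side in hardware.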
Whether you are talking about working with RAW video data or compressing video using intra-frame structure, video editing and content creation have become unwieldy at resolutions such as 4K and 8K. Previously, with HD and 2K video, using CPUs was a feasible approach. The white paper includes benchmark data that supports the notion that CPUs must be supplanted at today’s higher resolutions.
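A quick back-of-the-envelope calculation shows how fast uncompressed data rates grow with resolution. The figures below are not from the white paper. They assume common frame sizes for each resolution class, 12 bits per pixel (8-bit 4:2:0 sampling), and 30 frames per second.

```python
# Assumed frame sizes; the article names the resolution classes, not exact dimensions.
RESOLUTIONS = {
    "HD": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}


def raw_bits_per_second(width, height, bits_per_pixel=12, fps=30):
    """Uncompressed video data rate in bits per second.

    Defaults are assumptions: 8-bit 4:2:0 video (12 bits per pixel) at 30 fps.
    """
    return width * height * bits_per_pixel * fps


for name, (w, h) in RESOLUTIONS.items():
    gbps = raw_bits_per_second(w, h) / 1e9
    print(f"{name}: {gbps:.2f} Gbit/s uncompressed")
```

Under these assumptions the rate rises from roughly 0.75 Gbit/s at HD to about 3 Gbit/s at 4K and about 12 Gbit/s at 8K, a 16-fold increase from HD to 8K for every stream being edited.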
For AI, there is a lot to be gained by combining the video decoder and the image encoder in the same processing unit. Frequently there is also a need for additional image processing as a prerequisite to the inference step. This too can easily be accommodated in an FPGA.
Achronix then moves to a discussion of the specific advantages found in their Speedster7t family. Their 2D Network on Chip (NoC) facilitates high-speed transfers between the external interfaces in the Speedster7t FPGA and blocks in the FPGA fabric. It also provides rapid transfers among functional blocks on-chip. Because it is separate from the FPGA fabric, no FPGA resources are consumed when setting up pathways for data exchange. Likewise, because it uses a high-level protocol, FPGA designers do not need to put together routing and buffering logic. To transfer data, a user or consumer only needs to connect to a Network Access Point.
Speedster7t FPGAs come with a well-thought-out set of interfaces. The Speedster7t AC7t1500, for instance, offers fracturable Ethernet controllers (supporting rates up to 400G), PCIe Gen5 ports and up to 32 SerDes channels with speeds up to 112 Gbps. It also has multi-channel GDDR6 memory interfaces. With the NoC running at a much higher speed than the clocks usually associated with FPGA fabrics, it can transport over 20 Tbps in aggregate. The combination of the NoC and the high-speed interfaces means it is in a class by itself when it comes to meeting the needs of video processing.
The paper finishes with a discussion of the Machine Learning Processors (MLP) that Achronix has developed for use in the Speedster7t family. It is interesting reading about how the MLPs are optimized with local block RAM and math units that handle MAC operations needed for AI. Achronix has consistently been adding features for a wide range of complex applications to their FPGAs. Their white papers, such as this one, frequently make compelling cases for the use of their technology in system design. The full white paper on video processing is available on their website.