HLS with ARM and FPGA Technologies Boosts SoC Performance

by Pawan Fangaria on 11-23-2015 at 7:00 am

As SoC size and complexity increase, new ways of development and verification are evolving alongside innovative automated tools and environments for SoC development and optimization. IP-based SoC development has proved to be the most efficient methodology for large SoCs. It requires collaboration among multiple players, including IP developers, SoC vendors, EDA tool providers, foundries, FPGA providers, and others.

The ARM Connected Community has more than 1200 partners, and ARM TechCon is one of the best forums to learn about new innovations in IP development and SoC integration. Although I couldn’t attend the conference, I came across a presentation by Hardent on how to boost performance from ‘C’ software to an extremely high (sky-high) level with hardware acceleration. The methodology uses an SoC with ARM’s ACP (Accelerator Coherency Port) and ACE (AXI Coherency Extensions) interfaces together with Xilinx FPGA technologies.

Hard-IP acceleration targets particular applications and comes from co-processors and accelerators fixed in silicon. Soft-IP acceleration is more generic and scalable; it is achieved through programmable logic customized to the needs of the application. Both hard and soft IP are needed to optimize the SoC.

The example in the presentation is a code annotation in a ‘C’ program that directs the HLS (High-Level Synthesis) tool to synthesize a ‘for’ loop into a pipelined architecture. Pipelining increases throughput and resource utilization, thus enhancing the performance of the function. Similarly, there are various types of memory that trade off throughput against capacity. The memory appropriate to the application is used as the interface between software and hardware.
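To make the idea of a loop annotation concrete, here is a minimal sketch in C. It is not the code from the Hardent presentation; the function name `scale_offset`, the trip count `N_SAMPLES`, and the arithmetic are hypothetical and chosen only for illustration. The `#pragma HLS PIPELINE II=1` directive is the form used by Xilinx Vivado HLS to request a pipelined loop with an initiation interval of one cycle; an ordinary C compiler simply ignores it (possibly with a warning), so the same source still runs as software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_SAMPLES 8 /* hypothetical fixed trip count, for illustration only */

/* Hypothetical kernel: scale each sample and add an offset.
 * A small, regular loop like this is a typical candidate for HLS. */
void scale_offset(const int32_t in[N_SAMPLES], int32_t out[N_SAMPLES],
                  int32_t gain, int32_t offset)
{
    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < N_SAMPLES; i++) {
        /* Ask the HLS tool to pipeline this loop so that a new iteration
         * can start every clock cycle (initiation interval of 1).
         * A regular C compiler ignores this pragma. */
#pragma HLS PIPELINE II=1
        out[i] = in[i] * gain + offset;
    }
}
```

Calling `scale_offset(in, out, 3, 1)` on the host behaves like any other C function, which is what lets the same code be profiled in software before it is moved into programmable logic.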

At the top level, the system is a combination of a processing system, programmable logic, and an interface between them, which is best implemented with ARM’s AMBA (Advanced Microcontroller Bus Architecture) for efficient data movement. An important consideration for high-performance, high-throughput data movement is achieving lower latency with coherent interfaces.

One example is a soft IP that interfaces directly with cache memory through the AMBA cache-coherent interfaces, ACP or ACE. The benefit of using coherent interfaces to access data in cache is limited by the capacity of the cache sub-system. Regular (non-cache-coherent) AXI interfaces will still provide comparable latencies for sufficiently large data sets. Several mechanisms are used to avoid cache misses and also to save power. The function to accelerate must be chosen carefully, so that the performance gain in data processing is not cancelled out by a bottleneck in data movement.
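The trade-off between processing gain and data-movement cost can be illustrated with a back-of-the-envelope model. The sketch below is an assumption made for this article, not a formula from the presentation: the struct `accel_estimate`, its fields, and both helper functions are hypothetical, and the model assumes that data transfer and hardware computation do not overlap.

```c
#include <assert.h>

/* Hypothetical cost model for one call to an accelerated function.
 * All fields are illustrative assumptions, not measured values. */
typedef struct {
    double sw_time_us;             /* time to run the function in software */
    double hw_compute_us;          /* time to run it in programmable logic */
    double bytes_moved;            /* input plus output data per call */
    double bandwidth_bytes_per_us; /* sustained bandwidth of the interface */
    double fixed_latency_us;       /* per-call setup and synchronization cost */
} accel_estimate;

/* Total accelerated time, assuming transfer and compute do not overlap. */
double accel_total_time_us(const accel_estimate *e)
{
    assert(e != 0 && e->bandwidth_bytes_per_us > 0.0);
    return e->fixed_latency_us
         + e->bytes_moved / e->bandwidth_bytes_per_us
         + e->hw_compute_us;
}

/* Speedup over software; a value below 1.0 means acceleration loses. */
double accel_speedup(const accel_estimate *e)
{
    return e->sw_time_us / accel_total_time_us(e);
}
```

With these assumptions, a function whose hardware version computes ten times faster can still end up slower overall if moving its data takes longer than the original software run, which is why the choice of function and interface matters.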

The procedure above is described in the simplest possible terms. In practice it requires a lot of hardware as well as software expertise. The availability of ARM processors and interfaces, HLS tools, and tools for partitioning software and hardware has reduced the development effort to a large extent. Yet there are other pieces, such as drivers and additional hardware to handle data movement, that need to be integrated to complete the SoC.

Xilinx has a new SDSoC development environment that makes it much easier to optimize and deploy programmable SoCs. The SDSoC front-end is an Eclipse-based C/C++ IDE, and the back-end can call upon many hardware design tools.

Hardent recommends this flow using SDSoC to quickly optimize the custom hardware components. The application can run on any ARM Cortex-A/R processor; both bare-metal and Linux applications are supported. The profiling tool integrated into the IDE interfaces with the non-invasive ARM debug components built into an SoC. ARM CoreSight technology provides an excellent debug and trace system. The hardware estimates can be optimized through iteration over micro-architecture and macro-architecture.

This flow provides an environment in which a complete system can be described in C/C++, appropriate functions can be migrated to soft IP using HLS, and the soft IP can then be integrated into the system. The processor and the programmable logic are tightly integrated into the SoC through AMBA interfaces.

Hardent is an active member of VESA (Video Electronics Standards Association) and the MIPI (Mobile Industry Processor Interface) Alliance and provides IP for display. Hardent also provides several training courses for SoC development based on ARM processors, Xilinx Zynq, and HLS. The latest offering is “Embedded C/C++ SDSoC Development Environment and Methodology”.

SDSoC appears to be an excellent, efficient environment for SoC development and optimization. See the video, less than 4 minutes long, on the Xilinx website HERE.

Pawan Kumar Fangaria
Founder & President at www.fangarias.com

