DSP Benchmarks and Libraries for ARC DSP families

by Eric Esteve on 12-13-2017 at 12:00 pm

The Synopsys DesignWare ARC HS4xD family is a good example of a high-performance RISC CPU IP core enhanced with DSP, able to address high-end IoT, mid- to high-end audio, or baseband control. The ARC HS4xD architecture uses a 10-stage pipeline for high Fmax and delivers excellent RISC efficiency at 5.2 CoreMark per MHz. ARC EMxD processors offer the lowest-energy DSP solution, addressing sensor processing or data fusion, voice and speech, audio, or low-bit-rate communication. Their shallow pipeline directly benefits area and power, both of which are optimized, while keeping very good RISC efficiency at 4.02 CoreMark per MHz. We will review the DSP capabilities of the Synopsys DesignWare ARC DSP-oriented RISC processors: EM5D & EM7D, EM9D & EM11D, and HS45D & HS47D.

[Figure: table of the complete set of DSP features for each ARC DSP core family]

The table above lists the complete DSP features of each core family. These cores are fixed-point by default, but all offer floating-point operation as an option. ARC HS45D/47D has more muscle, with 4×16 SIMD operations (instead of 2×16 for EMxD) and 64-bit load/store (instead of 32-bit for EMxD).
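To make the 2×16 versus 4×16 distinction concrete, here is a minimal host-side sketch in Python that models packed 16-bit lanes inside a 32-bit or 64-bit word. It is not ARC code: the helper names (`pack_lanes`, `unpack_lanes`, `simd_add`) are hypothetical, and the wrap-around behavior on overflow is an assumption, since the article does not say whether the real instructions wrap or saturate.

```python
def pack_lanes(values, lane_bits=16):
    """Pack signed lane values into one integer word (lane 0 in the low bits)."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack_lanes(word, lanes, lane_bits=16):
    """Split a word back into signed lane values."""
    mask = (1 << lane_bits) - 1
    sign = 1 << (lane_bits - 1)
    out = []
    for i in range(lanes):
        raw = (word >> (i * lane_bits)) & mask
        out.append(raw - (1 << lane_bits) if raw & sign else raw)
    return out


def simd_add(a, b, lanes, lane_bits=16):
    """Lane-wise add; carries never cross a lane boundary.

    Assumption: each lane wraps on overflow (modelled by masking in pack_lanes).
    """
    xs = unpack_lanes(a, lanes, lane_bits)
    ys = unpack_lanes(b, lanes, lane_bits)
    return pack_lanes([x + y for x, y in zip(xs, ys)], lane_bits)


# 2x16: two 16-bit lanes fit in a 32-bit word.
a = pack_lanes([1, -2])
b = pack_lanes([3, 5])
print(unpack_lanes(simd_add(a, b, lanes=2), lanes=2))  # [4, 3]

# 4x16: four 16-bit lanes need a 64-bit word, hence the 64-bit load/store.
x = pack_lanes([1000, -1000, 32767, -32768])
y = pack_lanes([1, 1, 1, -1])
print(unpack_lanes(simd_add(x, y, lanes=4), lanes=4))
```

The point of the model is only the data layout: one operation processes two samples per 32-bit word or four samples per 64-bit word.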

ARC HS45D/47D cores support dual-issue, which increases utilization of the functional units with a limited amount of extra hardware. What is dual-issue? It is the capability to issue up to two instructions per clock, with in-order execution and the same software view as single-issue. Dual-issue increases both RISC and DSP performance at a very reasonable area and power penalty, an increase of only 15%. The instruction set has been improved to raise the number of instructions per clock, allowing multiple instructions to execute in parallel and take advantage of the dual-issue pipeline.
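The idea of in-order dual-issue can be illustrated with a toy cycle counter. This is a sketch under strong assumptions, not a model of the HS4xD pipeline: the function `count_cycles`, the register names, and the pairing rule (a second instruction shares a cycle only if it neither reads nor writes a register written by an instruction issued in that same cycle) are all hypothetical, and latencies and functional-unit constraints are ignored.

```python
def count_cycles(program, issue_width):
    """Count issue cycles for a toy in-order machine.

    program: list of (dest, sources) tuples in program order.
    Instructions are never reordered; at most `issue_width` issue per cycle.
    """
    cycles = 0
    i = 0
    while i < len(program):
        cycles += 1
        written = set()  # registers written by instructions issued this cycle
        slots = 0
        while i < len(program) and slots < issue_width:
            dest, sources = program[i]
            if slots > 0 and (dest in written or written & set(sources)):
                break  # depends on an instruction issued this cycle: wait
            written.add(dest)
            slots += 1
            i += 1
    return cycles


# The same instruction stream is used for both machines ("same software view").
program = [
    ("r0", ["r8"]),        # independent
    ("r1", ["r9"]),        # independent of r0, can pair with it
    ("r2", ["r0", "r1"]),  # needs r0 and r1
    ("r3", ["r10"]),       # independent of r2, can pair with it
    ("r4", ["r2", "r3"]),  # needs r2 and r3
]

print(count_cycles(program, issue_width=1))  # 5
print(count_cycles(program, issue_width=2))  # 3
```

In this toy stream the dual-issue machine needs 3 issue cycles instead of 5; how much a real program gains depends on how many adjacent instructions are independent, which is why compiler scheduling matters.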

The Instruction Set Architecture (ISA) is ARCv2DSP, designed for high DSP core efficiency and small code size. Synopsys shows benchmarks (cycle count ratios) for several classes of DSP functions (vector functions, complex math, scalar math, matrix functions, IIR filters, FIR filters, interpolation, and transforms), comparing ARC EM9D, ARC EM5D, and Processor Y. The core efficiency of ARC EM9D is always better than that of ARC EM5D, and better than the competition, with a cycle count ratio that is 40% lower on average.
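For readers unfamiliar with the metric, the sketch below shows one plausible way a cycle count ratio and its average could be computed. The article does not say which core is the baseline or how the average is taken, so the choice of reference core and the arithmetic mean are assumptions. The core names and cycle counts are made-up placeholders chosen for easy arithmetic; they are not measurements of any processor.

```python
def cycle_count_ratios(cycles, reference):
    """Per-function ratio of each core's cycle count to the reference core's."""
    ref = cycles[reference]
    return {
        core: {fn: counts[fn] / ref[fn] for fn in ref}
        for core, counts in cycles.items()
    }


def mean_ratio(ratios):
    """Arithmetic mean of the per-function ratios (assumed averaging method)."""
    return sum(ratios.values()) / len(ratios)


# Placeholder cycle counts for illustration only; NOT measured data.
cycles = {
    "core_ref": {"fir": 1000, "iir": 2000, "fft": 4000},
    "core_new": {"fir": 600, "iir": 1000, "fft": 2800},
}

ratios = cycle_count_ratios(cycles, "core_ref")
avg = mean_ratio(ratios["core_new"])
print(f"mean ratio {avg:.2f}, i.e. {100 * (1 - avg):.0f}% fewer cycles on average")
```

A ratio below 1.0 means fewer cycles than the reference for the same function, so a mean ratio of about 0.6 is what "40% lower on average" would look like under these assumptions.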

ARCv2DSP includes 4×8, 2×16, 4×16, and 2×32 SIMD, 16+16 complex MPY/MAC, FFT butterfly instructions, and ITU support for voice. Because HS4xD implements the same instructions as EMxD, the ARC portfolio is consistent and software is easy to port. HS4xD also implements 64-bit source operands, which allows 4×16 and 2×32 SIMD instructions for improved DSP efficiency.
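A 16+16 complex multiply operates on values with a 16-bit real part and a 16-bit imaginary part. The Python sketch below models that arithmetic on Q15 integers. The function names are hypothetical, and the truncating shift and the saturation to 16 bits are assumptions; the article does not describe the rounding or overflow behavior of the actual instructions.

```python
def sat16(x):
    """Saturate an integer to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def cmul_q15(a, b):
    """Multiply two complex Q15 numbers given as (re, im) integer pairs.

    Each 16x16 product is Q30; shifting right by 15 returns to Q15.
    Assumption: truncation (no rounding) followed by saturation.
    """
    ar, ai = a
    br, bi = b
    re = (ar * br - ai * bi) >> 15
    im = (ar * bi + ai * br) >> 15
    return sat16(re), sat16(im)


def cmac_q15(acc, a, b):
    """Complex multiply-accumulate into a wide Q30 accumulator (re, im)."""
    ar, ai = a
    br, bi = b
    return acc[0] + ar * br - ai * bi, acc[1] + ar * bi + ai * br


half = 16384  # 0.5 in Q15
# (0.5 + 0.5j) * (0.5 - 0.5j) = 0.5 + 0j
print(cmul_q15((half, half), (half, -half)))  # (16384, 0)
# (-1) * (-1) = +1 is not representable in Q15, so the result saturates.
print(cmul_q15((-32768, 0), (-32768, 0)))     # (32767, 0)
```

One complex multiply takes four real multiplies and two additions, which is why a dedicated complex MPY/MAC instruction pays off in FFTs and baseband processing.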



When combining RISC and DSP capabilities in one processor, the key is software tool and library support that allows seamless C/C++ programming and debug. DSP software development is made easy by an enhanced C/C++ DSP compiler and a rich DSP software library. The C/C++ DSP compiler supports DSP through fractional data types and a fixed-point API, and it is LLVM-based, with excellent performance and code density.

But it would not be easy to develop DSP software without a rich DSP software library. Synopsys provides a highly optimized set of common DSP functions (vector, matrix, FIR and IIR filters, FFT transforms, and interpolation) in fixed-point for the Q15 and Q31 data types. The DSP libraries also include an ITU-T base operations library for voice codecs and a C++ class library for fast emulation on x86 platforms. To support efficient simulation, Synopsys provides nSIM bit-accurate DSP models and hardware targets for emulation: nSIM/xCAM, RTL, FPGA, and silicon.
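As a reminder of what Q15 and Q31 mean, the sketch below converts floating-point values to both formats and runs a small direct-form FIR filter on Q15 data. It is a host-side illustration only: `to_q15`, `to_q31`, and `fir_q15` are hypothetical names and do not reflect the API of the Synopsys DSP library, and the round-to-nearest and saturation choices are assumptions.

```python
def to_q15(x):
    """Convert a float to Q15 (1 sign bit, 15 fractional bits), saturating."""
    return max(-32768, min(32767, int(round(x * 32768))))


def to_q31(x):
    """Convert a float to Q31 (1 sign bit, 31 fractional bits), saturating."""
    return max(-2**31, min(2**31 - 1, int(round(x * 2**31))))


def fir_q15(samples, coeffs):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * samples[n - k].

    Samples before the start of the input are taken as zero. Products are
    summed in a wide Q30 accumulator, then rounded and saturated back to Q15.
    """
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        y = (acc + (1 << 14)) >> 15  # round to nearest, Q30 -> Q15
        out.append(max(-32768, min(32767, y)))
    return out


coeffs = [to_q15(0.5), to_q15(0.5)]  # two-tap moving average
samples = [to_q15(0.5)] * 3          # constant input of 0.5
print(fir_q15(samples, coeffs))      # [8192, 16384, 16384]
```

The output settles at 16384, which is 0.5 in Q15, as expected for a moving average of a constant input. Keeping the accumulator wide and converting back to Q15 only once per output sample is the usual way to limit rounding error in fixed-point filters.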

The next picture lists the DSP functions supported by the ARC DSP family:


[Figure: DSP functions supported by the ARC DSP family]


Every CPU/DSP IP vendor claims to offer the best solution, which is why it is wise to look at verified facts when comparing with the competition. Synopsys has compiled benchmarks comparing the cycle count ratios of Cortex-M4, Cortex-M7, Cortex-A8, and ARC EM9D:

[Figure: benchmark cycle count ratios for Cortex-M4, Cortex-M7, Cortex-A8, and ARC EM9D]

This blog is drawn from the presentation “Programming DSP Processors Efficiently and with Ease”, given at the ARC Processor Summit by Pieter van der Wolf, Principal Product Architect, Synopsys. Abstract: “The ARC EM and HS processor families both offer processors with the ARCv2DSP ISA extension to support a wide range of DSP applications. These DSP processors come with an advanced tool suite, including a powerful DSP compiler, to support C-level programming of DSP applications. In this presentation we show how excellent results, in terms of high performance and small code size, can be achieved with high-level DSP programming. Ease of programming is also supported with an extensive library of DSP functions. The high-level programming enables software compatibility across different ARC DSP processors.”

By Eric Esteve from IPnest
