Extending RISC-V for accelerating FIR and median filters

by Don Dingee on 09-05-2023 at 10:00 am

RISC-V presents a unique opportunity for designers to extend the microarchitecture with custom instructions. One possible application is digital signal filtering using finite impulse response (FIR) or median filters, potential algorithms for carrier demodulation schemes in communications systems like 5G. Codasip application teams studied the potential for accelerating FIR and median filters on their L31 RISC-V core, documenting their design process for extending RISC-V and the execution results in their latest white paper.

A brief overview of FIR and median filters

FIR and median filters seek to remove noise from an input signal using a set of N time-domain samples of the input. Neither filter uses internal feedback. A reasonable number of samples fit in a RISC-V register array, shifting on each sample clock with the oldest sample leaving the array and the newest one entering it at the other end.

The FIR filter draws its name from the finite time its output takes to settle to zero once the input signal has been zero for at least N consecutive samples. Samples in the order received are weighted by multiplication with filter coefficients, and a summation obtains the output. Using the standard RISC-V instruction set results in 2N-1 arithmetic instructions (N multiplications and N-1 additions), 2N memory loads, and approximately N comparisons and jumps, depending on how the for loop is coded.
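As a rough software illustration of the weighted sum described above, here is a minimal Python sketch. It is not Codasip's code: the names `fir_step` and `fir_filter` and the 4-tap moving-average coefficients are assumptions chosen for the example, and samples before the start of the stream are assumed to be zero.

```python
from collections import deque


def fir_step(window, coeffs):
    """Return one FIR output: each sample weighted by its coefficient, then summed."""
    return sum(sample * coeff for sample, coeff in zip(window, coeffs))


def fir_filter(samples, coeffs):
    """Filter a sample stream, assuming all samples before the start are zero."""
    n = len(coeffs)
    window = deque([0] * n, maxlen=n)  # window[0] holds the newest sample
    outputs = []
    for x in samples:
        window.appendleft(x)  # the oldest sample falls off the far end
        outputs.append(fir_step(window, coeffs))
    return outputs


# Hypothetical 4-tap moving average applied to a unit step input.
step_response = fir_filter([1, 1, 1, 1, 1, 1], [0.25, 0.25, 0.25, 0.25])

# A single impulse followed by zeros: the output settles to zero after N samples.
impulse_response = fir_filter([1, 0, 0, 0, 0], [1, 2, 3])
```

The impulse case shows the "finite" property: with three coefficients, the output reproduces the coefficients and then returns to zero.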

The median filter relies on sorting a sample set instead of multiplication. The sequence of sorted elements results in a median – the sample in the middle – taken as the output. Again, using standard RISC-V instructions, a sort usually takes about N·logN arithmetic comparisons and roughly the same number of memory operations.
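The sort-based approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `median_filter` is hypothetical, the window length n is assumed odd, and outputs are produced only once a full window of n samples is available.

```python
def median_filter(samples, n):
    """Sliding median over the last n samples (n odd), re-sorting on every step."""
    outputs = []
    for i in range(n - 1, len(samples)):
        window = sorted(samples[i - n + 1 : i + 1])  # full sort of the current window
        outputs.append(window[n // 2])  # the middle element is the median
    return outputs


# The isolated spike (100) never reaches the output.
filtered = median_filter([3, 1, 4, 100, 5, 9, 2], 3)
```

Re-sorting the whole window for every new sample is where the roughly N·logN comparisons per output come from.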

Custom hardware blocks for FIR and median filters

Custom hardware blocks for FIR (left) and median (right) digital filters. Source: Codasip

Codasip teams investigated the premise that custom instructions for each filter could speed up normal sample-by-sample execution once each FIFO fills with N samples – the “filtering window.” Coincidentally, each accelerated filter completes in three RISC-V clock cycles, though with different parallelized execution steps:

  • For the FIR filter, the first cycle fetches filter coefficients, the second multiplies and sums, and the third writes the result back to the register file.
  • For the median filter, the first cycle removes the oldest sample from the sort and shifts to close the gap, the second positions the newest sample and shifts bigger samples before inserting it, and the third pulls the median from the new sort.
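The three median-filter steps listed above can be mimicked in software by keeping the window sorted between samples instead of re-sorting it. The sketch below is a sequential Python model, not the CodAL implementation: the class name `SlidingMedian` and its `push` method are hypothetical, and the hardware performs the shifts within each step in parallel, whereas this code does them one element at a time.

```python
from bisect import bisect_left, insort
from collections import deque


class SlidingMedian:
    """Software model of an incrementally maintained sorted window (n odd)."""

    def __init__(self, n):
        self.n = n
        self.fifo = deque()      # samples in arrival order
        self.sorted_window = []  # the same samples in ascending order

    def push(self, x):
        if len(self.fifo) == self.n:
            oldest = self.fifo.popleft()
            # Step 1: remove the oldest sample; deleting shifts the rest to close the gap.
            del self.sorted_window[bisect_left(self.sorted_window, oldest)]
        self.fifo.append(x)
        # Step 2: locate the new sample's position, shift bigger samples, insert it.
        insort(self.sorted_window, x)
        # Step 3: read the median from the middle once the window is full.
        if len(self.fifo) < self.n:
            return None
        return self.sorted_window[self.n // 2]


median = SlidingMedian(3)
results = [median.push(x) for x in [3, 1, 4, 100, 5, 9, 2]]
```

Until the window holds n samples, `push` returns `None`, which corresponds to the FIFO still filling.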

Basics of extending RISC-V with CodAL

The starting point for the investigation is the stock Codasip L31 core, shown below with a block for custom extensions. In the white paper, Codasip describes creating four custom instructions for accelerating FIR and median filters: one setting up the FIR filter flow, one setting up the median filter flow, one setting FIR coefficients, and one clearing the FIFO.

L31 block diagram. Source: Codasip

Codasip’s CodAL language simplifies the description of processor features in a compact syntax similar to C++. Codasip Studio converts CodAL descriptions to RTL so designers can work in a more familiar high-level language. The “fir_push” instruction represented in CodAL is the simpler of the two examples; the white paper has a complete discussion of the more complex “med_push” instruction and of constructing a cycle-accurate state machine model for both instructions.

CodAL custom instruction implementation for the FIR filtering flow. Source: Codasip

Using CodAL, the FIR filter fits in about 150 lines of code, and the median filter is slightly larger at about 160 lines – entirely describing hardware resources and ready-to-simulate cycle-accurate instructions. By comparison, using Verilog, the FIR filter is 670 lines of code, and the median filter is 1180, without automatic compiler and simulator awareness.

Extending RISC-V for substantial PPA savings

Accelerating FIR and median filters is only part of an application, but the results show how it is possible to take on crucial routines by extending RISC-V microarchitecture. We’ll save the specifics for the white paper. In short, this approach delivers a performance increase of greater than 27x for either the FIR or median filter, with only a 37% increase in L31 core area for the FIR and less for the median filter. Again, this is achievable without resorting to hand-coded RTL or dealing with compiler and simulator extensions, since CodAL handles those details automatically.

To get the whole story, download the Codasip technical paper (registration access for full text):

Finite Impulse Response (FIR) and Median Filter accelerators in CodAL
