In our fascination with the place where architecture meets the ideas of Fourier, Nyquist, Reed, Shannon, and others, we almost missed the shift – most digital signal processing isn’t happening on a big piece of silicon called a DSP anymore.
It didn’t start out that way. General purpose CPUs, which can do almost anything given enough code, time, power, and space, were exposed as less than optimal for DSP tasks with real-world embedded constraints. In order for algorithms to thrive in real-time applications, some kind of hardware acceleration was needed.
The DSP-as-a-chip emerged, with tailored pipelining and addressing modes wrapped around multiply-accumulate stages, and in more modern implementations larger word widths and parallelism. Popular general purpose DSP families from Analog Devices, Freescale, TI, and others still exist today, making up about 8% of market revenue according to Will Strauss.
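To make the multiply-accumulate idea concrete, here is a minimal sketch of a direct-form FIR filter, the classic workload those stages and addressing modes were built around. It is an illustration only: the names `fir_filter`, `delay`, and `head` are made up for this example, and plain Python says nothing about how any particular DSP family implements it. The inner loop does one multiply-accumulate per tap, and the index arithmetic mimics the modulo (circular) addressing that a DSP typically handles in hardware at no extra cost.

```python
def fir_filter(samples, taps):
    """Direct-form FIR filter built from multiply-accumulate (MAC) steps.

    Illustrative sketch: a DSP would typically run each inner-loop
    iteration as a single-cycle MAC with hardware modulo addressing.
    """
    n_taps = len(taps)
    delay = [0] * n_taps   # circular delay line holding recent samples
    head = 0               # position of the newest sample
    out = []
    for x in samples:
        delay[head] = x
        acc = 0            # the accumulator
        idx = head
        for k in range(n_taps):
            acc += taps[k] * delay[idx]   # one multiply-accumulate
            idx = (idx - 1) % n_taps      # step back with wraparound
        out.append(acc)
        head = (head + 1) % n_taps        # advance the write position
    return out


# Feeding in a unit impulse returns the tap values themselves.
impulse_response = fir_filter([1, 0, 0, 0], [1, 2, 3])
```

On a general purpose CPU, the wraparound and loop bookkeeping cost extra instructions for every tap, which is the kind of overhead that made dedicated hardware attractive.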
What happened? As DSP became part of more systems, the technology diverged, with DSP capability targeted at specific portions of a system and mixed in with other, more general purpose resources. Four other methods enabling signal and image processing algorithms appeared:
For every divergence, there is a convergence. Today, flexibility for more than one application is the name of the game, and that is breaking the boundaries between device types. GPUs are morphing into more than just graphics engines, CPUs want to do some DSP algorithms, and DSPs and FPGAs both crave partner cores for more general purpose work.
This is giving rise to new combinations of general purpose cores and DSP capability for acceleration of key functions. Looking at recent multicore developments – TI KeyStone, Xilinx Zynq, NVIDIA Tegra K1 to name a few – the trend is becoming obvious. By no means does this imply these parts are exactly interchangeable, just that the trend is headed away from the traditional DSP-as-a-chip toward a multicore blend of functions.
So, it shouldn’t be a surprise that these influences are also changing how DSP IP cores evolve, moving them beyond specialized point functions such as audio and baseband interface. By definition, a DSP IP core sits alongside an ARM or other processor core, fitting into the trend we’ve identified. This brings opportunities in interconnect and cache coherency, along with new possibilities.
In a marked departure from the traditional DSP architecture, CEVA has uncorked the XC4500, which brings features borrowed from almost all the approaches we’ve talked about together in a single part. Paul McLellan introduced us to the XC4500 last fall, but I’ll mention two items briefly. First is a vector processing element, able to rip through over 400 16-bit operations in a single cycle. Second is the interface between the vector engine and several co-processors, some CEVA-defined and others open to user definition, which CEVA terms “tightly coupled extensions”.
It’s a huge jump from DSP point functions in mobile handsets into a crowded field of wireless infrastructure solutions. Will CEVA succeed here? We should keep in mind the Internet of Things is driving us into new territory: software defined everything. Just as the DSP-as-a-chip is no longer the entire processor, the radio is no longer the entire product. Efficient operation in the space between subscribers and the cloud is going to require a lot more than just protocol engines and baseband processing, and the workload-tuned CEVA XC4500 is another good example of processor evolution.
My guess is that what we will see from CEVA and others is a learning cycle or two, in which these new DSP architectures continue to evolve and new application ideas emerge as designers discover the right combinations of features and the ways partner cores can use them. Designers will have to get used to combining multiple, formerly separate disciplines of thinking – DSP plus vector engine plus ARM core, all tied together via software, being a good example – and learn how best to partition and coordinate software to achieve system goals.
At the spot where software defines everything, the new DSP convergence will probably be found.