A Novel Approach to Future Proofing AI Hardware
by Bernard Murphy on 06-11-2025 at 6:00 am

There is a built-in challenge for edge AI intended for long time-in-service markets. Automotive applications are the obvious example, while aerospace and perhaps medical usage may impose similar demands. Support for the advanced AI methods we now expect – transformers, physical and agentic AI – is not feasible without dedicated hardware acceleration. According to one report, over 50% of worldwide dollars spent on edge AI by 2033 will go to edge hardware rather than software, cloud or services. The challenge is that AI technologies are still evolving rapidly, yet hardware capability, once built, is baked in; it cannot be upgraded on the fly like software. However, AI must remain upgradable through the 15-year life of a typical car to keep pace with regulatory changes and not drift too far from user expectations for safety/security, economy, and resale value. Reconciling these conflicting needs is placing a major emphasis on future-proofing AI hardware – anticipating and supporting (to the greatest extent possible) the AI advances we can imagine. Cadence has a novel answer to that need: a co-processor to support an edge NPU and handle offload for non-NPU AI tasks, both those known today and those not yet known.

How can you future-proof hardware?

CPUs can compute anything that is computable, making them ideal for general-purpose computing but inefficient, in both performance and power, for matrix or vector arithmetic. NPUs excel at the large linear-algebra problems common in transformer attention and multi-layer perceptron (MLP) operations, but not at the non-linear operations AI also requires, for functions like activation and normalization, or at custom vector or matrix operations not built into the NPU instruction set architecture (ISA). Those operations are generally handled through specialized hardware.
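
To make that split concrete, here is a minimal sketch of single-head attention (NumPy, with hypothetical shapes), annotating which steps map naturally onto an NPU's matrix pipeline and which do not:

```python
import numpy as np

def attention(Q, K, V):
    # Large matrix multiplies: exactly what an NPU's MAC array accelerates.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])

    # Softmax is non-linear (exp, divide): it falls outside the NPU's
    # matrix pipeline and typically lands on a DSP, CPU, or co-processor.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)

    # Back to a linear operation the NPU handles well.
    return weights @ V

# Hypothetical shapes: 128 tokens, 64-dimensional head.
Q = np.random.randn(128, 64)
K = np.random.randn(128, 64)
V = np.random.randn(128, 64)
out = attention(Q, K, V)  # shape (128, 64)
```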

Unfortunately, AI model evolution isn't limited to the core functions represented in, say, the ONNX operator set. New operations continue to appear – sometimes for fusion, sometimes for complex algorithms, perhaps for agentic flows – operations which can't be mapped easily onto the NPU. A workaround is to offload such operations to a host CPU as needed, but repeated offloads carry a large performance (and power) overhead that can seriously damage end-product performance.

Much better is a programmable embedded co-processor tightly coupled to the NPU: programmable, so it provides full flexibility to add new operations, and tightly coupled, to minimize the overhead of offloading between the NPU and the co-processor. That is the objective of the Cadence NeuroEdge AI Co-Processor – to sit right next to one or more NPUs, delivering low-latency non-NPU computation as a complement to NPU computation.
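
To see why coupling matters, consider a back-of-the-envelope latency model. The layer count, compute time, and overhead figures below are illustrative assumptions, not measured or Cadence-supplied numbers:

```python
# All numbers below are illustrative assumptions, not Cadence figures.

LAYERS = 48          # transformer layers, each needing one non-NPU operation
COMPUTE_US = 5.0     # time to execute the non-NPU operation itself (us)

def total_us(per_offload_overhead_us: float) -> float:
    # Every layer pays the round-trip cost (sync + data movement) plus compute.
    return LAYERS * (per_offload_overhead_us + COMPUTE_US)

# Loosely coupled host CPU: interrupt handling and cache/DMA traffic per trip.
print(total_us(50.0))   # 2640.0 us, dominated by offload overhead

# Tightly coupled co-processor sharing memory with the NPU.
print(total_us(2.0))    # 336.0 us, dominated by useful compute
```

The point of the sketch: when the round-trip cost dwarfs the work being offloaded, the offload path itself becomes the bottleneck, which is exactly what tight coupling is meant to eliminate.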

The Tensilica NeuroEdge 130 AI Co-Processor

This co-processor approach is already proven in leading chips from companies like NVIDIA, Qualcomm, Google and Intel. What makes the Cadence solution stand out is that in embedded applications you can pair it with your favorite NPU (later with an array of NPUs). You can build as tight an NPU subsystem as you’d like between the NeuroEdge Co-Processor and your NPU of choice, perhaps with shared memory, to deliver the differentiation you need.

Also of note, this IP builds on the mature Cadence Tensilica Vision DSP architecture and software stack, so it is available today. What Cadence has done here is interesting: they trimmed back the Vision DSP architecture to focus purely on AI support, reminiscent of how NVIDIA evolved their general-purpose GPU architecture into AI platforms. The result is a co-processor with 30% smaller area for a similar configuration and 20% lower power for equivalent workloads, while maintaining the same performance. Designers can also use the proven NeuroWeave software stack for rapid development. Connection to your NPU of choice is through the NeuroConnect API across AXI (or a high-bandwidth direct interface). Cadence adds that this core is ISO 26262 FUSA-ready for automotive applications.

One more interesting point: A common configuration would have NeuroEdge acting as co-processor supporting an NPU as the primary interface. Alternatively, NeuroEdge can act as the primary interface controlling access to the NPU. I imagine this second approach might be common in agentic architectures.

My Takeaways

This is a clever idea: a ready-made, fully programmable IP to accelerate non-NPU functions you haven't yet anticipated, while working in close partnership with an NPU. You could try building the same thing yourself, but then you must figure out how to tie a vector accelerator (DSP) and a CPU (with floating-point support) tightly around your NPU, exhaustively test the new architecture, rework your software stack, and so on. Or you could build around a proven and complete complement to your NPU.

Can this anticipate every possible evolution of AI models over the next 15+ years? No one can foresee every twist and turn in edge AI evolution, but a tightly coupled co-processor with fully programmable support for custom vector and scalar operations seems like a pretty good start.

You can learn more about the Cadence Tensilica NeuroEdge 130 AI Co-Processor HERE.

Also Read:

Cadence at the 2025 Design Automation Conference

Anirudh Fireside Chats with Jensen and Lip-Bu at CadenceLIVE 2025

Anirudh Keynote at CadenceLIVE 2025 Reveals Millennium M2000
