Managing Service Level Risk in SoC Design

by Bernard Murphy on 06-21-2023 at 6:00 am

Discussion on design metrics tends to revolve around power, performance, safety, and security. All of these are important, but there is an additional performance objective a product must meet, defined by a minimum service level agreement (SLA). A printer may work fine most of the time yet intermittently corrupt its display. Or the nav system in your car may intermittently fail to signal an upcoming turn until after you have passed it. These are traffic (data) related problems. Conventional performance metrics only ensure that the system will perform as expected under ideal conditions; SLA metrics set a minimum performance expectation within specified traffic bounds. OEMs ultimately care about SLAs, not STAs. Meeting and defining an SLA is governed by interconnect design and operation.


What separates SLA from ideal performance?

Ideally, each component could operate at peak performance, but they share a common interconnect, limiting simultaneous traffic. Each component in the design has a spec for throughput and latency – perhaps initially frames/second for computer vision, AI recognition, and a DDR interface, mapping through to gigabytes/second and clock cycles or milliseconds in a spreadsheet. An architect’s goal is to compose these into system bandwidths and latencies through the interconnect, given expected use cases and the target SLA.
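The spreadsheet step described above is just unit conversion and summation. A minimal sketch, with invented component names, frame sizes, and rates purely for illustration:

```python
# Hypothetical bandwidth budget: convert per-component frame rates into
# aggregate GB/s demand on a shared DDR interface. All numbers are invented.
FRAME_BYTES = 1920 * 1080 * 2          # assumed 1080p frames, 2 bytes/pixel

components = {
    "camera_in":  {"fps": 60, "bytes_per_frame": FRAME_BYTES},
    "vision_dsp": {"fps": 60, "bytes_per_frame": FRAME_BYTES},
    "ai_engine":  {"fps": 30, "bytes_per_frame": FRAME_BYTES},
}

def bandwidth_gbps(fps, bytes_per_frame):
    """Sustained demand in GB/s for one stream."""
    return fps * bytes_per_frame / 1e9

total = sum(bandwidth_gbps(c["fps"], c["bytes_per_frame"])
            for c in components.values())
print(f"aggregate demand: {total:.2f} GB/s")   # -> aggregate demand: 0.62 GB/s
```

The architect's real spreadsheet would also carry latency columns (clock cycles or milliseconds) per use case, but the aggregation logic is the same.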

Different functions generally don’t need to be running as fast as possible at the same time; between use cases and the SLA, an architect can determine how much she may need to throttle bandwidths and introduce delays to ensure smooth total throughput with limited stalling. That analysis triggers tradeoffs between interconnect architecture and SLA objectives. Adding more physical paths through the interconnect may allow for faster throughput in some cases while increasing device area. Ultimately the architect settles on a compromise defining a deliverable SLA – a baseline to support a minimum service level while staying within PPA goals. This step is a necessary precursor to defining an SLA but not sufficient; the definition still needs to factor in potential traffic.

Planning for unpredictable traffic

Why not run simulations with realistic use cases? You will certainly do that for other reasons, but ultimately, such simulations will barely scratch the surface of SLA testing across an infinite range of possibilities. More useful is to run SystemC simulations of the interconnect with synthetic initiators and targets. These don’t need to be realistic traffic models for the application, just good enough to mimic challenging loads. According to Andy Nightingale (VP of product marketing at Arteris), you then turn all the dials up to some agreed level and run. The goal is to understand and tune how the network performs when heavily loaded.
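The "turn all the dials up" experiment can be illustrated with a toy queueing model (a deliberately simplified stand-in for a SystemC interconnect simulation, not an Arteris flow): synthetic initiators inject packets into one shared link, and latency is observed as offered load approaches capacity.

```python
import random

def simulate(injection_rates, cycles=10_000, seed=0):
    """Each synthetic initiator injects with its own per-cycle probability;
    the shared link drains one packet per cycle. Returns mean latency in cycles."""
    rng = random.Random(seed)
    queue = []          # FIFO of packet arrival cycles
    latencies = []
    for cycle in range(cycles):
        for rate in injection_rates:
            if rng.random() < rate:
                queue.append(cycle)
        if queue:
            latencies.append(cycle - queue.pop(0))
    return sum(latencies) / len(latencies)

# Sweep aggregate offered load toward the link's capacity (1 packet/cycle)
# across four synthetic initiators; latency climbs sharply near saturation.
for load in (0.5, 0.8, 0.95):
    mean_lat = simulate([load / 4] * 4)
    print(f"offered load {load:.2f} -> mean latency {mean_lat:.1f} cycles")
```

Even this crude model shows the qualitative behavior the architect is probing for: the network degrades gracefully until load nears capacity, then queues (and latencies) grow quickly.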

An SLA will define incoming and outgoing traffic through minimum and maximum rates, while allowing for streams that may burst above maximum limits for short periods. The SLA will typically distinguish different classes of service, with different expectations for bandwidth-sensitive and latency-sensitive traffic. Combining in-house experience with the capabilities of the endpoint IP and these simulations, the architect should be able to converge on an optimal topology for the interconnect.
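As a sketch of that bookkeeping, an SLA class might carry a guaranteed floor, a sustained ceiling, and a bounded burst allowance per traffic class; the class names and numbers below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class SlaClass:
    """One traffic class in a hypothetical SLA. All figures are illustrative."""
    name: str
    min_gbps: float        # guaranteed floor
    max_gbps: float        # sustained ceiling
    burst_gbps: float      # short-term allowance above the ceiling
    latency_sensitive: bool

    def check(self, sustained_gbps, burst_gbps):
        """True if an observed traffic profile stays inside this class's SLA."""
        return (self.min_gbps <= sustained_gbps <= self.max_gbps
                and burst_gbps <= self.burst_gbps)

video = SlaClass("video", min_gbps=0.5, max_gbps=2.0, burst_gbps=3.0,
                 latency_sensitive=False)
cpu   = SlaClass("cpu",   min_gbps=0.1, max_gbps=1.0, burst_gbps=1.2,
                 latency_sensitive=True)

print(video.check(sustained_gbps=1.8, burst_gbps=2.5))  # -> True (inside SLA)
print(cpu.check(sustained_gbps=1.5, burst_gbps=1.5))    # -> False (ceiling exceeded)
```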

The next step is to support dynamic adaptation to traffic demands. In a NoC, like FlexNoC from Arteris, both the network interface units (NIUs) connecting endpoint IPs and the switches in the interconnect are programmable, allowing arbitration to dynamically adjust to serve varying demands. A higher-priority packet might be pushed ahead of a lower-priority packet or routed through a different path if the topology allows for that option, or a path might be reserved exclusively for a certain class of traffic. Other techniques are also possible, for example, adding pressure or sharing a link to selectively allow high-priority, low-latency packets to move through the system faster.
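The core idea of pushing a higher-priority packet ahead can be sketched with a generic priority arbiter. This is an illustration of the technique only, not the FlexNoC arbitration algorithm:

```python
import heapq
from itertools import count

class PrioritySwitch:
    """Toy priority-aware arbitration point: higher-priority packets are
    granted first; equal priorities keep arrival (FIFO) order."""
    def __init__(self):
        self._heap = []
        self._seq = count()   # tiebreaker preserves FIFO order within a priority

    def accept(self, packet, priority):
        # Lower number = higher priority, matching heapq's min-heap order.
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def grant(self):
        """Forward the highest-priority waiting packet, or None if idle."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

sw = PrioritySwitch()
sw.accept("bulk_dma_0", priority=2)
sw.accept("irq_msg",    priority=0)   # latency-sensitive, jumps the queue
sw.accept("bulk_dma_1", priority=2)
order = [sw.grant() for _ in range(3)]
print(order)   # -> ['irq_msg', 'bulk_dma_0', 'bulk_dma_1']
```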

It is impossible to design a system that guarantees continued high performance under excessive or burst traffic, say, a relentless stream of video demands. To handle such cases, the architect can add regulators to gate demand, allowing other functions to continue to operate in parallel at some acceptable level (again, defined by the SLA).
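One common way to build such a regulator is a token bucket, which admits traffic up to a sustained rate plus a bounded burst and back-pressures the rest. This is a generic sketch with invented rates, not a description of the Arteris implementation:

```python
class Regulator:
    """Token-bucket demand gate at a hypothetical NIU. Units are abstract
    'transfer units per cycle'; the numbers below are illustrative."""
    def __init__(self, rate_per_cycle, burst_tokens):
        self.rate = rate_per_cycle        # tokens replenished each cycle
        self.capacity = burst_tokens      # bounds how far a burst can run ahead
        self.tokens = burst_tokens

    def cycle(self, requested):
        """Admit up to `requested` units this cycle; return the admitted count.
        Anything not admitted is back-pressured to the initiator."""
        self.tokens = min(self.capacity, self.tokens + self.rate)
        admitted = min(requested, int(self.tokens))
        self.tokens -= admitted
        return admitted

reg = Regulator(rate_per_cycle=2, burst_tokens=8)
demand = [8, 8, 8, 0, 0, 1]               # a relentless burst, then quiet
admitted = [reg.cycle(d) for d in demand]
print(admitted)   # -> [8, 2, 2, 0, 0, 1]
```

The initial burst is absorbed up to the bucket's capacity, then the stream is clamped to the sustained rate, leaving the rest of the interconnect's bandwidth available to other initiators.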

In summary, while timing closure for ideal performance is still important, OEMs care about SLAs. Meeting those expectations must be controlled through interconnect design and programming. Arteris and their customers have been refining the necessary Quality of Service (QoS) capabilities offered in their FlexNoC product line for many years. You can learn more HERE.
