Several companies have attacked the QoS problem in SoC design, and what is emerging from that conversation is that the best approach may be several approaches combined in a hybrid QoS solution. At the recent Linley Group Mobile Conference, NetSpeed Systems outlined just such a solution, with an unexpected plot twist in synthesis.
The QoS picture isn’t as simple as it looks; there is more to it than slotting traffic into a priority scheme where higher-priority traffic moves through the system with less blocking. NetSpeed’s Joe Rowlands called this “lossy” information transfer, where local decisions on traffic patterns might solve a localized problem but don’t necessarily help overall system performance.
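To make the "local decision" point concrete, here is a minimal sketch of a strict-priority arbiter on a single router output port. It is a toy model, not anything NetSpeed presented: the function name `simulate_strict_priority`, the one-packet-per-cycle port, and the traffic patterns are all assumptions chosen for illustration.

```python
from collections import deque


def simulate_strict_priority(arrivals, cycles):
    """Simulate one router output port that forwards one packet per cycle.

    arrivals maps a priority level (larger = more urgent) to a function
    taking the cycle number and returning how many packets arrive then.
    Returns the number of packets served per priority level.
    (Hypothetical toy model; all names here are illustrative.)
    """
    queues = {p: deque() for p in arrivals}
    served = {p: 0 for p in arrivals}
    for cycle in range(cycles):
        # Enqueue this cycle's arrivals for every priority level.
        for p, arrivals_at in arrivals.items():
            for _ in range(arrivals_at(cycle)):
                queues[p].append(cycle)
        # Local decision: always grant the most urgent non-empty queue.
        for p in sorted(queues, reverse=True):
            if queues[p]:
                queues[p].popleft()
                served[p] += 1
                break
    return served


# High-priority agent injects every cycle; low-priority agent every 4th cycle.
served = simulate_strict_priority(
    {2: lambda c: 1, 1: lambda c: 1 if c % 4 == 0 else 0},
    cycles=100,
)
print(served)
```

In this setup the port is always busy and the high-priority agent is perfectly happy, yet the low-priority agent is never served. The router's choice is locally correct, but it carries no information about what the starved agent needs elsewhere in the system.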
Let’s set aside the fact that IP blocks tend to speak different QoS languages, which is the case for using a network-on-chip to abstract QoS in the first place. Complexity versus priority becomes more obvious in this diagram:
It’s interesting how these were classified. The difference between “variable” and “dynamic” is a difference between data traffic and user interaction. Also, the assignment of the GPU to “low” assumes that it has enough memory bandwidth and typically outruns most of its tasks. And putting the camera in “real-time” is a notable distinction – as with any pixel processing engine, some latency to get it started is OK, but once it is rolling, operations have to proceed deterministically; otherwise there are unacceptable gaps in the output.
Overlay on that diagram the steps for power optimization, the problem of sequencing agents, and the issues of cache coherency and memory control. NetSpeed uses what they call a layered SoC interconnect synthesis solution, a lot of words for multiple approaches working together to solve different aspects of the problem. There are two key elements of their solution: Pegasus, a “last level cache” block that can serve as a traditional memory cache or as a configurable cache at other points in the network; and Gemini, their coherent NoC IP.
With multiple cache controllers and specialized accelerators, Gemini offers massive configurability in a formally proven, deadlock-free interconnect. How should a NoC be configured? NetSpeed has applied machine learning algorithms to the NoC synthesis problem to set router topology, link widths, virtual channels, and buffer sizes. Instead of setting QoS only on a per-router basis, bandwidth is allocated at the system level for end-to-end QoS, accounting for cases with low power modes.
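NetSpeed did not disclose how its allocation works, so the following is only a sketch of the general idea of allocating bandwidth end to end rather than per router. It swaps in textbook max-min fairness (progressive filling) as a stand-in; the function `max_min_fair`, the link names, the agents, and all the numbers are hypothetical.

```python
def max_min_fair(flows, capacity):
    """Allocate bandwidth to flows over shared links by progressive filling.

    flows:    dict name -> (demand, list of link names on the flow's path)
    capacity: dict link name -> available bandwidth
    Returns dict name -> allocated bandwidth.
    (Textbook max-min fairness, used here only as an illustration.)
    """
    eps = 1e-9
    alloc = {f: 0.0 for f in flows}
    remaining = dict(capacity)
    active = set(flows)
    while active:
        candidates = []
        # Equal share each link could still give its active flows.
        for link, cap in remaining.items():
            n = sum(1 for f in active if link in flows[f][1])
            if n:
                candidates.append(cap / n)
        # No flow should receive more than it asked for.
        for f in active:
            candidates.append(flows[f][0] - alloc[f])
        step = min(candidates)
        # Raise every active flow by the same step along its whole path.
        for f in active:
            alloc[f] += step
            for link in flows[f][1]:
                remaining[link] -= step
        # Freeze flows that are satisfied or cross a saturated link.
        active = {
            f for f in active
            if flows[f][0] - alloc[f] > eps
            and all(remaining[l] > eps for l in flows[f][1])
        }
    return alloc


# Hypothetical two-hop path to memory shared by three agents.
capacity = {"r0-r1": 10.0, "r1-mem": 12.0}
flows = {
    "cpu": (8.0, ["r0-r1", "r1-mem"]),
    "display": (3.0, ["r1-mem"]),
    "camera": (6.0, ["r0-r1", "r1-mem"]),
}
result = max_min_fair(flows, capacity)
print(result)  # {'cpu': 4.5, 'display': 3.0, 'camera': 4.5}
```

Because each flow is raised along its entire path at once, the shared link into memory limits the CPU and camera even though the first hop still has headroom. A per-router scheme would only see that limit after traffic had already been admitted upstream.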
The combination of traffic-based adaptability, power control, and cache configurability with the machine learning router configuration yields solid results. (Comparing results between competing NoC implementations is nearly impossible unless one were to implement the same complex SoC in every variant. Even then, different optimization strategies would produce different results – see the Apple A9 dual sourcing conversation for a case in point.) NetSpeed offers a chart from what they say is a tier 1 mobile OEM, comparing manual bandwidth tuning with the automated NetSpeed synthesis. In that chart, NetSpeed’s automation outperforms manual tuning on bandwidth in every use case considered, sometimes dramatically.
NetSpeed also claims a development time advantage, and I think that takes into account what would likely be multiple trial-and-error runs in manual iterations. No info was provided on how much simulation goes into the machine learning optimization process, but even a significant up-front simulation effort would appear to be worth the wait. There’s more discussion on what they simulate on the NetSpeed Gemini product page.
I’m hearing this story more and more often – EDA tools are encroaching on system-level expertise by automating very complex design optimization problems. It’s a classic make-versus-buy conflict, where years of experience might seem threatened by adopting a tool that might do things better. But the bottom line in this new industry environment probably isn’t the huge mobile SoC design pursued by a small army of specialists. The target for these types of tools may be mid-range SoC designs in areas like the IoT, where years of SoC design experience aren’t built into the organization. Tools like NetSpeed’s will help teams with moderate system-level experience get better-optimized chips designed faster.